
Are Multilingual Language Models Fragile? IBM Adversarial Attack Strategies Cut MBERT QA Performance by 85%

An IBM research team proposes four multilingual adversarial attack strategies and attacks seven languages in a zero-shot setting on large multilingual pretrained language models (e.g. MBERT), reducing average performance by up to 85.6 percent.

As large language models continue to achieve state-of-the-art (SOTA) results on question answering (QA) tasks, researchers are raising a few questions of their own concerning the robustness of these models. An IBM team recently conducted a comprehensive analysis of English QA that suggests SOTA models can be disappointingly fragile when presented with adversarially generated data.

Previous attack strategy studies have focused on monolingual QA performance, while attacks on multilingual QA have remained relatively unexplored. The IBM researchers take aim at the latter, applying four novel multilingual adversarial attack strategies against seven languages in a zero-shot setting. Faced with such attacks, the average performance of large multilingual pretrained language models such as MBERT tumbles by at least 20.3 percent and as much as 85.6 percent.


The researchers summarize their main contributions as exposing flaws in multilingual QA systems and providing insights that are not evident in a single language system, specifically:

  1. MBERT is more susceptible to attacks than BERT.
  2. MBERT gives priority to finding the answer in certain languages, causing successful attacks even when the adversarial statement is in a different language than the question and context.
  3. MBERT gives priority to the language of the question over the language of the context.
  4. Augmenting the system with machine-translated data helps build a more robust system.

One of the more popular existing QA attack strategies is adding adversarial sentences to distract reading comprehension systems. The new study builds on this approach by converting a question Q into a statement S. The goal is to generate an adversarial S that is semantically similar to Q but can be identified by a human reader as incorrect.


The proposed Q-to-S conversion process comprises five steps. The researchers first use universal dependency parsing (UDP) and named entity recognition (NER) to preprocess English question inputs. They perform a depth-first search on the parse and mark the part-of-speech (POS) tag of each token to identify word patterns, with the patterns “what nn”, “what vb”, “who vb”, “how many”, and “what vb vb” accounting for over 40 percent of the training set. Based on these patterns, in step two the team converts questions into statements that contain a tagged question.
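To make the pattern-matching idea concrete, here is a minimal sketch of how a question's leading word pattern might be detected from POS-tagged tokens. It is not the researchers' code: the function names, the Penn Treebank-style tags, and the matching rule are assumptions for illustration, and the dependency parse and depth-first search are skipped entirely.

```python
# The five patterns named in the article, longest first so that
# "what vb vb" is tried before "what vb".
PATTERNS = ["what vb vb", "what nn", "what vb", "who vb", "how many"]


def coarse(tag):
    """Collapse a fine-grained tag to a coarse one (assumed Penn-style tags).

    For example VBD -> vb and NNS -> nn; other tags are just lowercased.
    """
    tag = tag.lower()
    if tag.startswith("vb"):
        return "vb"
    if tag.startswith("nn"):
        return "nn"
    return tag


def question_pattern(tagged):
    """Return the first pattern matching the start of the question, or None.

    `tagged` is a list of (word, pos_tag) pairs. The first pattern element
    must equal the first word; each later element may match either the
    token's word (as in "how many") or its coarse tag (as in "who vb").
    """
    for pattern in PATTERNS:
        parts = pattern.split()
        if len(tagged) < len(parts):
            continue
        if tagged[0][0].lower() != parts[0]:
            continue
        rest_matches = all(
            part in (tagged[i][0].lower(), coarse(tagged[i][1]))
            for i, part in enumerate(parts)
            if i > 0
        )
        if rest_matches:
            return pattern
    return None
```

Under these assumptions, a question tagged as `[("Who", "WP"), ("wrote", "VBD"), ("Hamlet", "NNP")]` falls under “who vb”, and a conversion rule for that pattern could then rewrite it as a statement.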

Given a question Q and statement S, in step three the researchers apply four different attack strategies to create various types of adversarial statements designed to confuse the QA system. The four strategies are random answer random question (RARQ), random answer original question (RAOQ), no answer random question (NARQ), and no answer original question (NAOQ). In step four these generated adversarial statements are translated into other languages, and in step five these translated adversarial statements are inserted into context.
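The four strategy names suggest two independent switches: whether a random answer is filled in, and whether the statement comes from the original question or a random one. The sketch below reads the names literally and is only an illustration under that assumption. The `<ANS>` slot marker, the statement and answer pools, the handling of "no answer" as an empty slot, and the choice of where the statement is inserted are all hypothetical, and translation is represented by a pluggable callable rather than a real machine translation system.

```python
import random

ANSWER_SLOT = "<ANS>"  # hypothetical marker for where an answer would go

# (use_random_answer, use_random_question) for each strategy name.
STRATEGIES = {
    "RARQ": (True, True),
    "RAOQ": (True, False),
    "NARQ": (False, True),
    "NAOQ": (False, False),
}


def make_adversarial(strategy, original_stmt, stmt_pool, answer_pool, rng):
    """Build one adversarial statement from a slot-bearing template.

    Assumption: "random question" means drawing a template from
    `stmt_pool`, and "no answer" means leaving the slot empty.
    """
    use_random_answer, use_random_question = STRATEGIES[strategy]
    template = rng.choice(stmt_pool) if use_random_question else original_stmt
    filler = rng.choice(answer_pool) if use_random_answer else ""
    # Fill the slot, then collapse any doubled or leading whitespace.
    return " ".join(template.replace(ANSWER_SLOT, filler).split())


def insert_into_context(context, statement, translate=lambda s: s, at_end=True):
    """Translate the statement (identity by default) and attach it to the context.

    Appending or prepending is an assumption; the article does not say
    where in the context the statement is placed.
    """
    translated = translate(statement)
    return f"{context} {translated}" if at_end else f"{translated} {context}"


if __name__ == "__main__":
    rng = random.Random(0)
    stmt = make_adversarial(
        "RAOQ", "<ANS> wrote Hamlet.", ["<ANS> painted Guernica."], ["Marlowe"], rng
    )
    print(insert_into_context("Shakespeare wrote Hamlet.", stmt))
```

Passing a real translation function as `translate` would correspond to step four, and the call to `insert_into_context` to step five.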

The team conducted experiments on multilingual QA with the MBERT pretrained language model and the SQuAD v1.1, MT-SQuAD and MLQA datasets.


The team attacked MBERTQA, a multilingual system trained with English only, and MT-MBERTQA, a multilingual system trained with data from six languages. The results show that both systems were affected by all four attacks. The strongest attack was RAOQ, which reduced the mean F1 score by 30 points when the adversarial statements were in Chinese.
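For readers unfamiliar with the metric, F1 in SQuAD-style QA evaluation measures token overlap between the predicted and reference answers. The following is a simplified sketch, assuming lowercasing and whitespace tokenization only. The official evaluation scripts also normalize punctuation and articles, and multilingual benchmarks apply language-specific normalization, none of which is reproduced here.

```python
from collections import Counter


def token_f1(prediction, gold):
    """Simplified token-level F1 between a predicted and a reference answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

A successful attack pulls the model's predicted span toward the inserted adversarial statement, so the overlap with the reference answer, and with it the F1 score, falls.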

The study demonstrates the effectiveness of the proposed attack strategies, which the team says can be used to help build more robust QA systems.

The paper "Are Multilingual BERT Models Robust? A Case Study on Adversarial Attacks for Multilingual Question Answering" is on arXiv.

Author: Hecate He | Editor: Michael Sarazen

