Research

Facebook Boosts Cross-Lingual Language Model Pretraining Performance

Facebook researchers have introduced two new methods for pretraining cross-lingual language models (XLMs). The unsupervised method uses monolingual data only, while the supervised version leverages parallel data with a new cross-lingual language model objective. The research aims to build an efficient cross-lingual encoder that maps sentences from different languages into a shared embedding space, an approach that benefits tasks such as machine translation.

The results show strong performance across a range of cross-lingual understanding tasks, with state-of-the-art results on cross-lingual classification and on both unsupervised and supervised machine translation.

The Facebook XLM project contains code for:

  • Language model pretraining:
    • Causal Language Model (CLM) – monolingual
    • Masked Language Model (MLM) – monolingual
    • Translation Language Model (TLM) – cross-lingual
  • Supervised / Unsupervised MT training:
    • Denoising auto-encoder
    • Parallel data training
    • Online back-translation
  • XNLI fine-tuning
  • GLUE fine-tuning

XLM also supports multi-GPU and multi-node training.

Generating cross-lingual sentence representations

The project provides sample code for quickly obtaining cross-lingual sentence representations from pretrained models. These representations are useful for machine translation, for computing sentence similarities, or for building cross-lingual classifiers. The examples are written in Python 3 and require the NumPy, PyTorch, fastBPE, and Moses libraries.

To generate cross-lingual sentence representations, the first step is to import the required modules and load the pretrained model:

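Below is a minimal sketch of this step, modeled on the project's example code for generating embeddings; the checkpoint filename and the src.* module paths are assumptions based on the repository layout and may differ from the actual code.

    # Step 1: imports and loading a pretrained XLM checkpoint.
    # Run from the root of the XLM repository so that the `src` package is importable.
    import torch

    from src.utils import AttrDict
    from src.data.dictionary import Dictionary, BOS_WORD, EOS_WORD, PAD_WORD, UNK_WORD, MASK_WORD
    from src.model.transformer import TransformerModel

    # Path to a downloaded pretrained checkpoint (assumed filename).
    model_path = 'mlm_tlm_xnli15_1024.pth'

    # The checkpoint bundles the model weights, the training parameters, and the vocabulary.
    reloaded = torch.load(model_path)
    params = AttrDict(reloaded['params'])
    print("Supported languages:", ", ".join(params.lang2id.keys()))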

Next, build the dictionary, update the parameters, and build the model:

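Continuing from the snippet above, a sketch of this step could look as follows; the checkpoint keys and the TransformerModel constructor arguments are assumptions.

    # Step 2: rebuild the vocabulary, update the parameters, and instantiate the encoder.
    dico = Dictionary(reloaded['dico_id2word'], reloaded['dico_word2id'], reloaded['dico_counts'])
    params.n_words = len(dico)
    params.bos_index = dico.index(BOS_WORD)
    params.eos_index = dico.index(EOS_WORD)
    params.pad_index = dico.index(PAD_WORD)
    params.unk_index = dico.index(UNK_WORD)
    params.mask_index = dico.index(MASK_WORD)

    # Build the Transformer encoder and load the pretrained weights.
    model = TransformerModel(params, dico, True, True)
    model.eval()
    model.load_state_dict(reloaded['model'])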

The model expects input sentences in BPE format (produced with the fastBPE library); sentence representations are then extracted from the pretrained model for these sentences, as sketched below:

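The sentences below are illustrative stand-ins, not the researchers' own examples; they show the expected input format: whitespace-tokenized text with fastBPE subword splits marked by "@@", each paired with its language code.

    # Step 3: example sentences, already tokenized and BPE-split with fastBPE.
    # Each sentence is paired with its language code.
    sentences = [
        ('the follow@@ ing example is toke@@ nized and split into sub@@ words .', 'en'),
        ('ce@@ tte phrase est un exemple en fran@@ cais .', 'fr'),
    ]

    # Wrap each sentence in the </s> delimiters expected by the model.
    sentences = [(('</s> %s </s>' % sent.strip()).split(), lang) for sent, lang in sentences]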

The last step is to create a batch and run a forward pass to produce the final sentence embeddings:
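
A sketch of the final step, again following the project's embedding-generation example; the 'fwd' calling convention and the use of the first hidden state as the sentence embedding are assumptions.

    # Step 4: pack the sentences into a padded batch and run a forward pass.
    bs = len(sentences)                             # batch size
    slen = max(len(sent) for sent, _ in sentences)  # longest sentence, in BPE tokens

    # Word indices of shape (slen, bs), padded with the pad index.
    word_ids = torch.LongTensor(slen, bs).fill_(params.pad_index)
    for i, (sent, _) in enumerate(sentences):
        ids = torch.LongTensor([dico.index(w) for w in sent])
        word_ids[:len(ids), i] = ids

    # Actual lengths and language IDs for each sentence.
    lengths = torch.LongTensor([len(sent) for sent, _ in sentences])
    langs = torch.LongTensor([params.lang2id[lang] for _, lang in sentences]).unsqueeze(0).expand(slen, bs)

    # Forward pass through the encoder (no causal masking).
    with torch.no_grad():
        tensor = model('fwd', x=word_ids, lengths=lengths, langs=langs, causal=False).contiguous()

    # tensor has shape (slen, bs, hidden_dim); the hidden state of the first token
    # serves as the sentence embedding.
    embeddings = tensor[0]
    print(embeddings.size())

These embeddings can then be compared with cosine similarity or fed into a downstream classifier, as described above.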