When it was introduced in September 2019, Google’s ALBERT language model achieved SOTA results on popular natural language understanding (NLU) benchmarks like GLUE, RACE, and SQuAD 2.0. Google has now released a major V2 ALBERT update and open-sourced Chinese ALBERT models.
ALBERT — as the full name “A Lite BERT” suggests — is a trimmed-down version of the company’s BERT (Bidirectional Encoder Representations from Transformers) language representation model which has become a mainstay for NLU research. The paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations has been accepted at ICLR 2020, which will be held this April in the Ethiopian capital Addis Ababa.
As outlined in the Synced report Google’s ALBERT Is a Leaner BERT; Achieves SOTA on 3 NLP Benchmarks, an ALBERT configuration similar to BERT-large has 18x fewer parameters and can be trained about 1.7x faster.
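The parameter gap can be checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate rather than the researchers' own accounting. It assumes the two parameter-reduction techniques described in the ALBERT paper (a factorized embedding and a single encoder layer shared across all depths), and the vocabulary sizes, embedding width, and function names are illustrative assumptions.

```python
def encoder_layer_params(hidden: int) -> int:
    """Weights and biases in one Transformer encoder layer (FFN width = 4 * hidden)."""
    attention = 4 * (hidden * hidden + hidden)  # Q, K, V and output projections
    feed_forward = 2 * hidden * 4 * hidden + 4 * hidden + hidden  # two dense layers
    layer_norms = 2 * 2 * hidden  # scale and shift for two layer norms
    return attention + feed_forward + layer_norms


def model_params(vocab: int, embed: int, hidden: int, layers: int,
                 share_layers: bool, max_pos: int = 512) -> int:
    """Approximate parameter count for a BERT-style encoder (hypothetical helper)."""
    # Token, position and segment embeddings, plus the embedding layer norm.
    embeddings = vocab * embed + max_pos * embed + 2 * embed + 2 * embed
    # A factorized embedding needs a projection from embed up to hidden.
    projection = 0 if embed == hidden else embed * hidden + hidden
    # With cross-layer sharing, only one layer's weights are stored.
    distinct_layers = 1 if share_layers else layers
    pooler = hidden * hidden + hidden
    return (embeddings + projection
            + distinct_layers * encoder_layer_params(hidden) + pooler)


# Assumed configurations: 24 layers, hidden size 1024 for both models.
bert_large = model_params(vocab=30522, embed=1024, hidden=1024,
                          layers=24, share_layers=False)
albert_large = model_params(vocab=30000, embed=128, hidden=1024,
                            layers=24, share_layers=True)
ratio = bert_large / albert_large

print(f"BERT-large-like:   {bert_large / 1e6:.1f}M parameters")
print(f"ALBERT-large-like: {albert_large / 1e6:.1f}M parameters")
print(f"Reduction factor:  {ratio:.1f}x")
```

Under these assumptions the estimate comes out at roughly 335M versus 18M parameters, in line with the reported 18x reduction. Most of the saving comes from storing one shared encoder layer instead of 24.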
The major changes in the ALBERT v2 models involve three strategies: no dropout, additional training data, and longer training time. The researchers trained ALBERT-base for 10M steps and the other models for 3M steps. The results show that ALBERT v2 generally performs significantly better than the first version.
The exception is ALBERT-xxlarge v2, which performs slightly worse than the first version. The researchers identify two probable causes: 1. Training for an additional 1.5M steps did not lead to a significant performance improvement; 2. For v1, the researchers ran a hyperparameter search over several parameter sets, while for v2 they adopted the v1 parameters and adjusted only the fine-tuning hyperparameters for RACE. As the researchers caution, “Given that the downstream tasks are sensitive to the fine-tuning hyperparameters, we should be careful about so called slight improvements.”
Google has also released Chinese-language ALBERT models built using training data from the Chinese Language Understanding Evaluation (CLUE) benchmark.
The paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations is on arXiv. The ALBERT v2 models are available on the project’s GitHub page.
Author: Yuqing Li | Editor: Michael Sarazen