A Google Research team further explores scaling as a way to improve language modelling, leveraging the new Pathways distributed ML system to train a 540-billion-parameter autoregressive transformer, the Pathways Language Model (PaLM), that achieves state-of-the-art few-shot performance.
A research team from the University of Washington, Facebook AI Research and the Allen Institute for AI introduces Meta-training for In-Context Learning (MetaICL), a new meta-training framework for few-shot learning in which an LM is meta-trained to learn in-context: it conditions on training examples to recover the task and make predictions.
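To make "conditioning on training examples" concrete, here is a minimal sketch of in-context prediction at inference time. It is an illustration under stated assumptions, not the MetaICL implementation: the function names, the newline separator, and the toy scorer are all hypothetical, and a real system would replace `toy_score` with a language model's likelihood of each candidate output given the context.

```python
from typing import Callable, Sequence, Tuple


def build_context(demos: Sequence[Tuple[str, str]], query: str, sep: str = "\n") -> str:
    """Concatenate the k (input, output) demonstrations, then the query input."""
    parts = [f"{x}{sep}{y}" for x, y in demos]
    parts.append(query)
    return sep.join(parts)


def predict(
    demos: Sequence[Tuple[str, str]],
    query: str,
    options: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate output that the scorer rates highest for this context."""
    context = build_context(demos, query)
    return max(options, key=lambda option: score(context, option))


def toy_score(context: str, option: str) -> float:
    # Hypothetical stand-in for an LM score: counts how often the option
    # already appears in the context. A real scorer would be model-based.
    return float(context.count(option))


demos = [
    ("great movie", "positive"),
    ("awful plot", "negative"),
    ("loved it", "positive"),
]
prediction = predict(demos, "fine acting", ["positive", "negative"], toy_score)
```

No parameters are updated here; the task is conveyed only through the demonstrations placed in the context. The meta-training stage described above, which trains the LM on many tasks presented in this format, is not shown.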
A research team from Baidu proposes ERNIE 3.0, a unified framework for pretraining large-scale, knowledge-enhanced models that can easily be tailored for both natural language understanding and generation tasks with zero-shot learning, few-shot learning or fine-tuning, and achieves state-of-the-art results on NLP tasks.
Researchers from New York University and Facebook AI, together with a CIFAR Fellow in Learning in Machines & Brains, raise doubts about large-scale pretrained language models’ few-shot learning abilities. They re-evaluate these abilities in a setting where no held-out examples are available, which they argue constitutes “true few-shot learning.”