
Google’s Universal Pretraining Framework Unifies Language Learning Paradigms

In the new paper Unifying Language Learning Paradigms, a Google Research/Brain team proposes a framework for pretraining universal language models that are effective across many different tasks. Their 20B-parameter model surpasses the 175B-parameter GPT-3 on the zero-shot SuperGLUE benchmark and triples the performance of T5-XXL on one-shot summarization tasks.

Generalization is one of the primary goals in contemporary machine learning research and is regarded as a pathway to artificial general intelligence. Although today’s pretrained large language models (LMs) continue to push the state-of-the-art in natural language processing (NLP), most such models target specific problem classes and suffer significant performance drops when applied to new tasks. Is it possible to pretrain language models that will work well across many diverse tasks?

A Google Research/Brain team addresses this question in the new paper Unifying Language Learning Paradigms, proposing UL2, a framework for pretraining universal language models that are effective across many different tasks. Their 20B-parameter model surpasses the state-of-the-art 175B-parameter GPT-3 on the zero-shot SuperGLUE benchmark and triples the performance of T5-XXL on one-shot summarization tasks.

The UL2 framework aims to build a universally applicable language model that is consistently effective across different types of datasets, tasks and setups. UL2 is driven by Mixture-of-Denoisers (MoD), a novel pretraining objective that mixes diverse pretraining paradigms so that a single model can maintain strong performance across different tasks.

MoD employs three main denoising paradigms during pretraining: R-Denoiser, a regular span-corruption denoiser that is good at acquiring knowledge rather than learning to generate fluent text; S-Denoiser, which handles the specific denoising case where a strict sequential order is observed when framing inputs-to-targets, i.e. prefix language modelling; and X-Denoiser, an extreme denoiser used when the model must recover a large part of the input while being given only a small to moderate part of it. A novel mode-switching feature enables dynamic mode switching via discrete prompting, so that the model can switch between the R, S and X denoisers on demand when learning downstream tasks.
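To make the Mixture-of-Denoisers idea more concrete, below is a minimal sketch of how each denoiser might turn a token sequence into an input/target pair, with a mode token prepended so the model knows which paradigm it is training under. The span lengths, corruption rates and mode-token names are illustrative assumptions, not the paper's exact settings.

```python
import random

# Illustrative sketch of Mixture-of-Denoisers input/target construction.
# The span lengths, corruption rates and mode-token names below are assumptions
# for demonstration, not the exact settings used in the UL2 paper.
DENOISERS = {
    "[R]": {"mean_span": 3,  "corrupt_rate": 0.15},  # regular T5-style span corruption
    "[X]": {"mean_span": 32, "corrupt_rate": 0.50},  # extreme denoising: long spans / high rate
    "[S]": {"prefix_lm": True},                      # sequential denoising (prefix language modelling)
}

def make_example(tokens, mode):
    """Turn a token list into an (input, target) pair for the chosen denoiser."""
    cfg = DENOISERS[mode]
    if cfg.get("prefix_lm"):
        # S-Denoiser: the first half is the observed context, the second half is the target.
        split = len(tokens) // 2
        return [mode] + tokens[:split] + ["<extra_id_0>"], tokens[split:]
    # R/X-Denoisers: mask one contiguous span and ask the model to reconstruct it.
    n_corrupt = max(1, int(len(tokens) * cfg["corrupt_rate"]))
    span = min(cfg["mean_span"], n_corrupt, len(tokens) - 1)
    start = random.randrange(0, len(tokens) - span + 1)
    inputs = [mode] + tokens[:start] + ["<extra_id_0>"] + tokens[start + span:]
    targets = ["<extra_id_0>"] + tokens[start:start + span]
    return inputs, targets

if __name__ == "__main__":
    tokens = "the quick brown fox jumps over the lazy dog".split()
    for mode in DENOISERS:
        src, tgt = make_example(tokens, mode)
        print(mode, " ".join(src), "->", " ".join(tgt))
```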

In their empirical study, the team conducted extensive experiments on diverse tasks ranging from supervised learning to prompt-based in-context few-shot learning. In these evaluations, the proposed UL2 outperformed a T5 baseline by 43.6 percent and GPT-like models by 76.1 percent. The team also scaled UL2 to 20B parameters and evaluated the model on 50+ NLP tasks, where it achieved state-of-the-art performance on the vast majority of tasks and setups. In zero- and few-shot experiments, UL2 surpassed the 175B-parameter GPT-3 on the zero-shot SuperGLUE benchmark.

Flax-based T5X model checkpoints for the 20B UL2 are available on the project’s GitHub. The paper Unifying Language Learning Paradigms is on arXiv.
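For readers who want to try the model outside the T5X stack, here is a hedged sketch of loading the checkpoint through the Hugging Face transformers port. The "google/ul2" identifier, the "[S2S]" mode prefix and the generation settings are assumptions that should be checked against the project's GitHub repository and model card.

```python
# Hedged usage sketch: loading the released 20B UL2 checkpoint through the
# Hugging Face `transformers` port rather than the raw Flax/T5X checkpoints.
# The "google/ul2" identifier and the "[S2S]" mode prefix are assumptions here;
# consult the project's GitHub and model card for the exact names.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/ul2")
model = T5ForConditionalGeneration.from_pretrained(
    "google/ul2", torch_dtype=torch.bfloat16  # ~20B parameters; requires substantial memory
)

prompt = "[S2S] Summarize: UL2 unifies pretraining paradigms with a mixture of denoisers. <extra_id_0>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```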


Author: Hecate He | Editor: Michael Sarazen


