
Facebook’s Flexible ‘RAG’ Language Model Achieves SOTA Results on Open-Domain QA

Researchers introduced retrieval-augmented generation - a hybrid, end-to-end differentiable model that combines an information retrieval component with a seq2seq generator.

Recent developments in large pretrained language models have led to substantial gains in the field of natural language processing (NLP). State-of-the-art approaches such as generative seq2seq transformers leverage a large amount of unlabelled text to build a general model of language understanding before being fine-tuned on specific NLP tasks such as sentiment analysis or question answering (QA). While such models are packed with potential, they also have three major downsides: they cannot easily expand or revise their memory, they can’t straightforwardly provide insight into their predictions, and they may produce occasional “hallucinations.”

To address these issues, researchers from Facebook AI, University College London and New York University recently introduced retrieval-augmented generation (RAG) — a hybrid, end-to-end differentiable model that combines an information retrieval component with a seq2seq generator and can be fine-tuned on knowledge-intensive downstream tasks to achieve state-of-the-art results.


Like standard seq2seq models, RAG takes a sequence as input and outputs a corresponding sequence. But rather than passing the input directly to the generator, RAG first uses the input to retrieve a set of relevant documents, such as articles from the Wikipedia corpus. The generator then conditions on both the original input and the retrieved documents to produce the output.
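The retrieve-then-generate flow can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the three-passage corpus, the word-overlap scoring (a stand-in for RAG’s neural retriever), the `//` separator and the helper names are all hypothetical, and no real generator is called.

```python
from collections import Counter

# Hypothetical three-passage corpus standing in for Wikipedia.
CORPUS = [
    "Paris is the capital of France.",
    "The Pacific is the largest ocean on Earth.",
    "Mount Everest is the highest mountain above sea level.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def retrieve(query, corpus, k=2):
    """Return the k passages that share the most words with the query.

    Word overlap is only a stand-in for a learned neural retriever.
    """
    query_counts = Counter(tokenize(query))

    def overlap(passage):
        return sum((query_counts & Counter(tokenize(passage))).values())

    return sorted(corpus, key=overlap, reverse=True)[:k]


def build_generator_inputs(query, passages):
    """Pair each retrieved passage with the query, one input per passage."""
    return [f"{passage} // {query}" for passage in passages]


question = "What is the capital of France?"
top_passages = retrieve(question, CORPUS, k=1)
generator_inputs = build_generator_inputs(question, top_passages)
print(generator_inputs[0])
```

A seq2seq generator would then read each passage-plus-question string and produce an answer conditioned on it.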

Unlike purely parametric pretrained models, RAG’s knowledge can be revised, expanded and even replaced, and this is its true strength, according to Facebook. Changing what a pretrained language model knows has typically meant retraining the entire model on new documents. With the proposed approach, researchers and engineers can instead control what RAG knows and doesn’t know by changing the documents it retrieves from, without retraining the whole model.

Retrieval-augmented generation (RAG) overview

RAG models combine pretrained parametric and non-parametric memory. The parametric memory is the pretrained generative seq2seq transformer, while the non-parametric memory is a dense vector index of Wikipedia that is accessed with a pretrained neural retriever. RAG thus has two sources of knowledge: the knowledge that seq2seq models store in their parameters, and the knowledge stored in the corpus from which passages are retrieved. This setup is designed to combine the flexibility of “closed-book” (parametric-only) approaches with the performance of “open-book” or retrieval-based (non-parametric) approaches, enabling RAG to excel at knowledge-intensive natural language generation tasks.
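As a rough illustration of non-parametric memory, the sketch below stores documents as vectors and looks them up by inner product with a query vector. The three-dimensional embeddings, document ids and function names are made up for the example; in a real system a trained encoder would produce the vectors, and a dedicated index library would handle search at scale.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical document embeddings; a real index would hold millions of
# high-dimensional vectors produced by a pretrained document encoder.
index = {
    "doc_paris": [0.9, 0.1, 0.0],
    "doc_pacific": [0.1, 0.8, 0.2],
    "doc_everest": [0.0, 0.2, 0.9],
}


def top_k(query_vec, index, k=2):
    """Return the ids of the k documents with the highest inner product."""
    ranked = sorted(
        index.items(),
        key=lambda item: dot(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical embedding of a question about France.
query_vec = [0.8, 0.2, 0.1]
before = top_k(query_vec, index)

# Expanding the memory means adding a vector; no model weights change.
index["doc_france_update"] = [0.85, 0.15, 0.05]
after = top_k(query_vec, index)

print(before)
print(after)
```

The second lookup returns the newly added document without any retraining, which is the property the article highlights: the retrievable knowledge lives in the index rather than in the model parameters.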

The researchers evaluated RAG on a wide range of knowledge-intensive tasks, including Open-domain Question Answering, Abstractive Question Answering, Jeopardy Question Generation and Fact Verification, each using a single Wikipedia dump as its non-parametric knowledge source.

Open-domain QA test scores
Generation and classification task test scores

For Open-domain Question Answering, the researchers used the popular datasets Natural Questions (NQ), TriviaQA (TQA), WebQuestions (WQ) and CuratedTrec (CT), with standard Exact Match (EM) as the metric. RAG achieved SOTA results on all four of these tasks. In Abstractive Question Answering, RAG-Sequence outperformed Facebook’s BART on Open MS-MARCO generation by 2.6 BLEU points and 2.6 ROUGE-L points. On the Jeopardy question generation task, RAG outperformed BART on the Q-BLEU-1 metric. The results show that RAG has advantages even in purely extractive tasks, which, along with its flexibility, suggests its broad potential.
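Exact Match counts a prediction as correct only if it equals one of the reference answers after normalization. The article does not say which normalization was applied, so the sketch below assumes the common SQuAD-style rules (lowercasing, and removing punctuation, English articles and extra whitespace); the function names are illustrative.

```python
import re
import string

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, then drop punctuation, English articles and extra spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """True if the normalized prediction equals any normalized reference."""
    normalized = normalize_answer(prediction)
    return any(normalized == normalize_answer(ref) for ref in references)


def em_score(predictions, references_per_question):
    """Percentage of questions whose prediction exactly matches a reference."""
    hits = [
        exact_match(pred, refs)
        for pred, refs in zip(predictions, references_per_question)
    ]
    return 100.0 * sum(hits) / len(hits)


# Hypothetical predictions and gold answers for two questions.
score = em_score(["The Eiffel Tower.", "London"], [["Eiffel Tower"], ["Berlin"]])
print(score)
```

Because the comparison is all-or-nothing, a prediction that is nearly right (for example, an answer with one extra word) scores zero, which makes EM a strict metric for generated answers.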

RAG has been released in the Hugging Face Transformers library. According to Facebook, with just five lines of code, researchers and engineers can quickly develop and deploy solutions to knowledge-intensive tasks using RAG. The paper, “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” is on arXiv.
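The article does not reproduce those five lines. The sketch below follows the usage pattern documented for the RAG classes in Transformers around the time of release, and several details are assumptions not taken from the article: the `facebook/rag-token-nq` checkpoint, the `use_dummy_dataset=True` shortcut (which loads a small sample index rather than the full Wikipedia index) and the helper name `answer_with_rag`. It also needs `torch`, `datasets` and `faiss` installed, and newer library versions may require different arguments.

```python
def answer_with_rag(question):
    """Generate an answer with a pretrained RAG checkpoint.

    Calling this downloads model weights and a sample index on first use.
    """
    # Imported lazily so that merely defining the helper does not require
    # transformers and its dependencies to be installed.
    from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

    name = "facebook/rag-token-nq"  # assumed checkpoint, not named in the article
    tokenizer = RagTokenizer.from_pretrained(name)
    # The dummy dataset keeps the download small; answers will be poor
    # because most of Wikipedia is missing from the index.
    retriever = RagRetriever.from_pretrained(
        name, index_name="exact", use_dummy_dataset=True
    )
    model = RagTokenForGeneration.from_pretrained(name, retriever=retriever)

    inputs = tokenizer(question, return_tensors="pt")
    generated = model.generate(input_ids=inputs["input_ids"])
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


# Example call (commented out because it triggers large downloads):
# print(answer_with_rag("who holds the record in 100m freestyle"))
```

Swapping the dummy index for a full or custom one changes what the model can look up without touching the generator weights.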

Analyst: Hecate He | Editor: Michael Sarazen; Yuan Yuan


