In the new paper Progressive Distillation for Dense Retrieval, a research team from Xiamen U and Microsoft Research presents PROD, a progressive distillation method for dense retrieval that achieves state-of-the-art performance on five widely used benchmarks.
In the new paper Tracing Knowledge in Language Models Back to the Training Data, a team from MIT CSAIL and Google Research proposes a benchmark for tracing language models’ assertions to the associated training data, aiming to establish a principled ground truth and mitigate high compute demands for large neural language model training.
An OpenAI research team fine-tunes the GPT-3 pretrained language model to enable it to answer long-form questions by searching and navigating a text-based web browsing environment, achieving retrieval and synthesis improvements and reaching human-level long-form question-answering performance.
Information retrieval (IR) is the activity of retrieving information from a collection of sources stored on computers, based on user queries. IR enjoys a history of one century , and serves as the heart of many ubiquitous applications such as web search, product recommendation, and personal feeds on social networks.