If you have ever answered a question and then wondered "…but why?", you are not alone. Humans have an innate ability to improve their learning and broaden their understanding via explanations that relate examples to principles. The machine learning community in recent years has witnessed the rapid growth of few-shot prompting of language models (LMs), which exhibit impressive transfer learning capability, successfully performing new tasks by adapting to a few in-context examples. Might these LMs benefit, as humans do, from explanations of those few-shot examples?
In the new paper Can Language Models Learn From Explanations in Context?, DeepMind researchers investigate how different types of explanations, instructions, and controls affect language models’ zero- and few-shot performance and how such explanations can support in-context learning for large language models on challenging tasks.
The team highlights their main contributions as follows:
- We annotate 40 diverse, challenging language tasks with explanations of examples, and release these annotations.
- We evaluate several LMs after prompting with or without few-shot examples, explanations, instructions, and control conditions.
- Explanations of examples in a few-shot prompt can improve the performance of large models; even without tuning, they outperform matched control conditions.
- Explanations tuned or selected using a small validation set can have larger effects.
- We analyze our results with hierarchical statistical models that respect the dependencies among tasks, items, and prompt elements. We emphasize the broader value of these methods.
The team considered a set of decoder-only transformer models ranging from 1 billion to 280 billion parameters and crafted a variety of control conditions that match different aspects of the explanations' semantics and word- or sentence-level content, including scrambled explanations, true non-explanations, and explanations of other items.
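The contrast between real explanations and their matched controls can be sketched as prompt-assembly code. This is a minimal illustration, not the paper's actual pipeline: the example items, the `build_prompt` helper, and the condition names are all hypothetical, and a scrambled control is approximated here by shuffling the explanation's words so that word-level content is preserved while the meaning is destroyed.

```python
import random

# Hypothetical few-shot examples; the paper's actual items come from
# its 40 annotated, challenging language tasks.
EXAMPLES = [
    {"question": "Is 17 a prime number?",
     "answer": "Yes",
     "explanation": "17 has no divisors other than 1 and itself."},
    {"question": "Is 21 a prime number?",
     "answer": "No",
     "explanation": "21 is divisible by 3 and 7."},
]

def build_prompt(examples, target_question, condition="explanation"):
    """Assemble a few-shot prompt; `condition` selects a real explanation,
    a scrambled-explanation control, or plain examples with none."""
    parts = []
    for ex in examples:
        parts.append(f"Q: {ex['question']}\nA: {ex['answer']}")
        if condition == "explanation":
            parts.append(f"Explanation: {ex['explanation']}")
        elif condition == "scrambled":
            # Control: same words as the explanation, randomized order.
            words = ex["explanation"].split()
            random.shuffle(words)
            parts.append("Explanation: " + " ".join(words))
        # condition == "none": few-shot examples without explanations.
    parts.append(f"Q: {target_question}\nA:")
    return "\n".join(parts)

prompt = build_prompt(EXAMPLES, "Is 29 a prime number?")
```

The key design point is that every condition changes only the prompt text, so any performance difference between explanations and controls can be attributed to the explanatory content itself.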
They tested model performance under each prompt condition on all task dataset items (except those used in the prompt), computing the model's likelihood of each answer option, choosing the highest-likelihood option from the set, and evaluating accuracy according to the answer scores defined by the task.
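The multiple-choice scoring step above amounts to an argmax over per-option likelihoods. The sketch below assumes a hypothetical `answer_log_likelihood` stand-in for an LM call; a real implementation would sum the model's token log-probabilities for each answer continuation.

```python
def answer_log_likelihood(prompt, answer):
    """Stand-in for an LM call returning the total log-likelihood the
    model assigns to `answer` as a continuation of `prompt`.
    This toy scorer just favors shorter answers, for illustration only."""
    return -float(len(answer))

def pick_answer(prompt, options):
    """Score every answer option and return the highest-likelihood one,
    mirroring the multiple-choice evaluation described above."""
    scores = {opt: answer_log_likelihood(prompt, opt) for opt in options}
    best = max(scores, key=scores.get)
    return best, scores

best, scores = pick_answer("Q: Is 29 a prime number?\nA:", ["Yes", "No"])
```

Because only relative likelihoods matter for the argmax, this scheme sidesteps free-form generation entirely: the model never has to produce the answer token-by-token, only rank the fixed option set.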
In their empirical experiments, the team examined the benefits of different prompt components for the largest (280B parameter) LM and the relative distribution of benefits from different explanation types. They also provided raw summaries of the average effects of untuned explanations across model scales.
Their findings can be summarized as follows:
- Including explanations with examples in a few-shot prompt can improve in-context task inference for language models.
- Explanations tuned or selected using a small validation set are especially effective, but only for the largest models.
The researchers believe their work can contribute to improved prompt engineering and scientific understanding of the in-context learning abilities of large LMs.
The paper Can Language Models Learn From Explanations in Context? is on arXiv.
Author: Hecate He | Editor: Michael Sarazen