Meta AI & UPF’s Toolformer: Enabling Language Models to Teach Themselves to Use External Tools

Large language models (LLMs) have revolutionized machine learning and captivated the general public with their remarkable generative abilities and capabilities for solving complex new tasks using only a few examples or text prompts. It is therefore surprising that these seemingly omniscient LLMs often struggle with basic functionalities such as arithmetic operations or factual lookup.

A team from Meta AI Research and the Universitat Pompeu Fabra addresses this limitation in the new paper Toolformer: Language Models Can Teach Themselves to Use Tools. The team proposes Toolformer, a model that self-learns how to choose and use external tools such as search engines, calculators, and translation systems via API calls to boost its performance on downstream tasks.

The team summarizes Toolformer’s desiderata as follows:

The use of tools should be learned in a self-supervised way without requiring large amounts of human annotations. This is important not only because of the costs associated with such annotations, but also because what humans find useful may be different from what a model finds useful.
The LM should not lose any of its generality and should be able to decide for itself when and how to use which tool. In contrast to existing approaches, this enables a much more comprehensive use of tools that is not tied to specific tasks.

The Toolformer approach builds on in-context learning techniques (Brown et al., 2020) to generate datasets from scratch. Given a handful of human-written examples of how to use a particular API, the LLM annotates a large language modelling dataset with candidate API calls. A self-supervised loss then determines which of these calls actually help the model predict future tokens, and the researchers fine-tune the model on the calls judged most useful.
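The filtering step can be illustrated with a minimal sketch. It assumes, following our reading of the paper, that a candidate call is kept only when inserting the call together with its result lowers the loss on the following tokens by at least some threshold, compared with inserting nothing or inserting the call without its result. The names `Candidate`, `LossFn`, and `filter_calls`, the bracket-and-arrow text format, and the idea of passing the loss in as a callable are all hypothetical choices made for illustration, not the authors' code.

```python
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    """A sampled API call (hypothetical container for this sketch)."""
    position: int   # index in the text where the call would be inserted
    call_text: str  # e.g. "Calculator(400 / 1400)"
    result: str     # text returned by executing the call


# Assumed signature: loss_fn(prefix, position) returns the language
# modelling loss on the tokens after `position` when `prefix` is inserted
# there. A real system would compute this with the LM itself.
LossFn = Callable[[str, int], float]


def filter_calls(candidates: List[Candidate],
                 loss_fn: LossFn,
                 threshold: float) -> List[Candidate]:
    """Keep only the calls whose result makes future tokens easier to predict."""
    kept = []
    for c in candidates:
        # Loss when the model sees the call and its result.
        loss_with_result = loss_fn(f"[{c.call_text} -> {c.result}]", c.position)
        # Baselines: no call at all, and the call without its result.
        loss_no_call = loss_fn("", c.position)
        loss_call_only = loss_fn(f"[{c.call_text}]", c.position)
        baseline = min(loss_no_call, loss_call_only)
        # Keep the call only if the result helps by at least `threshold`.
        if baseline - loss_with_result >= threshold:
            kept.append(c)
    return kept
```

Passing the loss in as a function keeps the sketch independent of any particular model library. In practice it would wrap a forward pass of the language model being fine-tuned.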

The approach enables the trained LLM to learn to use a variety of tools and to decide for itself which tool to call, when, and how. Notably, the team represents each API call as a text sequence, which enables the seamless insertion of API calls into any given text. The approach is thus agnostic of the dataset being annotated, which the authors say helps the model retain its generality and language modelling abilities.
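To make the idea of API calls as plain text concrete, here is a small sketch. The bracket-and-arrow format and the calculator sentence are modelled on the examples we recall from the paper, but the helper names `linearize_call` and `insert_call` are hypothetical and the exact delimiters should be treated as an assumption.

```python
from typing import Optional


def linearize_call(tool: str, argument: str, result: Optional[str] = None) -> str:
    """Render an API call, optionally with its result, as a text sequence."""
    call = f"{tool}({argument})"
    if result is None:
        return f"[{call}]"
    return f"[{call} -> {result}]"


def insert_call(text: str, char_index: int, call_text: str) -> str:
    """Splice the rendered call into ordinary text at a character offset."""
    return text[:char_index] + call_text + " " + text[char_index:]


sentence = "Out of 1400 participants, 400 (or 29%) passed the test."
call = linearize_call("Calculator", "400 / 1400", "0.29")
annotated = insert_call(sentence, sentence.index("29%"), call)
print(annotated)
# Out of 1400 participants, 400 (or [Calculator(400 / 1400) -> 0.29] 29%) passed the test.
```

Because the annotated sentence is still ordinary text, it can be used for fine-tuning in the same way as the unannotated sentence.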

In their empirical study, the team applied Toolformer to a pretrained 6.7B-parameter GPT-J LLM (Wang and Komatsuzaki, 2021) and evaluated it on downstream tasks such as mathematical reasoning and question answering. Toolformer achieved strong zero-shot results in the experiments, outperforming a much larger GPT-3 model and other baselines.

The paper Toolformer: Language Models Can Teach Themselves to Use Tools is on arXiv.

Author: Hecate He | Editor: Michael Sarazen

