The success of GPT-3 and the popularity of ChatGPT have dramatically advanced both the development of and public interest in large language models (LLMs). Although various prompting strategies have proven useful in improving performance, current transformer-based LLMs remain restricted in the computations they can express, as they can only condition on an input string of bounded length.
Recent studies have explored equipping LLMs with an external feedback loop, wherein the model’s outputs are processed and then passed back as subsequent inputs. This approach raises a question: Is augmenting an LLM with an external feedback loop merely useful, or does doing so fundamentally expand the range of computations that can be performed?
Google Brain and University of Alberta researcher Dale Schuurmans addresses this question in the new paper Memory Augmented Large Language Models are Computationally Universal, in which he demonstrates the computational universality of an LLM augmented with an associative read-write memory.
The study uses Flan-U-PaLM 540B (Chung et al., 2022) as its LLM. The augmented memory approach requires no changes to the model weights; it relies only on a simple stored instruction computer that can subsequently be programmed with a specific set of prompts.
The stored instruction computer connects the LLM to the associative memory and enables an interaction loop in which the model’s outputs are processed into subsequent input prompts, supporting general computation and convenient programmability. The external associative memory acts as a “dictionary” that maps variable names, or address locations, to values.
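The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function `toy_model` is a hypothetical stand-in for the frozen LLM, and the `op` variable, the `{name}` placeholder syntax and the `name = "value"` output format are conventions invented here for the example.

```python
import re

# Hypothetical stand-in for a frozen LLM: maps a prompt string to a completion.
# It "decrements" the number it reads in the prompt and names the next prompt to run.
def toy_model(prompt: str) -> str:
    n = int(re.search(r"n is (\d+)", prompt).group(1))
    if n == 0:
        return 'op = "halt"'
    return f'n = "{n - 1}"\nop = "step"'

# Assumed output convention: the model emits lines of the form  name = "value".
ASSIGNMENT = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

def run(model, memory, max_steps=100):
    """Fetch the prompt named by memory['op'], fill it from memory,
    call the model, and write the parsed assignments back to memory."""
    steps = 0
    while memory["op"] != "halt" and steps < max_steps:
        template = memory[memory["op"]]       # stored instruction (a prompt)
        prompt = template.format(**memory)    # substitute variable values
        output = model(prompt)                # the model weights are never modified
        for name, value in ASSIGNMENT.findall(output):
            memory[name] = value              # associative write
        steps += 1
    return memory, steps

# The associative memory is a plain dictionary holding both prompts and variables.
memory = {"op": "step", "step": "n is {n}. Decrement n.", "n": "3"}
final_memory, steps = run(toy_model, memory)
print(final_memory["n"], final_memory["op"], steps)
```

Because the prompts themselves live in the same dictionary as the variables, "programming" this toy system means writing new prompt entries into memory rather than touching the model.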
A specific “prompt program” is introduced to drive the system to simulate a universal Turing machine (a computer capable of simulating any algorithm or program). Schuurmans explains that proving the simulation’s fidelity reduces to checking a finite set of prompt-result behaviours and verifying that the LLM produces the correct output for each of the finite set of possible input prompt strings.
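The finite-check argument can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two-state transition table `RULES` is a hypothetical toy machine, not the universal Turing machine simulated in the paper, and `stub_model` and `broken_model` are stand-ins for an LLM. The point is only that, because the set of possible prompts is finite, every prompt-result pair can be enumerated and checked.

```python
from itertools import product

# Hypothetical 2-state, 2-symbol transition table (not the machine from the paper):
# (state, symbol read) -> (symbol to write, head move, next state)
RULES = {
    ("A", "0"): ("1", "R", "B"),
    ("A", "1"): ("1", "L", "B"),
    ("B", "0"): ("1", "L", "A"),
    ("B", "1"): ("1", "R", "halt"),
}

def make_prompt(state, symbol):
    # Assumed prompt format for this sketch.
    return f"state={state} symbol={symbol}"

def expected_output(state, symbol):
    write, move, nxt = RULES[(state, symbol)]
    return f"write={write} move={move} state={nxt}"

def stub_model(prompt):
    # Stand-in for the LLM that happens to answer every prompt correctly.
    fields = dict(part.split("=") for part in prompt.split())
    return expected_output(fields["state"], fields["symbol"])

def broken_model(prompt):
    # Stand-in for a model that ignores its prompt.
    return "write=0 move=R state=A"

def verify(model, states, symbols):
    """Return the list of prompts for which the model's output is wrong."""
    failures = []
    for state, symbol in product(states, symbols):
        prompt = make_prompt(state, symbol)
        if model(prompt) != expected_output(state, symbol):
            failures.append(prompt)
    return failures

good_failures = verify(stub_model, ["A", "B"], ["0", "1"])
bad_failures = verify(broken_model, ["A", "B"], ["0", "1"])
print(len(good_failures), len(bad_failures))
```

An empty failure list means the model reproduced every transition of the table, so driving it through the loop simulates the machine step by step; a single wrong completion is enough to break that guarantee.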
The study’s verification results on Flan-U-PaLM 540B demonstrate that external memory augmentation can enable universal computational behaviour in a frozen LLM.
The paper Memory Augmented Large Language Models are Computationally Universal is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang