Contemporary large language models (LLMs) like OpenAI’s GPT-4 have dramatically advanced generative AI — but are such models truly intelligent, and are they on the right path toward the field’s long-sought goal of artificial general intelligence (AGI)?
A Microsoft Research team investigates these questions in the new paper Sparks of Artificial General Intelligence: Early Experiments with GPT-4. The study demonstrates GPT-4’s ability to achieve human-level performance on novel and difficult tasks in domains ranging from mathematics and coding to vision, medicine, law and psychology; and concludes that it “could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.”
Departing from machine learning’s benchmark-based evaluations, the team borrows from traditional psychological approaches that leverage human creativity and curiosity to characterize GPT-4’s general intelligence capabilities.
The researchers probe GPT-4’s responses and behaviours to evaluate its consistency, coherence and correctness, and to reveal its limitations and biases. One of their first conclusions is that GPT-4 produces noticeably stronger results than ChatGPT. Like other LLMs, GPT-4 performs well at generating and manipulating images and audio/music and at solving math and coding problems, demonstrating strong generative, interpretive, compositional, and spatial skills.
The team further tests GPT-4’s integrative ability, i.e. how well it combines capabilities and knowledge from different domains and processes multi-modal information. The results show that GPT-4 not only learns the general principles and patterns of different domains but also synthesizes them in creative and novel ways.
GPT-4 is also shown to be capable of performing tasks that require an understanding of both the environment and humans; making distinctions between different stimuli, concepts, and situations; and determining similarity between statements — skills that represent a huge step toward AGI.
The team notes that despite this powerful performance, GPT-4’s autoregressive architecture also imposes limitations, such as a lack of planning in arithmetic and reasoning problems and in text generation. Moreover, they caution that such models could negatively impact society through inherent biases and the generation of erroneous information. These issues, they say, are challenging and will require further study to resolve.
Overall, this work details the breadth and depth of GPT-4, and proposes recognizing it as a nascent AGI system.
The paper Sparks of Artificial General Intelligence: Early Experiments with GPT-4 is on arXiv.
Author: Hecate He | Editor: Michael Sarazen