The emergence in recent years of large-scale pretrained models such as BERT, DALL-E and GPT-3 has brought a paradigm shift to the AI community. These large-scale models have become ubiquitous in areas such as computer vision, natural language processing (NLP), robotics, and inference and search, and they continue to “grow wild.”
Large-scale pretrained models have introduced emergent capabilities, and their effectiveness across such a wide range of tasks has incentivized homogenization: most of today’s state-of-the-art NLP models are derived from a handful of large transformer models. Moreover, the trend is spreading to other fields such as image, speech, protein sequence prediction and reinforcement learning (RL), portending a grand unification of the global AI community.
Needless to say, this homogenization has benefits, as even slight improvements in a large-scale model can quickly engender a large family of new models. There are, however, also pitfalls, as any flaws in a large-scale model will likely be inherited by its downstream models.
While the power of large-scale models is a result of their huge parameter spaces (GPT-3 has 175 billion parameters), this also leads to poor interpretability and uncertainty regarding their capabilities and shortcomings. In this context, is it really wise to blindly shift the entire AI research paradigm to large-scale models?
To explore this crucial question, Percy Liang, Fei-Fei Li, and over 100 other researchers from Stanford University’s Center for Research on Foundation Models (CRFM) have published the 200+ page paper On the Opportunities and Risks of Foundation Models, which systematically describes both the opportunities and risks of such large-scale pretrained “foundation” models. The unique study aims to provide a clearer understanding of how these foundation models work, when and why they fail, and the various capabilities provided by their emergent properties.
The team uses two aspects to describe the significance of foundation models: emergence and homogenization, which, they note, “interact in a potentially unsettling way.” Emergence means that the behaviour of such systems is implicitly induced rather than explicitly constructed, while homogenization refers to the consolidation of methodologies for building machine learning systems across a wide range of applications.
Homogenization could potentially provide enormous gains in domains where task-specific data is limited. The paper notes however that emergence generates substantial uncertainty regarding the capabilities and flaws of foundation models, and as such aggressive homogenization through these models is “risky business.”
From ethical and AI safety perspectives, the “derisking” of foundation models is therefore a central challenge in their further development. The paper stresses the need for caution, and proposes it is time to establish professional norms that will enable responsible research and deployment of foundation models.
The CRFM, formed by Stanford faculty, students, and researchers, is a new interdisciplinary program at the Stanford Institute for Human-Centered AI (HAI). A workshop discussing the opportunities, challenges, limitations, and societal implications of foundation models will be held on August 23-24.
The new paper offers a thorough account of foundation models, covering their capabilities in language, vision, robotics, reasoning and human interaction as well as technical principles such as model architectures, training procedures, data, systems, security, evaluation and theory. It also looks at their applications in fields such as law, healthcare, and education; and associated societal risks such as inequity, misuse, economic and environmental impact, and legal and ethical considerations.
The Stanford team believes their study can play an important role in orienting and framing dialogue on foundation models, and they encourage deep interdisciplinary collaboration to ensure these models’ responsible development and deployment.
The paper On the Opportunities and Risks of Foundation Models is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang