Large Language Models (LLMs) have showcased remarkable proficiency in tackling complex tasks, ranging from quantitative reasoning to understanding natural language. However, their effectiveness in addressing open problems has been limited by a tendency to confabulate or fall short of surpassing existing results.
In a recent paper titled “Mathematical Discoveries from Program Search with Large Language Models,” a research team from Google DeepMind, the University of Wisconsin-Madison, and Université de Lyon introduces FunSearch, an LLM-guided evolutionary procedure that searches over programs rather than directly over solutions. FunSearch not only surpasses the best known results on established open problems but also leads to the discovery of new algorithms.
FunSearch synergistically combines a pre-trained (frozen) Large Language Model with an evaluator, creating a dynamic interplay between creativity and validation. The iterative process involves evolving initial low-scoring programs into high-scoring ones, thereby uncovering new knowledge.
The input to FunSearch comprises a problem specification in the form of an evaluate function, an initial implementation of the function to be evolved (which may be trivial), and optionally a skeleton. At each iteration, the method builds a prompt by combining programs sampled from a database, favoring high-scoring ones. This prompt is fed to the pre-trained LLM, which generates new programs. Programs that run correctly are scored and stored in the database, forming a feedback loop that steadily raises the quality of the programs available for later prompts; the LLM itself stays frozen. Users can retrieve the highest-scoring programs at any point during the process.
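The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation: `fake_llm` is a hypothetical stand-in that returns canned candidates instead of calling a real model, the scoring objective (fit `f(x)` to `x * x`) is invented for the example, and the database is a plain list rather than anything more elaborate.

```python
import random

def evaluate(program_src):
    """Score a candidate program; return None if it fails to run."""
    namespace = {}
    try:
        exec(program_src, namespace)  # a real system would sandbox this
        f = namespace["f"]
        # Toy objective (an assumption): how closely f matches x * x.
        return -sum((f(x) - x * x) ** 2 for x in range(5))
    except Exception:
        return None

def fake_llm(prompt, rng):
    """Hypothetical stand-in for the frozen LLM; it ignores the prompt."""
    bodies = ["x", "x + 1", "2 * x", "x * x", "x / 0", "x *"]
    return "def f(x):\n    return " + rng.choice(bodies)

def funsearch_loop(initial_src, iterations, seed=0):
    rng = random.Random(seed)
    database = [(evaluate(initial_src), initial_src)]
    for _ in range(iterations):
        # Build the prompt from the highest-scoring programs so far.
        best = sorted(database, key=lambda p: p[0], reverse=True)[:2]
        prompt = "\n\n".join(src for _, src in best)
        candidate = fake_llm(prompt, rng)
        score = evaluate(candidate)
        if score is not None:  # discard programs that crash
            database.append((score, candidate))
    # The best program can be read out at any time.
    return max(database, key=lambda p: p[0])

best_score, best_src = funsearch_loop("def f(x):\n    return 0", iterations=50)
```

Broken candidates such as `x / 0` or `x *` never enter the database, which is the role the evaluator plays in guarding against confabulated output.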
The efficacy of FunSearch lies in several crucial elements. Firstly, the team incorporates the best-performing programs back into prompts, allowing the LLM to build on its successes. Secondly, they initiate the process with a program skeleton, evolving only the segment governing critical program logic while retaining boilerplate code and prior structural information.
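To make the skeleton idea concrete, here is a minimal sketch using the cap set setting: a fixed greedy routine builds a set of vectors in Z_3^n with no three on a line, and only a small `priority` function is left for the LLM to rewrite. The function names and the greedy strategy are assumptions chosen for illustration, not details taken from this article.

```python
import itertools

def priority(vector, n):
    """The only part an LLM would evolve; this starting point is trivial."""
    return 0.0

def solve(n):
    """Fixed skeleton: greedily build a cap set in Z_3^n."""
    vectors = sorted(
        itertools.product(range(3), repeat=n),
        key=lambda v: -priority(v, n),  # highest priority first
    )
    cap_set, blocked = [], set()
    for v in vectors:
        if v in blocked:
            continue
        for u in cap_set:
            # Block the third point on the line through u and v.
            blocked.add(tuple((-(a + b)) % 3 for a, b in zip(u, v)))
        cap_set.append(v)
    return cap_set

def evaluate(n):
    """Score a candidate priority function by the size of the set it yields."""
    return len(solve(n))
```

With this split, every candidate the LLM proposes yields a valid cap set by construction; the evolved code only influences the order in which vectors are tried, and therefore the size of the result.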
FunSearch demonstrates its potential by addressing the cap set problem—a longstanding challenge that has perplexed mathematicians across various research areas for decades. Renowned mathematician Terence Tao once hailed it as his favorite open question. Furthermore, FunSearch outperforms state-of-the-art computational solvers, showcasing its scalability beyond their current capabilities.
In conclusion, FunSearch stands as a pioneering methodology that not only pushes the boundaries of LLM-guided evolution but also makes significant contributions to mathematical discovery. Its success in solving open problems and surpassing existing computational capabilities underscores its potential as a valuable tool in advancing scientific inquiry.
The paper Mathematical discoveries from program search with large language models is published in Nature.
Author: Hecate He | Editor: Chain Zhang

