Despite the impressive achievements of deep neural networks (DNNs) in recent years, researchers generally believe these models are still not “intelligent” enough to solve advanced mathematical problems in calculus, differential equations, linear algebra, etc.
In the new paper A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More, a research team from MIT, Columbia University, Harvard University and the University of Waterloo challenges this assumption, proposing a neural network that can solve university-level mathematics problems by turning questions into programming tasks, i.e. program synthesis.
The team says theirs is the first demonstration of a neural network capable of solving university-level mathematics problems, which it does by combining two recent innovations:
- Neural networks pretrained on text and fine-tuned on code, rather than pretrained on text alone.
- Novel techniques that automatically augment problems with context so neural networks can synthesize correct executable programs.
The proposed neural networks can solve a wide range of advanced mathematics problems taken from MIT mathematics courses such as Single and Multi-variable Calculus, Differential Equations, Probability and Statistics, Linear Algebra, and Mathematics for Computer Science. The researchers randomly selected 25 questions from each of these courses, turned them into programming tasks using OpenAI’s Codex (Chen et al., 2021), then ran the resulting programs to solve the problems.
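The question-to-program workflow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `build_prompt` and `run_program` are hypothetical helper names, the prompt wording is an assumption, and `generated_program` is a hand-written stand-in for what a code model such as Codex might return (no model is actually called here).

```python
def build_prompt(question: str, context: str) -> str:
    """Phrase a course question as a programming task (hypothetical prompt format)."""
    return (
        f"# Context: {context}\n"
        f"# Write a program that answers: {question}\n"
        "# Store the result in a variable named `answer`.\n"
    )


def run_program(source: str):
    """Execute program text and return the value it stored in `answer`."""
    namespace = {}
    exec(source, namespace)  # only safe for trusted code; shown for illustration
    return namespace["answer"]


question = "What is the determinant of the matrix [[1, 2], [3, 4]]?"
prompt = build_prompt(question, context="Linear Algebra")

# Hand-written stand-in for a model-generated program (assumption).
generated_program = """
m = [[1, 2], [3, 4]]
answer = m[0][0] * m[1][1] - m[0][1] * m[1][0]
"""

result = run_program(generated_program)
print(result)  # -2
```

In a real setup the prompt would be sent to a code model and the returned program text executed in a sandbox rather than with a bare `exec`.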
The paper includes a number of examples showing how Codex turns input questions into programming tasks whose programs produce the correct outputs. The main techniques are:
- Topic context, which augments differential equation questions with their topic.
- Library context, which uses the SymPy and StreamPlot Python libraries to solve questions and plot visualizations.
- Definition context and probabilistic programming, which can turn a probability and statistics question into a probabilistic programming task that generates simulations.
- Interaction, which is used to produce multiple plots.
- Question simplification, which rephrases long or wordy questions into concise prompts and a series of shorter questions.
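As a rough illustration of turning a probability question into a simulation, consider the question "Two fair dice are rolled. What is the probability that their sum is 7?" The sketch below is an assumed example, not taken from the paper: the question, the function name `estimate_sum_probability`, the trial count, and the fixed seed are all choices made here for illustration.

```python
import random


def estimate_sum_probability(target: int, trials: int, seed: int = 0) -> float:
    """Estimate P(sum of two fair dice == target) by repeated simulation."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == target:
            hits += 1
    return hits / trials


estimate = estimate_sum_probability(target=7, trials=100_000)
exact = 6 / 36  # six of the 36 equally likely outcomes sum to 7
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

The simulated value only approximates the exact answer of 1/6; more trials tighten the estimate at the cost of run time.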
Although the paper does not reveal technical details of the proposed DNN, it provides rich empirical results demonstrating that transformers pretrained on text and fine-tuned on code can achieve perfect performance on questions from university-level mathematics courses. The team also shows that prompt generation methods can enable transformers to generate question-solving programs for math subjects, including solutions with plots.
The team conducted a student survey to evaluate the quality and difficulty of their machine-generated questions compared to human-written questions, with the following findings:
- The machine-generated questions were rated slightly more difficult than human-written questions, though within the confidence intervals.
- The human-written questions were rated slightly more appropriate for the courses than the machine-generated questions.
- The human-written questions were rated slightly more likely to be human-written, while the machine-generated questions were rated as equally likely to be machine-generated or human-written.
Overall, this work shows that transformers pretrained on text and fine-tuned on code can automatically solve, grade, and generate university-level mathematics course questions in real time using program synthesis. The team believes this is an opportunity to address major pedagogical challenges and could bring benefits such as automatic evaluation and content generation to higher education.
The paper A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More is on arXiv.
Author: Hecate He | Editor: Michael Sarazen