
Back to School: MIT & UWaterloo Model Gets an ‘A’ on ML Course Problems

MIT and University of Waterloo researchers propose a machine learning model that outperforms the average student on problems from MIT’s 6.036 Introduction to Machine Learning course.

Can a machine solve academic machine learning (ML) problems? A research team from MIT and the University of Waterloo says yes, and backs the claim with an ML model that solves problems from MIT’s 6.036 Introduction to Machine Learning course. The proposed model achieved an overall accuracy of 96 percent on open-response questions and 97 percent on multiple-choice questions, surpassing the average MIT student score of 93 percent.


The impressive development of contemporary natural language processing (NLP) techniques has endowed ML models with the ability to solve mathematical reasoning questions by predicting the associated equations. But even strong models tend to struggle with problems that involve basic linear algebra and calculus, which are prerequisites for ML.

The team says their study, Solving Machine Learning Problems, is the first to successfully solve such ML problems using ML. Their proposed model can handle problems that involve perceptrons, logistic regression, convolutional neural networks, state machines, reinforcement learning and more. Their model is also the first to approach ML problems using expression trees, and can even automatically generate hints that can be used to help students learn.

The team summarizes the key components contributing to the success of their approach: 1) A dataset of ML questions annotated with expression trees representing their solutions; 2) A data augmentation technique that enables the automatic generation of new and related questions; 3) The use of transformers and graph neural networks to generate expression trees for solving problems, rather than just predicting numerical answers.

The dataset was built from the exercises, homework and quizzes of the 6.036 Intro to ML course. The team enhanced it by adding expression trees that represent how each answer is calculated from the information in the question, and augmented each problem with a paraphrase of the original question text.


The model takes questions that require a math expression to solve and returns an answer to each one. Each input question is encoded by a transformer model and passed into a graph that breaks the question into its words and numerical components. The transformer and graph embeddings of the question are then passed together into a graph neural network (GNN) to create a new embedding, which a tree decoder processes to generate an expression tree. Finally, the model evaluates the generated expression tree and outputs the computed value.
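The last step of this pipeline, evaluating an expression tree to obtain a value, can be illustrated with a minimal sketch. The `Node` class, the operator table and the example question are assumptions made for illustration; they are not taken from the paper, whose trees may use a different representation and a larger operator set.

```python
import operator
from dataclasses import dataclass
from typing import Union

# Hypothetical operator set; the paper's trees may support more operations.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


@dataclass
class Node:
    """An internal tree node: a binary operator applied to two subtrees."""
    op: str
    left: "Tree"
    right: "Tree"


# A tree is either an operator node or a numeric leaf taken from the question.
Tree = Union[Node, float]


def evaluate(tree: Tree) -> float:
    """Recursively evaluate an expression tree to a single number."""
    if isinstance(tree, Node):
        return OPS[tree.op](evaluate(tree.left), evaluate(tree.right))
    return float(tree)


# Illustrative question: the pre-activation w . x + b of a perceptron
# with w = (2, -1), x = (3, 4) and b = 0.5.
tree = Node("+", Node("+", Node("*", 2.0, 3.0), Node("*", -1.0, 4.0)), 0.5)
result = evaluate(tree)
print(result)
```

Because the model outputs a tree rather than a bare number, the intermediate structure (here, two products followed by two sums) remains available for inspection.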

The team evaluated their model’s performance using two metrics: expression accuracy, the proportion of answers with the correct expression, and value accuracy, the proportion of answers with the correct numerical value.
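The difference between the two metrics can be sketched as follows. This is an illustration under stated assumptions, not the paper's evaluation code: expressions are represented as nested tuples, expression accuracy is taken to mean an exact structural match with the reference tree, and value accuracy compares evaluated results within a small tolerance. The sample predictions are made up.

```python
import math
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(tree):
    """Evaluate a nested-tuple expression such as ("+", 1, ("*", 4, 5))."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left), evaluate(right))
    return float(tree)


def expression_accuracy(preds, refs):
    """Fraction of predictions whose tree matches the reference exactly."""
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)


def value_accuracy(preds, refs, tol=1e-6):
    """Fraction of predictions whose evaluated value matches the reference."""
    hits = sum(
        math.isclose(evaluate(p), evaluate(r), abs_tol=tol)
        for p, r in zip(preds, refs)
    )
    return hits / len(refs)


# Made-up examples: the first prediction swaps operands (right value,
# different tree), the second is exact, the third is wrong.
refs = [("*", 2, 3), ("+", 1, ("*", 4, 5)), ("-", 10, 4)]
preds = [("*", 3, 2), ("+", 1, ("*", 4, 5)), ("-", 10, 5)]

expr_acc = expression_accuracy(preds, refs)
val_acc = value_accuracy(preds, refs)
print(expr_acc, val_acc)
```

Under these definitions a prediction can reach the right value through a different expression, so value accuracy is never lower than expression accuracy.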


In the experiments, the proposed model achieved an overall average expression accuracy of 95 percent, a value accuracy of 96 percent for open-response questions, and an accuracy of 97 percent for multiple-choice questions, a grade-A performance. Moreover, the model performed at superhuman speed, answering each question in mere milliseconds.

This work demonstrates that machines can solve undergraduate-level university ML problems with high accuracy. The researchers believe their model’s ability to show how a problem is solved and to generate hints could help advance future studies in the area of explainable AI.

The paper Solving Machine Learning Problems is on arXiv.

Author: Hecate He | Editor: Michael Sarazen, Chain Zhang
