
DeepMind’s AlphaFold2 Predicts Protein Structures with Atomic-Level Accuracy

In a new paper published in the prestigious scientific journal Nature, DeepMind presents AlphaFold2, a redesigned neural-network system based on last year’s AlphaFold that can predict protein structures with atomic-level accuracy.

The prediction of protein structures from amino acid sequence information alone, known as the “protein folding problem,” has been an important open research question for more than 50 years. In the fall of 2020, DeepMind’s neural network model AlphaFold took a huge leap forward in solving this problem, outperforming some 100 other teams in the Critical Assessment of Structure Prediction (CASP) challenge, regarded as the gold-standard accuracy assessment for protein structure prediction. The success of the novel approach is considered a milestone in protein structure prediction.

This week, DeepMind’s paper Highly Accurate Protein Structure Prediction with AlphaFold was published in Nature. The paper introduces AlphaFold2, a completely redesigned and open-sourced model that can predict protein structures with atomic-level accuracy.


Although machine learning researchers have long sought to develop computational methods for predicting 3-D protein structures from protein sequences, progress along this path had been limited, chiefly due to the computational intractability of molecular simulation, the context-dependence of protein stability, and the difficulty of producing sufficiently accurate models of protein physics.

In this work, the DeepMind team introduces the first computational approach capable of predicting protein structures to near experimental accuracy. The proposed AlphaFold2 model achieved “outstanding” results in the recent CASP14 assessment.


AlphaFold2’s achievements are based on neural network architectures that jointly embed multiple sequence alignments (MSAs) and pairwise features. The AlphaFold network can directly predict the 3-D coordinates of all heavy atoms for a given protein using the primary amino acid sequence and aligned sequences of homologues as inputs. The network consists of two large modules: Evoformer and a Structure Prediction Module.
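To make the input concrete, here is a minimal Python sketch of a multiple sequence alignment and a per-column residue frequency profile. The toy sequences and the helper name column_profile are illustrative assumptions, not part of AlphaFold2, which learns its own embedding of the MSA rather than relying on a hand-built profile like this one.

```python
from collections import Counter

# Toy MSA (assumed data): the first row is the query sequence, the others
# are aligned homologues; '-' marks an alignment gap.
msa = [
    "MKTAY",
    "MKSAY",
    "MRTA-",
    "MKTVY",
]

def column_profile(msa):
    """Return, for each alignment column, the fraction of each symbol."""
    depth = len(msa)
    length = len(msa[0])
    assert all(len(row) == length for row in msa), "rows must be aligned"
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in msa)
        profile.append({sym: n / depth for sym, n in counts.items()})
    return profile

profile = column_profile(msa)
for position, freqs in enumerate(profile):
    print(position, freqs)
```

Columns where every homologue agrees (such as the first one here) are conserved; columns that vary together across homologues are the kind of signal that hints at which residues sit close to each other in the folded protein.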

Evoformer treats protein structure prediction as a graph inference problem: the nodes of the graph represent amino-acid residues, and the edges represent the proximity of residue pairs to one another in the protein. By applying deep learning techniques, Evoformer gradually refines a forecast for what the backbone of the protein should look like, then passes the prediction results to the Structure Prediction Module.
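The graph view can be illustrated with a small Python sketch that builds a proximity graph from known coordinates. This is only an illustration of the data structure: the coordinates, the 8-angstrom cutoff, and the function name contact_graph are assumptions made here, and the real network has to infer such proximity relations from sequence data rather than read them off a solved structure.

```python
import math

# Hypothetical C-alpha coordinates (in angstroms) for a five-residue toy chain.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

CONTACT_CUTOFF = 8.0  # assumed threshold in angstroms, chosen for illustration

def contact_graph(coords, cutoff=CONTACT_CUTOFF):
    """Build an undirected graph: nodes are residue indices, and an edge
    links every residue pair whose distance is below the cutoff."""
    edges = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            distance = math.dist(coords[i], coords[j])
            if distance < cutoff:
                edges[(i, j)] = distance
    return edges

edges = contact_graph(coords)
for (i, j), distance in sorted(edges.items()):
    print(f"residues {i} and {j}: {distance:.2f} A")
```

In this toy chain residues 0 and 4 are four positions apart in the sequence yet still share an edge, because the chain bends back on itself. Residues 0 and 3 fall outside the cutoff and get no edge.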

The Structure Prediction Module performs a series of geometric transformations to further refine the protein’s shape for greater accuracy. This module’s abstract 3D protein images appear as twisted, ribbonlike curlicues that branch off from the main protein backbone.
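As a rough illustration of what a geometric transformation of a structure means, the Python sketch below applies one rigid motion (a rotation followed by a translation) to a small set of 3D points. It assumes the transformations are rigid motions and, for brevity, rotates only about the z-axis; the point set and function names are hypothetical, and this is not the module's actual update rule.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def rigid_transform(points, angle, translation):
    """Rotate every point about the z-axis, then translate it."""
    tx, ty, tz = translation
    moved = []
    for p in points:
        x, y, z = rotate_z(p, angle)
        moved.append((x + tx, y + ty, z + tz))
    return moved

# Hypothetical three-atom fragment (coordinates in angstroms).
fragment = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)]
moved = rigid_transform(fragment, math.pi / 2, (1.0, 2.0, 3.0))

# A rigid motion changes where the fragment sits, not its internal geometry.
before = math.dist(fragment[0], fragment[2])
after = math.dist(moved[0], moved[2])
print(f"distance before: {before:.4f}, after: {after:.4f}")
```

Because rotations and translations preserve distances, a fragment can be repositioned relative to the rest of the chain without distorting its own bond geometry.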


As described at the CASP14 conference, AlphaFold2’s methodological advances include:
1) Starting from multiple sequence alignments (MSAs) rather than from more processed features such as inverse covariance matrices derived from MSAs.
2) Replacement of 2D convolution with an attention mechanism that better represents interactions between residues distant along the sequence.
3) Use of a two-track network architecture in which information at the 1D sequence level and the 2D distance map level is iteratively transformed and passed back and forth.
4) Use of an SE(3)-equivariant transformer network to directly refine atomic coordinates (rather than 2D distance maps as in previous approaches) generated from the two-track network.
5) End-to-end learning in which all network parameters are optimized by backpropagation from the final generated 3D coordinates through all network layers back to the input sequence.
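The second advance above, attention in place of convolution, can be sketched in a few lines of Python. This is a generic single-head scaled dot-product self-attention with the learned projections left out (queries, keys and values are simply the raw feature vectors), which is a simplifying assumption; the toy features are invented, and nothing here reproduces AlphaFold2’s actual layers.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention with identity
    projections: every position attends to every other position."""
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    weights, outputs = [], []
    for query in features:
        scores = [scale * sum(a * b for a, b in zip(query, key))
                  for key in features]
        row = softmax(scores)
        weights.append(row)
        outputs.append([
            sum(row[j] * features[j][d] for j in range(len(features)))
            for d in range(dim)
        ])
    return weights, outputs

# Toy per-residue features (assumed): residues 0 and 5 are far apart in the
# sequence but have similar feature vectors.
features = [
    [2.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [2.0, 0.0],
]
weights, outputs = self_attention(features)
print("residue 0 -> residue 1:", round(weights[0][1], 3))
print("residue 0 -> residue 5:", round(weights[0][5], 3))
```

Residue 0 puts more weight on residue 5 than on its immediate neighbour, even though the two are five positions apart. A small convolution kernel would need several stacked layers before those two positions could influence each other, whereas attention connects them in a single step.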

AlphaFold has now clearly demonstrated its effectiveness in this important and rapidly evolving research field, and DeepMind believes the model, along with related computational approaches that apply its techniques to other biophysical problems, could soon become an essential tool in cutting-edge biology research.

The AlphaFold2 code is available on the project GitHub. The paper Highly Accurate Protein Structure Prediction with AlphaFold is on Nature.

Author: Hecate He | Editor: Michael Sarazen, Chain Zhang
