Quantum algorithms for training wide and classical neural networks have become one of the most promising research areas for quantum computer applications. While neural networks have achieved state-of-the-art results across many benchmark tasks, existing quantum neural networks have yet to clearly demonstrate quantum speedups for tasks involving classical datasets. Given deep learning’s ever-rising computational requirements, the use of quantum computers to efficiently train deep neural networks is a research field that could greatly benefit from further exploration.
Motivated by the success of classical deep neural networks (DNNs), a team from the Massachusetts Institute of Technology and Google Quantum AI has proposed a quantum algorithm designed to train such networks in logarithmic time. The team provides compelling evidence of their proposed method’s efficiency on the standard MNIST image dataset.
Although most current algorithms for quantum machine learning are based on methods from linear algebra, neural networks rely on nonlinearity to act as universal approximators. Recent studies have therefore introduced neural tangent kernels (NTKs), which represent overparameterized neural networks as linearized models applied to nonlinear features.
NTKs define a kernel between any pair of data examples, describing the evolution of deep neural networks during their training by gradient descent. In the infinite-width limit, for instance, neural networks using a Gaussian distribution described by this kernel can yield results comparable to networks trained with gradient descent.
The NTK formalism sheds light on the benefits of DNNs by increasing the number of hidden layers. As a neural network is deepened, the NTK matrix becomes more well-conditioned, speeding up the network’s training via gradient descent. While the effective conditioning of the NTK required for convergence by gradient descent corresponds to a necessary condition for an efficient quantum algorithm to train the neural network, existing quantum algorithms often fail in this regard due to stringent theoretical caveats such as matrix sparsity.
The main contribution of this work is a quantum algorithm designed to train wide and deep neural networks under an approximation of the NTK, estimating the trained neural network output with vanishing error as the size of the training set increases. The team provides two different approximations: a sparsified NTK that only permits O(log n) off-diagonal elements to be nonzero in any row or column; and a diagonal NTK that sets all off-diagonal elements of the NTK to zero. In both cases, convergence of the approximation is done by matrix element bounds of the NTK, while the same bounds also directly enable efficient gradient descent.
The team summarizes their proposed full quantum algorithm’s approach re the approximate NTK as: 1) Assume the existence of a quantum random access memory (QRAM) to store and access any necessary quantum states; 2) Use amplitude estimation and median evaluation to evaluate inner products between data example to compute NTK elements; 3) Post-select to prepare the NTK between the test data point and the training set.
The team conducted experiments on the MNIST binary image classification task to evaluate their method.
The experiment results demonstrate that the MNIST image dataset satisfies the necessary conditions for efficient input/output, and validate the proposed quantum algorithm’s ability to achieve an end-to-end exponential speedup over gradient descent.
The paper A Quantum Algorithm for Training Wide and Deep Classical Neural Networks is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang
We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.