A new study suggests DeepMind’s amazing game-playing algorithm AlphaZero could help unlock the power and potential of quantum computing.
Since it appeared just over two years ago, AlphaZero has repeatedly proven its fast-learning capabilities, elevating itself to grandmaster level in Go, chess, and shogi (Japanese chess). Traditional game engines such as IBM's groundbreaking Deep Blue from the 1990s and current world computer chess champion Stockfish rely on heuristics handcrafted by human experts. AlphaZero takes a very different approach: provided only the basic rules, it hones its skills through millions of games of self-play in a reinforcement learning environment.
But AlphaZero is about much more than games — its success demonstrates that a single algorithm can learn how to discover new knowledge across a range of scenarios, writes DeepMind’s David Silver in a blog post. This is key to creating general-purpose systems to pursue artificial general intelligence (AGI): “We need them to be flexible and generalise to new situations.”
Echoing DeepMind's vision, a research team from Denmark's Aarhus University (AU) has further demonstrated AlphaZero's broad applicability by applying it to three different control problems relevant to building a quantum computer. The research is presented in a paper recently published in npj Quantum Information, a Nature Partner Journal.
Much of quantum computing's potential lies in its ability to achieve what classical computers cannot, by exploiting superposition and entanglement rather than checking possibilities one at a time. Realizing that potential, however, requires precisely controlling the underlying quantum systems. While many algorithms have been developed for optimizing quantum dynamics, a common limitation is their reliance on good initial guesses: started from a poor guess, they tend to converge to a mediocre local solution.
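The initial-guess problem is generic to local optimization, not specific to quantum control. A toy illustration (our own example, not from the paper): gradient descent on a simple nonconvex "control landscape" lands in a shallow local minimum or the global one depending purely on where it starts.

```python
# Toy illustration (not from the paper): gradient descent on a nonconvex
# landscape converges to different optima depending on the initial guess.

def f(x):
    # A double-well function: a shallow local minimum near x ~ -0.96
    # and a deeper global minimum near x ~ +1.04.
    return x**4 - 2 * x**2 - 0.3 * x

def grad(x):
    return 4 * x**3 - 4 * x - 0.3

def gradient_descent(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

bad_start = gradient_descent(-1.5)   # ends in the shallow local minimum
good_start = gradient_descent(1.5)   # ends in the global minimum
print(f(bad_start) > f(good_start))  # → True: the bad start yields a worse optimum
```

A self-learning search such as AlphaZero sidesteps this by exploring the solution space broadly rather than descending from a single starting point.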
The AU researchers reckoned AlphaZero’s game-proven self-learning capabilities could enable it to systematically bypass that limitation. They decided to implement the algorithm from scratch and investigate how it performed on quantum computer optimization problems, paper coauthor and AU Professor Jacob Sherson told Synced in an email.
The paper's first author, PhD student Mogens Dalgaard, explains: "When we analyzed the data from AlphaZero we saw that the algorithm had learned to exploit an underlying symmetry of the problem that we did not originally consider. That was an amazing experience."
AlphaZero's success derives from combining traditional Monte Carlo Tree Search (MCTS) with a deep neural network (DNN) that provides one-step lookahead evaluations. The lookahead information gathered deep in the search tree in turn sharpens the trained DNN's estimates, producing exploration that is both focused and free of handcrafted heuristics.
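The interplay can be sketched in miniature. Below is a simplified, self-contained PUCT-style MCTS (our own illustration, with a hard-coded stand-in "network"; real AlphaZero trains the network through self-play). Note how the search's accumulated reward statistics override a deliberately misleading network prior:

```python
import math

ACTIONS = (0, 1)
DEPTH = 2
# Terminal rewards for each two-move sequence; the sequence (1, 1) is best.
REWARDS = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.0, (1, 1): 1.0}

def network(state):
    """Stub policy/value net. Its prior wrongly prefers action 0."""
    prior = {0: 0.7, 1: 0.3}
    value = 0.0  # uninformative value estimate
    return prior, value

class Node:
    def __init__(self, prior):
        self.prior = prior          # network's prior probability of this move
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}

    def q(self):
        # Mean value of simulations through this node.
        return self.value_sum / self.visits if self.visits else 0.0

def puct_select(node, c=1.5):
    # PUCT rule: exploit high mean value, explore high-prior/low-visit moves.
    sqrt_total = math.sqrt(node.visits)
    return max(node.children.items(),
               key=lambda kv: kv[1].q()
               + c * kv[1].prior * sqrt_total / (1 + kv[1].visits))

def simulate(root):
    state, node, path = (), root, [root]
    while len(state) < DEPTH:
        if not node.children:                      # leaf: expand with priors
            prior, _ = network(state)
            node.children = {a: Node(prior[a]) for a in ACTIONS}
            break
        action, node = puct_select(node)           # descend via PUCT
        state = state + (action,)
        path.append(node)
    # Back up terminal reward (or the network's value at a non-terminal leaf).
    value = REWARDS[state] if len(state) == DEPTH else network(state)[1]
    for n in path:
        n.visits += 1
        n.value_sum += value

root = Node(prior=1.0)
for _ in range(400):
    simulate(root)

# Despite the prior favoring action 0, visit counts converge on action 1.
best = max(root.children, key=lambda a: root.children[a].visits)
print(best)  # → 1
```

The visit counts that the search produces are exactly what AlphaZero then uses as training targets for its policy network, closing the loop between search and learning.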
When applied to quantum computing, AlphaZero achieves substantial improvements over earlier methods in both the quality and the number of good solution clusters it finds. "It is able to spontaneously learn unexpected hidden structure and global symmetry in the solutions, going beyond even human heuristics," the researchers explain.
The team found that the system achieved the best results when they combined AlphaZero’s algorithm with a specialized quantum optimization algorithm. “This is very interesting because it points to a future where the off the shelf AI algorithms do not simply take over and dominate the special domains but that the domain specialists, in this case us physicists, can interpret strengths and weaknesses of the general approaches and augment them with our detailed knowledge and methods,” Sherson wrote.
Sherson says that within a few hours of the project code being open-sourced, "I was contacted by major tech-companies with quantum laboratories and international leading universities to establish future collaboration. So we hope that our work will soon be put to use in practice."
The paper Global Optimization of Quantum Dynamics with AlphaZero Deep Exploration is available on the npj Quantum Information website.
Journalist: Yuan Yuan | Editor: Michael Sarazen