Research Talk Review

Richard Sutton: The Future of Artificial Intelligence Belongs to Search and Learning

The future of AI belongs to scalable methods, search and learning, as presented by Richard Sutton in a seminar at the University of Toronto.
Fields Institute, University of Toronto, Toronto, Canada. Machine Learning Advances and Applications Seminar Series, Thursday, October 27.


When mankind finally comes to understand the principles of intelligence and how they can be embodied in machines, it will be the most important discovery of our age, perhaps of any age. In recent years, with the progress in deep learning and other areas, this great scientific prize almost appears to be within our reach. The consequences, benefits, and threats for humanity have become popular topics in the press, at public policy meetings, and at scientific conferences. Is it all hype and fear mongering, or are there genuine scientific advances underlying the current excitement? In this talk, I will try to provide some perspective, informed and undoubtedly biased by my 38 years of research in AI. I seek to contribute to the conversation in two ways: 1) by seeing current developments as part of the longest trend in AI—towards cheaper computation and thus a greater role for search, learning, and all things scalable, and 2) by sketching one possible path to AI, based on prediction and reinforcement learning.


The scalability of AI will be extremely important over the next few years. As Moore’s Law states, our computational resources double roughly every two years. Therefore a good algorithm must be able to scale with the development of hardware. Although it may not seem cost-effective for researchers to devote their time to scalability today, doing so will pay off exponentially in the near future. The future of AI belongs to scalable search and learning.

Summary of key points


  1. The most well-known applications of AI: AlphaGo, self-driving cars, poker, speech recognition, and computer vision. Why is it all happening now? Is it due to progress in AI algorithms, or is it simply following Moore’s Law?
    1. Moore’s Law plays an important role: it states that the number of transistors in an integrated circuit doubles approximately every two years. This long-term exponential improvement in computer hardware is at least half the reason behind the progress in AI; hardware plays a huge role in the development of the algorithms.
  2. Humanity has deep-rooted concerns about AI. To address them or not, that is the question
    1. One side considers AI unsafe and a threat to humanity, fearing that one day AI will be smarter than humans
    2. On the other side, AI researchers are sometimes too dismissive of these concerns
  3. According to Richard:
    1. Human-level AI will be a profound scientific achievement, arriving by 2030 (25% probability), by 2040 (40% probability), or never (10% probability)
    2. AI will bring forth a tide of change, and we should be prepared
    3. Fear of AI is overblown and unhelpful; people who fear AI often cannot say exactly what they are afraid of
    4. AI that is smarter than us will likely escape our control; AIs can be regarded as our successors rather than our slaves, and bad successors are the fault of their parents
    5. Progress in AI is slow considering the rate of Moore’s Law
    6. The greatest risk comes from people who misuse AI

Past: In the long run, scalable methods always win

  1. Three waves of neural-network hype:
    1. 1950s-60s: Perceptron and Adaline, with only one learnable layer
    2. 1980s-90s: Connectionism; multi-layer learning via backpropagation (SGD)
    3. 2010s: Deep learning. NNs won because their performance scaled with Moore’s Law, whereas competing methods did not. The best algorithms are essentially the same as in the ’80s, just with faster computers and larger datasets
  2. The best solution comes from the best algorithm and the most powerful computer
    1. To win at chess: the key is big, efficient heuristic search
    2. To win at Go: the key is big, sample-based search
    3. To understand natural language: the keys are statistical machine-learning methods and big datasets
    4. To visually recognize objects: the keys are big datasets, more parameters, and longer training times
  3. Search and learning are scalable methods
    1. A method is scalable with Moore’s Law to the extent that its performance is proportional to the quantity of computational resources it is given
    2. A method is not scalable if its performance does not improve with additional computational resources
    3. Scalability is key, but it tends to be correlated with other distinctions, such as:
      1. Symbolic vs. statistical, hand-crafted vs. learned, domain-specific vs. general-purpose
      2. The former rely more on human understanding, but looking at the history of AI, the statistical, learned, and general-purpose methods have steadily increased in importance
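To make the scalability definition above concrete, here is a minimal sketch (my own illustration, not from the talk): a Monte Carlo estimate of π, a method whose accuracy improves directly with the amount of computation spent on it.

```python
import random

def estimate_pi(num_samples, seed=0):
    """Monte Carlo estimate of pi: sample points in the unit square and
    count the fraction that falls inside the quarter circle.  Accuracy
    improves with num_samples, so performance grows with the computation
    spent -- "scalable" in the talk's sense."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(num_samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / num_samples

# More computation buys a better answer, with no change to the method.
for n in (100, 10_000, 1_000_000):
    print(f"{n:>9} samples -> {estimate_pi(n):.4f}")
```

A non-scalable method, by contrast, would plateau: giving it a thousand times more computation would leave its answer no better.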


  1. How scalable is supervised learning? Not so much
    1. The learning process itself has been largely scalable with neural networks
    2. Scalability is limited because it needs people to provide labeled training data
  2. How scalable is reinforcement learning? Not so much
    1. Classic model-free RL can learn a policy by trial and error, with no labels needed
    2. But even as computational power becomes cheap, there is little in it that scales
  3. There is so much more to learn than just a value function and a policy, or what the teacher says is the right thing to do
  4. The grand challenge: empirical knowledge of the world (knowledge representation and inference)
    1. Definition of knowledge: Knowledge is about the world’s state and dynamics
      1. State is a summary of the agent’s past that is used to predict its future
      2. To have state knowledge is to have a good summary, one that enables the predictions to be accurate
      3. The predictions themselves are the dynamics knowledge
      4. The most important things to predict are states and rewards, which of course depend on what the agent does
      5. The knowledge, for example, could be knowing how each piece moves in chess, knowing what causes what, predicting what will happen next
    2. The knowledge must be expressive (able to represent all important things), learnable (supervised or unsupervised), and suitable for inference and reasoning
    3. The sensorimotor view
      1. Everything you know about the world is a fact about your data stream
      2. Knowledge is in the data
  5. An old, ambitious goal is to use sensorimotor data to understand the world
    1. Being able to make predictions at every level of abstraction
    2. This goal is well suited to scaling: it uses large amounts of data to learn predictions and to search for the best abstractions
  6. The most important advance in ML over the next 12 months will be:
    1. The ability to learn at scale from ordinary experience:
      1. From interaction with the world
      2. Without the need for a training set of labeled data
      3. In a more naturalistic way, like how a child or animal learns
      4. About how the world works, about cause and effect
    2. Enabling ML to scale to the next level
    3. Using deep reinforcement learning for long-term prediction (probably) and/or unsupervised learning
  7. New tools
    1. General value functions provide a uniform language for efficiently learnable predictive knowledge
    2. Options and option models (temporal abstraction)
    3. Predictive state representations
    4. New off-policy learning algorithms (gradient-TD, emphatic-TD)
    5. Temporal-difference networks
    6. Deep learning, representation search
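The predictive, sensorimotor view of knowledge above (learning to predict the data stream itself, with no labels) is what temporal-difference (TD) methods implement. Below is a minimal tabular TD(0) sketch on the classic five-state random walk; the environment, constants, and function name are my own illustrative choices, not code from the talk.

```python
import random

def td0_random_walk(num_episodes=5000, alpha=0.1, seed=0):
    """Tabular TD(0) prediction on a five-state random walk: start in the
    middle, step left or right with equal probability, reward 1 for
    exiting on the right and 0 on the left.  The value of each state is
    learned purely from interaction -- no labeled training set."""
    rng = random.Random(seed)
    values = [0.5] * 5            # initial value estimates for states 0..4
    for _ in range(num_episodes):
        state = 2                 # start in the middle state
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:    # exited left: terminal, reward 0
                values[state] += alpha * (0.0 - values[state])
                break
            if next_state > 4:    # exited right: terminal, reward 1
                values[state] += alpha * (1.0 - values[state])
                break
            # TD(0) update: bootstrap from the next state's own estimate
            values[state] += alpha * (values[next_state] - values[state])
            state = next_state
    return values

print(td0_random_walk())
```

Each state's learned value converges toward the probability of exiting on the right (1/6, 2/6, ..., 5/6): predictive knowledge extracted from ordinary experience alone.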

Conclusion (final thoughts):

  1. Moore’s Law strongly impacts AI’s development
  2. The future of AI belongs to scalable methods, search, and learning
  3. Learning knowledge from ordinary experience is a big prize
  4. Our plan should be ambitious, scalable, and patient
  5. Scalability is unpopular among scholars, because ambitious researchers want to build a great machine that is enhanced by their own research; personal knowledge and experience, they feel, will lead to the machine’s success. However, if an algorithm can scale with hardware development, then even if it cannot satisfy today’s needs, it will be efficient and effective in the future. This is a trade-off between short-term and long-term methods, and researchers should find a balance between the two.


Related Readings:

  1. Teaching machines to play chess by RL
  2. Real-time prediction with artificial limbs (PDDCCHS-13.pdf)
  3. Rich Sutton’s NIPS 2015 RL tutorial

Featured image credit: Alberta Machine Intelligence Institute (amii), a research lab at the University of Alberta


Author: Yuting Gui | Editor: Arac Wu | Localized by Synced Global Team: Xiang Chen
