Facebook AI Chief Yann LeCun introduced his now-famous “cake analogy” at NIPS 2016: “If intelligence is a cake, the bulk of the cake is unsupervised learning, the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning (RL).” The quip rippled across the AI community and confirmed LeCun as a strong advocate of unsupervised learning, a machine learning technique that finds patterns in unlabeled data.
LeCun updated his cake recipe last week at the 2019 International Solid-State Circuits Conference (ISSCC) in San Francisco, replacing “unsupervised learning” with “self-supervised learning,” a variant of unsupervised learning where the data provides the supervision.
A look at unsupervised learning
We humans do not form our understanding of how the world works by sifting through massive amounts of labeled data. Instead, we use capabilities such as prediction and reasoning to infer the future from available information. Even when given incomplete input, such as text with missing segments or partly occluded images, we can still fill in the gaps using common sense, a capability machines lack.
LeCun’s cake analogy underscored the importance of unsupervised or “predictive learning,” which he believes can break through the current limitations of AI development. Today’s AI technologies can easily classify images and recognize voices, but cannot perform tasks such as reasoning about the relationships between different objects or predicting human movements. That is where unsupervised learning can fill in the blanks. As LeCun says: “Prediction is the essence of Intelligence.”
The French scientist defines unsupervised/predictive learning as “predicting any part of the past, present, or future percepts from whatever information is available.” He explains that “the number of samples required to train a large learning machine for any task depends on the amount of information we ask it to predict. The more you ask of the machine, the larger it can be.”
Machine learning techniques such as supervised learning (which only predicts human-provided labels) and reinforcement learning (which only predicts a value function) are too narrow to create human-level intelligent machines. Unsupervised learning, however, with its millions of bits of information per sample, can be used to train highly complex machines without human supervision.
What is self-supervised learning?
Last year, LeCun began rephrasing his point of view and speaking more highly of “self-supervised learning” as a useful new ingredient for building AI’s future.
Like supervised learning, self-supervised learning learns a function from pairs of inputs and outputs. But instead of having annotators manually label the data, self-supervised learning automatically generates labels by extracting weak annotation information from the input data and predicting the rest. In this way the model can independently learn semantic feature representations of data, which can be further used in other tasks.
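To make the idea of automatically generated labels concrete, here is a minimal Python sketch that turns a single unlabeled sentence into input/target training pairs by hiding one word at a time. The function name `make_masked_examples` and the `[MASK]` placeholder are illustrative assumptions, not part of any specific system mentioned in this article.

```python
def make_masked_examples(tokens, mask_token="[MASK]"):
    """Turn one unlabeled token sequence into (input, target) pairs.

    Each pair hides a single token; the hidden token becomes the label,
    so no human annotation is needed. (Hypothetical helper for illustration.)
    """
    examples = []
    for i, target in enumerate(tokens):
        # Replace position i with the mask and keep the rest as context.
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


sentence = "the cat sat on the mat".split()
examples = make_masked_examples(sentence)

for masked, target in examples[:2]:
    print(" ".join(masked), "->", target)
```

Every word in the sentence serves once as a label, so six tokens yield six training examples. A model trained on such pairs has to learn something about word meaning and context in order to predict the hidden word.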
Classic examples of self-supervised learning include Word2vec, a technique for learning vector representations of words, or “word embeddings,” which Google Brain introduced in 2013. Word2vec has since influenced many cutting-edge language models, including 2018’s Google BERT and OpenAI GPT. Another typical self-supervised learning model is the autoencoder, whose training target is the input itself.
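Both examples build their training targets from the raw data alone, which a short Python sketch can show. The first function produces (center word, context word) pairs in the style of skip-gram training, one of the schemes used by Word2vec; the second pairs each sample with itself, as an autoencoder does. The function names and the window size are assumptions chosen for illustration, and no model is actually trained here.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs from raw text, skip-gram style.

    The "label" for each center word is simply a nearby word, so the
    text supervises itself. (Illustrative helper, not Word2vec's code.)
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def autoencoder_pairs(samples):
    """Pair every sample with itself: the input is also the target."""
    return [(x, x) for x in samples]


pairs = skipgram_pairs("the quick brown fox".split(), window=1)
print(pairs)

recon = autoencoder_pairs([[0.1, 0.9], [0.4, 0.6]])
print(recon)
```

In both cases the interesting part is what the model must learn internally to make the prediction: word embeddings in the first case, a compressed representation in the second.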
Other domains that apply self-supervised learning include photo restoration and image super-resolution.
LeCun told the ISSCC audience that although self-supervised learning does not currently work very well with high-dimensional continuous signals such as video, he still sees the tech as game-changing: “The next AI revolution will not be supervised or purely reinforced. The future is self-supervised learning with massive amounts of data and very large networks.”
RL community fires back: Intelligence is a cake with many cherries!
Many in the reinforcement learning community, who see their method as the path forward, find LeCun’s cake analogies difficult to swallow. DeepMind researchers responded to the first slide with their own image of a cake topped with countless cherries, arguing that reinforcement learning, with its many reward signals, can deliver significant value.
At NIPS 2017, UC Berkeley Professor Pieter Abbeel escalated the dessert warfare, including the DeepMind cherry cake image in his presentation and joking, “I prefer to eat a cake with a lot of cherries because I like reinforcement learning.” Since LeCun’s criticism of pure reinforcement learning methods mainly focuses on sparse reward signals, Abbeel illustrated his point with Hindsight Experience Replay, a sample-efficient learning technique that tries to get a reward signal from any experience by assuming, in hindsight, that the goal was whatever actually happened. Although classic reinforcement learning treats a failure as zero reward, Abbeel proposed that even failures can be used to train machines.
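The relabeling trick behind Hindsight Experience Replay can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published algorithm in full: the `Transition` type, the grid-position states, the 0/1 reward and the choice to relabel with the episode's final state are all choices made for this example.

```python
from typing import List, NamedTuple, Tuple

Position = Tuple[int, int]


class Transition(NamedTuple):
    state: Position
    action: str
    next_state: Position
    goal: Position
    reward: float


def sparse_reward(achieved: Position, goal: Position) -> float:
    # Assumed reward scheme: 1 only when the goal is reached, else 0.
    return 1.0 if achieved == goal else 0.0


def hindsight_relabel(episode: List[Transition]) -> List[Transition]:
    """Pretend the state the agent actually ended in was the goal."""
    achieved_goal = episode[-1].next_state
    return [
        t._replace(goal=achieved_goal,
                   reward=sparse_reward(t.next_state, achieved_goal))
        for t in episode
    ]


# A failed episode: the agent wanted (3, 3) but only reached (1, 1).
goal = (3, 3)
episode = [
    Transition((0, 0), "right", (1, 0), goal, sparse_reward((1, 0), goal)),
    Transition((1, 0), "up", (1, 1), goal, sparse_reward((1, 1), goal)),
]
relabeled = hindsight_relabel(episode)

print([t.reward for t in episode])
print([t.reward for t in relabeled])
```

The original episode contains no reward at all, while the relabeled copy ends with a success for the goal (1, 1). Storing both versions in a replay buffer gives the learner a non-zero signal even from an attempt that failed at its intended goal.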
One of the most popular memes in the AI community, LeCun’s playful cake metaphor has come to symbolise the very serious debate on which current machine learning research path is most likely to lead to artificial general intelligence.
Journalist: Tony Peng | Editor: Michael Sarazen