Synced spoke with AI pioneer Professor Yoshua Bengio at the Computing in the 21st Century Conference in Beijing, where he discussed his recent research and the current state of AI.
Q: What open questions are there in your research area?
A: One of the questions we had five years ago was how to build better generative models. Since VAEs (variational autoencoders) and GANs (generative adversarial networks) were proposed, we have made huge progress in the ability to generate images.
We are still working on problems like disentangling the underlying factors of variation, which is a question I asked ten years ago. We’ve made some progress but not that much, so it is still an open question. At that time, I was hoping the techniques we had would magically do the right thing. Now I think we need some priors to push things in the right direction.
Q: What kind of priors are you seeking in learning mechanisms?
A: We would like to find learning mechanisms that exploit as little prior knowledge as possible, because fewer priors mean more general mechanisms. At the extreme where the prior contains all the knowledge, there is no learning. The smaller the set of knowledge built into the model, the more likely it is to apply to a wide range of different learning problems. We want to build AI that is as adaptive as possible, so we should try to minimize the number of priors we put into the model. Also, the number of bits required to formulate the priors should be small.
Q: Many of your recent and ongoing research projects are connected to your 2017 paper Consciousness Prior. What was your purpose for introducing this?
A: Deep learning has made a lot of progress in perception and one of its main ambitions is to design algorithms that learn better representations. Good representations should probably be connected to the high-level concepts that we use in language, but this is not something we know how to do with unsupervised learning yet. That kind of high-level knowledge about the world isn’t something current deep learning takes advantage of.
One way to think about knowledge is that you are able to make a statement about the world using very few variables. Knowledge can be represented as many nuggets, each of which involves very few variables. In contrast, in current machine learning, we learn joint distributions over many variables (e.g., all of the pixels of an image) that are very high-dimensional. The interesting thing is that this sparsity is only possible in the right representation space; it does not work in pixel space.
So the consciousness prior is the idea that we can use this as a regularizer on representations, one that pushes them to have the property that we can extract just a few dimensions at a time and make powerful predictions about the future. It’s a way to add an extra constraint to learning representations so that they will be good at expressing classical AI symbolic knowledge.
Also, to really understand a sentence in natural language, you need to have a system that understands what the words in the sentence refer to in the real world, with images, sounds and perception. One more idea in the consciousness prior is to ground those high-level concepts and rules in low-level perceptions. The research directions lead to grounded language learning and multimodal language models, where the system learns not just from text but also from its associations with images, videos and sensory perceptions.
Q: What is the main problem in current NLP systems?
A: State-of-the-art NLP systems, e.g., the best translation systems using deep learning and attention, still make many stupid mistakes that no human would make. The reason is that machines don’t really understand what those sentences mean; they do not comprehend anything. And the problem is that we are training current NLP deep learning systems using only text, but it is not easy to find unconscious knowledge about the world just by reading text.
Grounded language learning is now being studied, with people trying to build agents that can both process text and have a good world model of how the environment works. That is difficult, but it is what AI is about, because real AI should understand the world the way humans do.
Q: In your research area, what else do you think limits development of human-level AI?
A: Causality. Making progress in representing causal factors at the right level of representation is deeply connected to what we want deep networks to do. I don’t think we understand how to do causal inference in a way that is sufficiently general. A lot of techniques have been proposed, but we need more general-purpose ways to match them well with deep learning.
Q: What is your view on popular misconceptions around AI?
A: There are many misunderstandings about AI and deep learning. One of them is that we’ve already solved AI. I think that we’ve made a lot of progress, but there’s even more to be done, and current systems are still very stupid.
Another issue is that people have a lot of fear about AI. They are afraid of the impact on jobs, the potential for killer robots, the security and privacy issues and so on. There are good reasons and bad reasons to be afraid. A good reason is that such a powerful technology could be misused by companies or by governments, so we need to make sure there are collective rules of the game so that AI is used to benefit all. On the other hand, there are fears of AI taking over humanity. I think that could happen, but it is not very likely. We should not be too concerned right now.
Author: Tingting Cao | Editor: Michael Sarazen | Producer: Chain Zhang