Synced is proud to present Gary Marcus as the last installment in our Lunar New Year Project — a series of interviews with AI experts reflecting on AI development in 2018 and looking ahead to 2019. (Read the previous articles on Clarifai CEO Matt Zeiler and Google Brain Researcher Quoc Le.)
Marcus is a respected New York University Professor and founder of Uber-acquired Geometric Intelligence. Last year, Marcus aired his concerns and criticisms of deep learning, igniting a social media firestorm involving a number of high-profile AI researchers. In this interview Marcus speaks on the Twitter war with Facebook Chief AI Scientist Yann LeCun, deep learning’s shortfalls, and general artificial intelligence.
Can you summarize your research/work in 2018?
I worked on several things in 2018, but most relevant here is a series of papers I wrote outlining the limits of deep learning and the general status of the field of artificial intelligence. These were much more skeptical than most of what’s been written. The two main articles were Deep Learning: A Critical Appraisal, and Innateness, AlphaZero, and Artificial Intelligence. The first one was extremely widely read; I think everybody in the field seemed to have an opinion about it.
I tried to lay out the limitations of deep learning and pointed to ten problems, primarily in terms of how much data deep learning uses and how poorly it generalizes. The second article was about innateness and how much should be built into an AI system. I think the pendulum has moved from the early days of AI, where people built in almost everything, to systems that try to learn almost everything and have almost nothing built in, and I don’t think that’s the right place to be either. I think that there’s a feeling in the field that it’s better to learn everything; it’s almost like a prejudice or a bias toward learning everything from scratch. This is in part sociology, where people who do learning are dominating the field. They’ve got some nice results recently, but people have to understand what those results are and what the limitations are.
Fundamentally, I think it’s important to realize that intelligence is many things, it’s not just one thing. We have built systems that can do some aspects of intelligence; but in others humans are far better than machines. And we need to understand why humans are far better than machines.
If you were to describe AI development over the past several years in a single word, which word would you choose?
The one word I would use is “overhyped.”
That connects to my next question: Over the past year, you have criticized deep learning in terms of its limitations, and debated researchers on this, especially Yann LeCun. Why are you so obsessed with criticizing deep learning?
I mean I have other things I do with my life too! (laughs) Another thing that I did last year was write a book on the state of AI and why we need to change directions. The Twitter wars I had with Yann LeCun were just a small part of what I did, but it’s important to me, so let me give you some background.
My background is as a cognitive scientist. As a small child, I programmed computers and got interested in artificial intelligence. But then I did my graduate work on humans, in part because I felt like artificial intelligence wasn’t really making much progress. And when I look at current AI from the perspective of human cognitive development, that is, how children learn, I’m very dissatisfied by the state of AI. There’s no AI that’s remotely as clever as my four-year-old or my six-year-old.
There are things that those AIs can do, but they’re very narrow. A machine can play StarCraft better than the best players in the world, and yet it can’t understand what to do with a train set, which my kids have been able to do since age two. Every day I compare the state of AI, which is my day job, with the state of my children, which is my night job, and the contrast overwhelms me. I look at what’s going on and I see how hyped it is, how everybody is so excited about deep learning. As a cognitive scientist I look at the mechanisms that people are using, and I think there’s this huge confusion about what they’re good for.
There used to be a term, I guess it’s probably from the Ancient Greeks, called the “fallacy of composition.” The idea is that you find something that’s true in one case, and you automatically assume that it’s true in all cases. I think this is what’s happening in deep learning. I’ve never been very tolerant of errors of logic in analysis; I have a tendency to find arguments that have been oversold and find weaknesses in them. I was trained partly as a philosopher as an undergraduate, and that’s what philosophers do: they look for arguments that aren’t really right. And one of the ways that arguments aren’t really right is when people overgeneralize them. With deep learning we have a clear case of overgeneralization, a case where people have found something that works for a certain set of problems and assume that it will work for all problems. And that’s nonsense.
In fact, we can look systematically, and see the kind of problems that deep learning is good for or not. It’s very good at image classification, it’s very good at labeling what part of speech a word is. It’s not very good at understanding a scene or the meaning of a sentence. It’s not really good enough to drive a domestic robot around your home. There are lots of things it doesn’t do very well.
Some of the debates that you saw on Twitter were sparked by me noticing that the field is actually moving a little bit, and people are not liking that. I have been saying for several years that deep learning is shallow, that it doesn’t really capture how things work or what they do in the world. It’s just a certain kind of statistical analysis. And I was really struck when Yoshua Bengio, one of the fathers of deep learning, kind of reached the same conclusion. I used to say this and people looked at me funny and got mad at me. And then suddenly here was one of the leaders of the field noticing the same thing. So I said look, Bengio’s actually saying what I was saying in 2012. That’s really what launched the last big set of Twitter discussions in November and December.
I think this is very revealing of a couple things. One is actually the sociology of how the field works. So right now, LeCun is trying to shut me up basically, he doesn’t want me to speak. He’s attacking my credentials rather than the arguments and that’s never a good sign. When somebody does an ad hominem argument and attacks the person rather than the underlying ideas, it’s because they don’t have a good argument, and that’s what I feel has happened with LeCun. He’s also misrepresented me a lot, enough so that I wrote an article in Medium where I pointed out places in which he’s been misrepresenting and strawmanning my position.
LeCun has been representing me as saying that ‘deep learning is terrible’ but that’s not actually what I’m saying. What I’m saying is it’s not sufficient in itself. It’s good for the set of problems that I outlined, like image classification, but it’s not good for the others.
Ultimately the metaphor I have in mind is a toolkit. You want a screwdriver, you want a hammer, you want a power screwdriver, a plane… you want a lot of different tools. I think it’s fine to use deep learning as one of those tools. I think it would be misleading to say you can build a whole house with that one tool.
What is your relationship with Yann LeCun now?
In English we have a word, ‘frenemy.’ We’re friends and we’re enemies, although when he misrepresents me, the emphasis starts to move to the second part of that.
If you watch the debate we had in 2017, I think we got along quite well. We had a wonderful dinner afterwards with a bunch of people, talked all night and it was great. I feel like last year he was much more ad hominem, and the arguments became less productive.
Did the debate with the deep learning community change your opinions?
I’m not sure my fundamental view about deep learning has changed; you can read the essay I wrote in The New Yorker in 2012. I think I would still stand by what I said there, which is basically that deep learning is a very good tool for some things, but it’s not good for abstraction, language, causal reasoning and so forth, and I stand by all of that. So in that sense, I don’t think my position has changed.
I do think people have been very clever about how they’ve used deep learning. It’s almost like if all you have is a screwdriver, you can try to adapt everything to be a screwdriver sort of problem. They’ve been good at, for example, using deep learning to make old video games have higher resolution. There have been a lot of clever applications in deep learning and it’s certainly had a lot of impact on the world, but I don’t think that it’s really solved the fundamental problems of artificial intelligence.
You have pointed out problems with deep learning and one of the suggestions you’ve given is leveraging the advantage of symbolic systems and combining that with deep learning. Could you give us any examples?
I don’t think that what I want exists in AI yet. If you remember Daniel Kahneman’s work on system 1 and system 2 cognition, it’s a little bit like that. We have different systems for solving different problems. His version is we have a reflexive system that works automatically, and a deliberative system that works through reasoning. You can argue some of the details, but I think the general intuition behind that is correct.
Another thing we talk about that’s related in psychology is bottom-up perception versus top-down perception. Any psychology textbook will tell you — and there’s lots and lots of experiments that point in this direction — that if we do bottom-up perception, we can recognize the pixels and so forth.
But for example, I’m looking right now at you in video conference and there are these little rectangles in your eyeglasses that are a reflection of the computer screen. If I look closely enough at them, I see myself. But I don’t for a minute think that I am living in your eyeglasses. The pixels, in a bottom-up way, appear to be consistent with two images of Gary Marcus inside your glasses. But my top-down perception says no, that’s not possible. First of all, there can’t be two Gary Marcuses, unless he has an identical twin. And then they’re too small sitting there in your eyeglasses. They’re also a little bit too blurry. So a better interpretation of what’s going on is that they’re reflections of a monitor that I can’t actually see. I see you, not the monitor. So I put together an explanation of what’s going on that is much more like classical AI, in that it is about entities like eyeglasses, reflections, mirrors, and so forth, than like a simple classification of pixels. I can’t learn this from data. I don’t have any pre-labeled pictures of your glasses with my reflection or whatever.
So what neuroscience or psychology or cognitive neuroscience — whatever you want to call it — tells me is that there are different pathways to vision. I’m certainly using all the information I get from the pixels, but I’m also using information I know about the world — like how eyeglasses work, how reflections work, the size of people and so forth — in order to put this all together. Then I’m looking at you, and I try to make an analysis — ‘he’s nodding his head, I think he understands what I’m saying; or he looks lost and he doesn’t so I’ll change my conversation.’ But the tools that we have don’t do that. So the way that I would describe it to you as a human being involves concepts like glasses and reflections and so forth. So I think there’s potential to bring these two things together in AI just as they have been brought together in human evolution.
If you ask me if anybody has a really good code base doing this out there for their commercial product, probably not. I think we still need to make some discoveries. I once wrote a line about how neuroscience had not yet found its Isaac Newton. And you could extend that to AI, I’m not sure that AI has found its Isaac Newton yet. There are some fundamental ideas that we have hints about, but we don’t really understand. So the code that people actually write at some level is mostly hacks. Almost everything that anybody has ever written in AI is brittle and narrow — it works for the circumstances for which it was built, and it can’t adapt. Whereas people can adapt to all kinds of things that are a little bit different from what they’ve seen. Like if I tell you ‘somebody fell off a very big ladder,’ you can start thinking about that even if you’ve never run into those circumstances.
Is there any recent research work that in your opinion promises huge potential?
Not exactly, but I think there’s some things that are pushing in a good direction. Things like graph networks, things where people are at least coming to terms with the reality that knowledge’s structure is not just one big vector that’s a long list of numbers. I don’t think anybody has solved those problems, but at least they’re trying to take them seriously now, and that’s making them think about a broader class of models, and I think that’s what we need.
What will be your research focus in 2019?
I’m getting pretty interested in robotics. I’m not going to say too much more just yet. I think robotics is a good field in which to test ideas about for example common-sense reasoning — how do you reason about the way the world works with general intelligence? So if you were to build a domestic robot — which people have been talking about for decades — that could wander the home from the kitchen to the living room and pick things up or help people in various ways, you really have to understand the world at a deep level, not just a shallow superficial level.
Another issue with deep learning is it often works maybe 80 percent of the time, and then produces bizarre errors 20 percent of the time. If you’re using a recommendation engine or photo tagging, the cost of error is very low. If I label a bunch of photos and get one wrong, it’s probably okay.
But the home is a context where errors really do matter. You don’t want your robot to bump into a candle that falls on a table and your house catches fire. You really have to have AI that works solidly and in a trustworthy way. The book I’m writing with Ernest Davis is about how to make AI that is trustworthy and reliable. That revolves around getting machines to actually have enough common sense that they can think through the likely consequences of their actions, and robotics is a good domain for that.
What are your expectations for AI development in 2019?
I’m not expecting any huge advances, but it’s certainly possible. If there were big advances, we might not hear about them immediately because it takes a while to take a good idea and put it into practice. So what I expect in 2019 is a broader range of things that you can ask a system like Amazon’s Alexa or Apple’s Siri and so forth. I don’t expect real conversations with machines like that this year. There’s a lot of attempts to have at least some kind of robot in the home. They’re relatively primitive right now, but we’ll see more and more about that.
I don’t know if we’ll see a foundational change in AI, but it’s possible. There’s enough people in the last year who have recognized there’s a limit to the deep learning paradigm that maybe someone will really address that and come up with something new.
Could you tell us your opinion on the AI technology race between China and the US?
The race is on! Right now I think China is doing more things right than the US. I think the historical advantage is certainly to the US, which has had a better graduate educational system for fostering creativity and innovation and so forth. But right now, the US has a president who is very narrowly focused in his own way. He’s not fostering science and technology, and that’s not good. For example, we’ve been turning away lots of well-qualified immigrants, and well-qualified immigrants are one of the reasons we’ve done so well. So I think as long as our current president is in office, we’re not doing that well. On the other hand, I think the current leadership in China is very interested in AI and is obviously putting a lot of money into it.
So the US is ahead right now, but whether it stays that way is not clear. Of course our current president can’t be there more than six more years, and then we’ll have a new president. I would say that China is investing enough resources and has enough people who were well trained in the States and have come back, things like that. China is certainly catching up fast, and could overtake the US.
During the past year, concerns emerging around artificial intelligence included data abuse, talent scarcity, model bias, interpretability, policy implementation, military use, job replacement, etc. Which of these concerns you most?
My biggest concern right now, and this is a lot of what my new book is gonna be about, is that AI isn’t really reliable yet. If you use something that’s not reliable in a mission where you really need reliability, you get into trouble. So that can be anything from having AI sort people’s job applications to controlling weapons. If the underlying AI isn’t reliable but we count on it, we have a problem. My fundamental view is that right now AI is not reliable. It’s not something that you can use in an open-ended world. The best techniques we have are narrow techniques that work for very specific problems like Go, where rules never change and you can collect as much simulated data as you want. But when you put these things into the open-ended world and, for example, have them drive cars, well, they often work but you can’t really count on them.
In some cases, people will use them anyway and there’s going to be accidents and fatalities; and in some cases we may delay the AI. There’s a secondary concern, which is people might actually give up on AI if a bunch of problems like building chatbots and driverless cars turn out to be much harder than the hype would suggest. A lot of people thought we’d have driverless cars by 2020, and we have them as prototypes, but we can’t count on them yet. And it’s not clear how long it’s going to take to get to the point where they really can be counted on.
So if we get to 2025 and they’re still all demos that require human beings in them, people might get tired of AI, might pull the funding and who knows what might happen? Meanwhile, if somebody says they’re going to do trials of their driverless cars in my neighborhood, I’m worried because I don’t think they’re trustworthy enough yet.
They work most of the time, but you don’t know when they’re gonna do something totally bizarre.
Journalist: Tony Peng | Editor: Michael Sarazen