The Synced Lunar New Year Project is a series of interviews with AI experts reflecting on AI development in 2018 and looking ahead to 2019. In this second installment (the previous article featured Clarifai CEO Matt Zeiler), Synced speaks with Google Brain Researcher Quoc Le on his latest invention, AutoML, Google Brain’s pursuit of AI, and the secret of transforming lab technologies into real practices. Le is a Google Brain founding member and the craftsman behind many Google products including TensorFlow and Google Neural Translate.
What progress has AutoML made in the past year?
First of all, the biggest success as far as I can tell is around computer vision. Recently I gave a talk and reviewed the state of the art in computer vision, and it turns out the best three or four models on the ImageNet dataset are actually generated by AutoML. This is probably one of the fiercest competitions in computer vision at the moment, and we’ve already made the auto-generated models better than human-designed models.
What special significance do you see in AutoML?
I think there’s a phenomenon in many areas of machine learning, for example in NLP or in computer vision, where the state-of-the-art models are hand-tuned and designed by human experts — and that takes a lot of time. AutoML basically has the ability to design models automatically and better than human-designed models. You have to spend a lot of compute but it’s less human effort going into the loop.
The second part is that this year we also did a lot of research on using machine learning to automate the process of data processing and data augmentation. For example, one way to get your model to do better is to enrich your data. If you have an image of a cat and you believe that if you rotate that image a little bit it’s still a cat, or you zoom in it’s still a cat, then you enrich your data and that process is called data augmentation.
People did this in an ad hoc manner for many, many years. In our project we say let’s automate this process as well. We take a library of functions that take an image and process it. For example, brighten the image or darken the image or equalize the image. Then we say okay, let’s take a bunch of these functions and compose a little program that will process that to enrich the dataset.
When we have a lot of compute we can actually automate this process and it becomes very successful. We have this paper called “AutoAugment.” For every single dataset we apply it to, it makes a huge improvement, and I think that’s really promising.
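The idea of taking a library of image functions and composing them into a little program that enriches a dataset can be sketched in a few lines of Python. This is only an illustrative sketch, not the method from the paper: the image representation (a grayscale list of lists), the function names, and the `augment` helper are all assumptions made for the example, and the program is picked at random rather than learned.

```python
import random
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image, pixel values in 0..255
Op = Callable[[Image], Image]    # one function from the "library"


def brighten(img: Image, amount: int = 40) -> Image:
    """Add a constant to every pixel, clipping at 255."""
    return [[min(255, p + amount) for p in row] for row in img]


def darken(img: Image, amount: int = 40) -> Image:
    """Subtract a constant from every pixel, clipping at 0."""
    return [[max(0, p - amount) for p in row] for row in img]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def equalize(img: Image) -> Image:
    """Histogram equalization: spread pixel values over the 0..255 range."""
    pixels = [p for row in img for p in row]
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:             # constant image: nothing to spread out
        return [row[:] for row in img]
    return [[round((cdf[p] - cdf_min) / (n - cdf_min) * 255) for p in row]
            for row in img]


def compose(ops: List[Op]) -> Op:
    """Chain several library functions into one small program."""
    def program(img: Image) -> Image:
        for op in ops:
            img = op(img)
        return img
    return program


def augment(dataset: List[Tuple[Image, str]],
            programs: List[Op],
            rng: random.Random) -> List[Tuple[Image, str]]:
    """Keep the originals and add one transformed copy per example.

    The label is kept unchanged: a brightened cat is still a cat.
    """
    enriched = list(dataset)
    for img, label in dataset:
        program = rng.choice(programs)
        enriched.append((program(img), label))
    return enriched
```

An automated approach would search over which functions to compose and how strongly to apply them, instead of choosing a program at random as this sketch does.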
Can you summarize your research in 2018?
We can divide my work in the past year into different verticals. I work on computer vision, and I also work on structured data, and also NLP. So these are the verticals, and then it’s also divided into techniques, meaning I work on model generation and on data generation.
Could you summarize AI development in 2018 in a single word?
In a single word? Well, overall I’m very excited about the progress in AutoML. I think this is a real phenomenon in the sense that with a lot of compute, you don’t train a single model. You train a lot of models and you try a lot of methods to process your data, and your result is significantly better. It’s taken off both in research and in real production. So the word I would choose would be “AutoML.”
How would you say Google Brain’s pursuit of artificial intelligence differs compared to other companies?
Maybe I can take a step back and tell you why I work on this kind of project in the first place.
In AI, there are four generations. The first generation is the Good Old-fashioned AI, meaning that you handcraft everything and you learn nothing. The second generation is shallow learning — you handcraft the features and learn a classifier.
The third generation, which a lot of people have enjoyed so far, is deep learning. Basically you handcraft the algorithm, but you learn the features and you learn the predictions, end to end. More learning than shallow learning, right?
And the fourth generation, this is something new, what I work on, I call it “learning-to-learn.” Meaning here, we should learn everything. We should learn the architectures, the data processing, the algorithm, the predictions, the features… all at the same time. I think right now we are at the beginning of generation four, of learning everything. We’ve just scratched the surface, these are very early days. At Google Brain, we’ve really invested in learning-to-learn.
Now this is one of many areas that Google Brain is looking into. Our mission is to try to build intelligence and make lives better. So for a lot of technologies that we build, we make sure that we can benefit a lot of people. For that reason we don’t work a lot on game AI, but we actually try to develop technology to solve medical imaging or improve self driving cars for example.
Recently, Waymo wrote a blog post about our collaboration between AutoML and their self-driving car project. We have automated a bunch of models for the self-driving car project.
What will be your research focus in 2019?
As I said, these are the very early days of learning-to-learn as I described it. I believe we’re entering the fourth generation of machine learning: machine learning where you don’t have to do much because it will learn everything. And some of my work on architecture search and AutoML is at a very early stage, so I will continue to expand and go further in this research question of “Can you learn everything end to end?”
What are your expectations for the development of AutoML-related technologies and applications in 2019?
I think most of the commercial cloud platforms will have AutoML in one form or the other. The subset of people who can write machine learning programs is actually very small, and the set of people who want to use machine learning is a lot more. AutoML is a very good opportunity for researchers to actually transfer their technologies to many other companies, so I expect a lot of cloud companies will use AutoML.
In research, I think it will start to gain a lot of momentum. I’ve already seen a lot of really exciting papers coming from academia. Outside of Google they’re doing really interesting work in this area, some of which is also coming from China, which is pretty critical.
What are some technical challenges for AutoML to reach the next level of performance?
I think currently a lot of AutoML methods still require some manual work on the search space. One way for AutoML to work is to say “search for all possible programs,” but that’s a very large search space. So we tend to constrain that a little bit and say “search in a number of very restricted TensorFlow functions” or something like that. I think it would be great if we didn’t have to spend so much time on the search space. That’s the first thing.
The second thing is that a lot of the search methods that we have are still a bit expensive. The resources are only available at the big companies, so you see a lot of research in this area from big companies. But recently there have been proposals for more efficient search methods coming from academia and some other companies that look really promising, so I have hope. I think those are the two key parts that need to improve in the future of AutoML.
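To make the two points concrete, here is a minimal sketch of searching a hand-restricted space. Everything in it is an assumption made for illustration: the `SEARCH_SPACE` entries, the function names, and especially `toy_score`, which stands in for the expensive step of training a candidate model and measuring its validation accuracy. Plain random search is used only because it is the simplest search method to show; it is not presented as the method used at Google.

```python
import random
from typing import Callable, Dict, List, Tuple

Architecture = Dict[str, object]

# Hypothetical, manually constrained search space: one entry per design decision.
SEARCH_SPACE: Dict[str, List[object]] = {
    "num_layers": [2, 4, 8],
    "layer_type": ["conv3x3", "conv5x5", "separable_conv"],
    "width": [32, 64, 128],
}


def search_space_size(space: Dict[str, List[object]]) -> int:
    """Number of distinct candidates; grows multiplicatively with each decision."""
    size = 1
    for options in space.values():
        size *= len(options)
    return size


def sample_architecture(space: Dict[str, List[object]],
                        rng: random.Random) -> Architecture:
    """Pick one option for every design decision."""
    return {name: rng.choice(options) for name, options in space.items()}


def toy_score(arch: Architecture) -> float:
    """Placeholder for 'train the model and report validation accuracy'.

    The numbers are made up; in practice each call here is the costly part.
    """
    bonus = {"conv3x3": 0.0, "conv5x5": 0.5, "separable_conv": 1.0}
    return (arch["num_layers"] * 0.1
            + arch["width"] / 128
            + bonus[arch["layer_type"]])


def random_search(space: Dict[str, List[object]],
                  evaluate: Callable[[Architecture], float],
                  num_trials: int,
                  rng: random.Random) -> Tuple[Architecture, float]:
    """Evaluate num_trials random candidates and keep the best one."""
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(space, rng)
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score
```

Even this tiny space has 27 candidates, and every extra decision multiplies that number. Since each evaluation is a full training run in practice, both the design of the space and the efficiency of the search method matter.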
What other new technical directions do you think need to be explored in the future?
I’m very excited about unsupervised learning. I think unsupervised learning will unlock the potential of massive amounts of unlabeled data. We’ve started to see a lot of progress in the last few years. In particular I want to bring up this paper called “BERT.” I think a lot of people are familiar with BERT and the idea of using pre-trained language models to improve downstream applications. That’s pretty cool.
I also see big potential in designing scalable models, models that are really big so that you can take advantage of a lot of data but at the same time inference cost is low. So the idea is you can train a very big model, but you don’t spend too much compute to evaluate the model. So that’s also another direction that is exciting.
During the past year, concerns emerged around artificial intelligence data leakage, talent scarcity, model bias, interpretability, policy implementation, military use, job replacement, etc. Which of these concerns you most?
I think bias in prediction is probably the thing that I’m most worried about. A lot of machine learning models that we have depend on training data. A lot of the time we don’t seem to know enough about our training data for some reason. The predictions can be biased and it seems to have potential to affect a lot of people, and we don’t seem to have made enough progress in this area. So I think bias would be my choice, although I think a lot of researchers are actively thinking about it, so we will make progress.
Over the past seven years Google Brain has achieved remarkable success that few AI research labs can match. You have contributed to the development of TensorFlow, AutoML, and Google Neural Translate, all on the market now. Is there a secret to Google Brain’s successful implementation of lab technologies into products?
First of all, thank you for your kind words. I would say I think we have a lot to learn from all the labs, they are doing really fantastic work.
But what is our secret? I think one thing that’s special and unique about Google Brain is that in our environment the researchers have a lot of chances to work with really great engineers. In many other research labs, the researchers and engineers tend to either work in different places or on different teams and so on, but in Brain they’re very integrated. And I’ve had the benefit of working with very talented engineers who look at some of the research that we do and then figure out a different and better way than what I wanted initially to transfer it to a good product.
The other thing is that our environment is very bottom-up. So with a lot of research, people at the office can feel very creative about how they apply it and through what product they apply it. So sometimes you end up with very creative ways to find an application for a product, or different directions that you’ve never seen before. I think those are the things that Google Brain did really well.
So, maybe the first element is an integrated model of research, having engineering and research integrated. That explains how TensorFlow got developed, or how Translation got launched so quickly. And the second element is a bottom-up model for research, where engineers and researchers can actually figure out a way to creatively apply their technology in products.
Author: Herin Zhao, Tony Peng | Editor: Michael Sarazen