“I am tempted by Alibaba’s large pool of business scenarios and huge user base,” says Xiaofeng Ren.
The renowned computer vision specialist recently flew back to China from Seattle, making his first conference stop in Beijing since joining Alibaba as Deputy Dean and Chief Scientist of the Institute of Data Science and Technologies (iDST).
After obtaining his PhD from UC Berkeley under the supervision of Dr. Jitendra Malik, Ren was appointed Affiliate Professor at the University of Washington’s Department of Computer Science. As Amazon’s Senior Principal Scientist, Ren led the computer vision team, while also serving as chair at both CVPR and ICCV (top computer vision conferences). His papers have been cited more than 9,000 times.
iDST is Alibaba’s core R&D institute covering machine learning, big data mining, natural language processing, mobile search, and multimedia identification, with offices in Hangzhou, Beijing, Seattle, and Silicon Valley. It is also in charge of the artificial intelligence brain for Alibaba’s NASA plan.
Rong Jin, Dean of iDST, respects Ren’s previous employer for its operational style. He told Li Xiang Business Review, “Amazon and Google are different. Amazon makes influential products without hiring too many technologists, which testifies to the company’s capability to form ideas. Making a product is related to user experience, thus not as simple as developing a technology. For a technology to make a huge impact, it has to operate in a market environment with full-range user experience.”
Amazon has introduced products such as Alexa and Amazon Go to help broaden its scope beyond e-commerce. The company stresses efficiency, and currently has only 40 people working on computer vision, compared to 200 at Google.
Ren tells us that he wants to “realize something [at iDST] that hasn’t been successfully done before.” To this end, he is back in China to familiarize himself with Alibaba’s team operations, and will take it from there.
Synced recently spoke with Ren about the big move.
On Joining Alibaba
Synced: You are building a team in Seattle for Alibaba. What is its function?
Xiaofeng Ren: There is a great deal of talent in the United States, and a team there helps globalize Alibaba’s talent pool. The team hopes to attract first-class American talent. We don’t have the team yet, and many ideas are still preliminary. I will still have to discuss the specifics with Rong Jin regarding coordination, division of labour, etc.
Synced: Are there differences in vision or goals for the Seattle team?
Xiaofeng Ren: Not really. The Seattle team should be able to sync with the Chinese team.
Synced: What are some of the changes in computer vision R&D over the past few years?
Xiaofeng Ren: The technology is entering a commercial development phase. When I began it was still very theoretical. Mostly we just ran the algorithms on a few images, not knowing when they could be put to use. I’ve witnessed a lot of progress from graduation to now.
Synced: What impacted you along the way?
Xiaofeng Ren: Microsoft’s Kinect (a line of motion sensing input devices for Xbox and Windows PC introduced in 2010). I think from that point on, I was able to see more practical application scenarios. I conducted research on depth cameras for this reason.
Synced: What technical research or application scenarios do you hope to focus on?
Xiaofeng Ren: Technically speaking, my interest is in how to make things more efficient, processing image and video data accurately. Innovation in these technologies can improve everyone’s experience. But real-time systems in real-world environments require high technical expertise and are now a frontier research direction.
On Unmanned Retail Stores
Synced: Unmanned retail shops have attracted a lot of attention. What role did computer vision technology play? What are the challenges?
Xiaofeng Ren: Computer vision accounted for many things for sure, but I can’t disclose the specifics. One big challenge is precision: the algorithm needs to solve many problems based on large amounts of data, and obtaining such a wide range of data sources is a problem in itself.
Synced: If computer vision were used for the entire shopping process, would the cost be too high?
Xiaofeng Ren: In the long run, camera hardware and computing costs fall very quickly. But in terms of specific costs, it is really hard to say.
Synced: What are the specific applications of computer vision technology for retail? Are there bottlenecks?
Xiaofeng Ren: I think there are big differences in both basic and applied research. For retail, perceiving merchandise and the activities of customers is very important, which may not be the case for, say, autonomous driving. As for bottlenecks, everyone’s constantly thinking about new areas of application. Of course there are many technical challenges, because when people make unpredictable gestures, we need to adjust the product accordingly.
Synced: What applications of computer vision are best for commercial use?
Xiaofeng Ren: I really hope that computer vision can have a wide range of applications, such as in retail, office or domestic environments. There are many companies using computer vision for offices. Once there are intelligent applications, even a simple thing like phone conferences can generate many business streams.
On Research and Development
Synced: What is your personal understanding of computer vision? We understand that you have applied psychology as a framework to study computer vision.
Xiaofeng Ren: On one hand, I was greatly influenced by my PhD supervisor Professor Jitendra Malik, who has a very broad range of interests. There are people who focus on the field per se, but I always find problem solving to be an interdisciplinary matter. I’m paying attention to all kinds of research. On the other hand, I was also influenced by my experience at Intel Lab, where they conducted research on many things including robotics and HD. I started to look into other research too. For a scientist, a multi-dimensional company can help you develop a richer perspective, which is especially good for career development.
Synced: What’s your take on the hype around deep learning? There are people that are skeptical about its bottlenecks. Much of your previous research combines deep learning with traditional computer vision technology.
Xiaofeng Ren: Deep learning has made revolutionary progress for sure, and it has extended our capabilities to a great extent. However, there are people like me who have been immersed in the field for a long time. We have our own stream of ideas, and we feel that deep learning is not enough to solve many problems. We need another breakthrough, or to go ahead and explore for ourselves.
Synced: Based on past work experience, what is the relationship between basic research and product development?
Xiaofeng Ren: This depends on the company, the team, or the particular project you are working on. Generally speaking, I want products to lead the way, but at the same time for the R&D team to have space to do something else.
Synced: Is there a need to balance basic research and product development?
Xiaofeng Ren: For sure. Different companies have different approaches. Amazon, for example, does not rely exclusively on basic research or product development; the two are really intertwined. The products are constantly changing and updating.
Journalist: Yan Liu | Localization: Meghan Han | Editor: Michael Sarazen