Chinese AI talent in US – founder of AutoX Jianxiong Xiao
Today’s self-driving cars are rolling tech showrooms, perceiving their environments through LIDAR (Light Detection and Ranging), radar, GPS/IMU, passive visual, and ultrasonic proximity sensors. But a top LIDAR unit alone can cost tens of thousands of dollars, pricing the vehicles out of reach of the average consumer.
Jianxiong Xiao (肖健雄), founder of startup AutoX, developed his first self-driving car in six months with seven cameras costing a total of just US$500. Xiao believes he can bring an affordable camera-first self-driving car to the market in two years.
Prior to AutoX’s 2016 launch, Xiao was a respected professor at Princeton University and a top-tier expert in computer vision, deep learning and robotics. He has received numerous academic distinctions over the years: the Best Student Paper Award at the European Conference on Computer Vision (ECCV) in 2012 for his paper Reconstructing the World’s Museums, which explained how to recreate a museum’s internal three-dimensional structure from pictures taken by a Google Street View camera; Google Faculty Research Awards in 2014 and 2015; and the NSF/Intel VEC Research Award in 2016.
Xiao, 33, made this year’s Massachusetts Institute of Technology (MIT) Technology Review’s “Innovators Under 35” list in the Entrepreneurs category, for aiming “to make self-driving cars as widely accessible as computers are today.”
Synced sat down with Xiao at his Silicon Valley office. Relaxed in a white shirt, navy pants and a pair of khaki winklepickers, he told us that despite all the academic accolades, being named to the “Innovators Under 35” list had a special meaning. “For the first time, I was recognized as an entrepreneur! I think that’s a better fit for me.”
Xiao was born in Chaozhou, an eastern Guangdong city in China known for producing entrepreneurs. Hong Kong tycoon Li Ka-shing (李嘉誠) and Tencent chairman Pony Ma Huateng (馬化騰) both have roots in Chaozhou, and Xiao’s parents and grandparents were all businesspeople.
Big fan of computer vision
As a youngster, Xiao eschewed business for science. He was fascinated by computers and 3D vision reconstruction, which led him to pursue academic research. In 2009, he received his Master of Philosophy in Computer Science from the Hong Kong University of Science and Technology.
Xiao then entered the PhD program at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and upon graduation became an assistant professor at Princeton University. He was also the Founding Director of the Princeton Vision & Robotics Group, where he devoted himself to a challenging area: 3D deep learning.
Deep learning has made unprecedented progress in the areas of language (one-dimensional) and images (two-dimensional). However, existing methods are rarely applied to the three-dimensional data that drives a broad range of critical applications such as 3D object recognition, medical imaging, neuroscience, autonomous driving, and scientific simulations.
Over the past three years, Xiao has initiated or participated in many of the major research efforts in 3D deep learning, such as the introduction of ModelNet and ShapeNet (the largest existing 3D datasets); the release of the 3D deep learning framework Marvin, which laid the groundwork for latecomers to the field; and the introduction of Deep Sliding Shapes, a convolutional network formulation that takes a 3D volumetric scene derived from an RGB-D image as input and outputs 3D object bounding boxes.
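To give a flavor of what “volumetric” means here, the sketch below implements a single naive 3D convolution over a toy voxel grid in pure Python. This is an illustrative sketch only, with made-up sizes and values; it is not AutoX’s or Marvin’s code, and real 3D networks run many such filters on GPUs.

```python
# Toy 3D convolution over a voxel grid, illustrating the kind of
# volumetric operation that 3D deep networks apply at every layer.

def conv3d(volume, kernel):
    """Valid (no-padding) 3D convolution of two nested-list volumes."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(D - kd + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                s = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            s += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

# A 3x3x3 grid with one occupied voxel at its center, convolved with a
# 2x2x2 all-ones kernel: each output value counts occupied voxels in a window.
voxels = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
voxels[1][1][1] = 1.0
ones = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
result = conv3d(voxels, ones)  # 2x2x2 output; every window contains the center voxel
```

A detection system like Deep Sliding Shapes stacks many such convolutions and regresses 3D bounding boxes from the resulting feature volumes.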
“We were the pioneers of 3D deep learning,” says Xiao.
As early as 2013, Xiao considered starting a business based on self-driving cars. He believed, however, that the market was not yet developed, and so decided to wait, using the time to polish his skillset. Three years later, having become a respected academic in computer vision and robotics, Xiao decided the self-driving car market had absorbed sufficient capital and talent, and embarked on his entrepreneurial journey.
Autonomous driving is now AI’s hottest application field. Car manufacturers and Internet giants alike understand what an enormous transformation autonomous driving will bring to the auto and transportation industries and society in general. Everyone wants a piece of the action: in late August, Samsung received permission to test self-driving cars on California roads.
SAE International (the Society of Automotive Engineers) categorizes driving automation into six levels, from Level 0 (No Automation) to Level 5 (Full Automation). Level 1 (Driver Assistance) includes a single driving system, such as cruise control, that helps the driver steer or accelerate. Level 2 (Partial Automation) is a little fancier: two or more systems handle steering and acceleration together, though the driver must stay engaged.
Most self-driving car developers today aim at Level 3 (Conditional Automation) or Level 4 (High Automation). The distinction between them is that L3 expects human drivers to respond appropriately to a request to intervene, while L4 does not. Between 2020 and 2023, self-driving cars at L3/L4 are expected to appear on city streets and highways.
The ultimate goal for self-driving cars is to reach L5 (Full Automation), wherein human drivers are not required. Some experts believe L5 will be realized by 2030 or sooner, while others remain skeptical about the possibility of ever reaching L5.
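The taxonomy above lends itself to a simple lookup table. The Python sketch below paraphrases the SAE J3016 levels informally (my wording, not the official standard’s) and captures the key L3/L4 distinction: whether a human must be ready to take over.

```python
# SAE J3016 driving-automation levels, paraphrased informally.
SAE_LEVELS = {
    0: "No Automation: the human driver does everything",
    1: "Driver Assistance: one system assists with steering or acceleration",
    2: "Partial Automation: combined steering and acceleration; driver supervises",
    3: "Conditional Automation: system drives; human must respond to intervention requests",
    4: "High Automation: no human response needed within the system's operating domain",
    5: "Full Automation: no human driver required anywhere",
}

def requires_human_fallback(level):
    """True when a human must be ready to take over (Levels 0 through 3)."""
    return level <= 3
```

The boolean cutoff mirrors the distinction drawn above: L3 still expects a human to respond to a request to intervene, while L4 and L5 do not.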
“It is ridiculous that some startups are aiming to achieve L5 in the next few years,” says Xiao. “I wonder how such business models will even allow them to survive for five years.”
For Xiao, the possibility of “Democratizing Autonomy” (AutoX’s slogan) depends not on developing new tech, but on exploiting the full potential of the good old camera.
Focus on the camera
As an important automatic driving sensor, cameras are mainly used for target identification and object tracking tasks such as lane detection, traffic signal detection, pedestrian detection, and so on.
Self-driving solutions based solely on cameras are rare in the market. Mobileye is one example. Founded in 1999, the company focuses on vision-based driver assistance technology, currently geared toward co-piloting at L1/L2. Tesla was one of Mobileye’s early clients, but dropped the company’s system after a fatal 2016 Autopilot accident. Tesla’s current self-driving cars use cameras, radar, ultrasonic arrays and GPS data.
While virtually all available self-driving systems include cameras, questions remain regarding their safety. Some manufacturers prefer LIDAR, which does not depend on ambient light and can precisely determine positions by timing reflected laser pulses to measure distances.
Though Xiao is not against LIDAR, he has always believed that cameras should be prioritized. “The camera’s potential is greatly underestimated. Cameras can be theoretically more effective than human eyes.”
AutoX’s seven monocular cameras were selected from more than 300 commercially available models, but at this stage they still fall short of what Xiao is looking for in hardware standards, automation, high-dynamic-range imaging, night vision and algorithm requirements.
Xiao says that with the development of mobile phones over the past decade, camera technology has also progressed rapidly. “Camera makers have the ability to create cameras to meet demand. We know what we need, and they can custom make it,” Xiao says.
Once perfected, Xiao plans to offer his camera software modules to interested auto makers. He didn’t disclose specific application scenarios, but referred to two paths suitable for AutoX’s democratization agenda: complete self-driving cars in limited environments, such as logistics delivery services or airport transportation; and semi-autonomous driving at L2/L3.
Cost is another key component in Xiao’s vision. By 2019, AutoX expects to achieve a camera-based L2.5/L3 self-driving ability. As an entrepreneur, Xiao needs to calculate the commercial viability of autonomous driving at this level and in different markets.
AutoX released its first test video this March, with a Lincoln MKZ equipped with seven monocular cameras. The car performed successfully under four different weather conditions: sunny day, rainy day, cloudy day and cloudy night.
Dovey Wan, Managing Director at AutoX investor Danhua Capital, likes what she’s seen from the company so far: “I was super impressed by the team’s execution capability when I saw their first demo in January. In only a little over two months they completed a demo on busy urban streets using only two low-end cameras.”
Wan is convinced that “camera first, sensor fusion second” is a viable approach. “We humans never fire lasers, and without any navigation or maps we may get lost, but can still drive safely.”
Regarding software, Xiao is confident his team of computer vision experts — drawn from leading US universities, Google, Microsoft and Facebook — can satisfactorily improve the robustness of AutoX algorithms.
Xiao’s nickname, Professor X, refers to the fictional Professor Charles Francis Xavier, head of Marvel’s X-Men. Xiao likens himself to Xavier not for any superpowers, but for his ability to bring different people together to do something impactful for society.
Journalist: Tony Peng | Editor: Michael Sarazen