In 1993, on his 30th birthday, Jensen Huang gave himself the best present ever by founding NVIDIA, the graphics-processor company where he still serves as CEO. Huang and his NVIDIA team pioneered the graphics processing unit (GPU) in 1999, revolutionizing the visual performance of device displays. But even Huang never dreamt that his GPU would one day become a driving force in artificial intelligence (AI).
In 2011 at Stanford University, Andrew Ng, one of the top minds in AI, discovered that a dozen NVIDIA GPUs could perform as well as 2,000 CPUs in training deep learning models. A GPU contains thousands of cores and can process thousands of threads simultaneously. This parallel architecture makes GPUs extremely powerful at large-scale but computationally simple data operations.
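A minimal sketch (not NVIDIA code) of why this workload suits a GPU: training is dominated by the same simple multiply-add applied independently to every element of a large array, so thousands of cores can each take one element with no coordination between them. The classic SAXPY operation illustrates the pattern:

```python
def saxpy(a, xs, ys):
    """Compute a*x + y for each element pair.

    Each iteration is independent of the others; on a GPU, every
    iteration would run as its own lightweight thread.
    """
    return [a * x + y for x, y in zip(xs, ys)]

print(saxpy(2.0, [0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]))
# [1.0, 3.0, 5.0, 7.0]
```

A CPU steps through such a loop largely in sequence, while a GPU can dispatch all the elements at once, which is the source of the speedup Ng observed.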
Other top academic institutions and laboratories quickly followed Ng, deploying GPUs for deep learning research in the belief that GPUs could eliminate the computing bottlenecks that had vexed AI researchers for years and spark technological breakthroughs.
NVIDIA quickly recognized this trend and pivoted its strategy to become an AI computing company. Over the last six years, the Santa Clara chipmaker has been pushing the limits of GPU architecture, releasing a cutting-edge GPU every year or two to empower computing-hungry applications and products. Over the last five years, the number of developers with GPU expertise has grown tenfold, CUDA downloads have grown fivefold, and the total GPU FLOPS of the top 50 systems have grown 15 times.
Since 2009, NVIDIA has been hosting the annual GPU Technology Conference (GTC), which showcases company releases and provides an exhibition venue for GPU-based innovations. GTC attendance has soared from 2,000 attendees in 2012 to the 8,500 developers, buyers and innovators who went to Santa Clara, California for GTC 2018 last month.
The role and deployment of GPUs are changing dramatically, creating significant new market opportunities. Synced visited GTC 2018 to explore a world built on GPUs.
The NVIDIA squad
Tech giants usually have a capital investment arm — such as Google Ventures or Microsoft Ventures — to fund AI startups. NVIDIA GPU Ventures invests in and nurtures next-generation companies built on GPU. A number of NVIDIA portfolio companies have become high-profile AI startups: H2O.ai, Element.ai, and Drive.ai.
At GTC 2018, a large black semi truck parked outside the San Jose McEnery Convention Center became a centre of attention. The prototype was from TuSimple, a leading Chinese autonomous-driving truck company founded by entrepreneurs Mo Chen and Dr Xiaodi Hou from the California Institute of Technology.
The TuSimple semi operates on an accelerator fusion of NVIDIA GPUs — including the GTX 1080 Ti, NVIDIA Drive PX and Jetson TX — to process huge amounts of data.
Drive PX is NVIDIA’s first in-vehicle supercomputer, introduced in 2015. As the computing needs of self-driving vehicles have ramped up, NVIDIA has been developing powerful vehicle-specific processors to stay ahead of the race. The company recently unveiled the advanced Drive PX Pegasus and what it billed as the world’s most powerful in-vehicle System-on-Chip (SoC), Xavier.
Last June, TuSimple completed a 200-mile highly automated test drive from San Diego to Yuma, Arizona. The company’s fast tech development attracted interest from NVIDIA GPU Ventures, which joined a group of investors led by Chinese social media company Sina in putting more than US$20 million into TuSimple last August.
NVIDIA is not just writing checks. In June 2016, the company introduced a virtual incubator, the NVIDIA Inception Program, to nurture AI startups. In 18 months, over 2,000 companies have applied to the program, and only 70 have so far been accepted. NVIDIA also hosts the Inception Competition at GTC, where its portfolio companies compete for NVIDIA Inception Awards and US$1.5 million in prize money.
Israeli cybersecurity startup Deep Instinct was one of the winners last year. The company uses a GPU-based neural network and CUDA to achieve 99 percent detection rates, compared with about 80 percent detection from conventional cybersecurity software. Last June, NVIDIA pumped US$10 million into Deep Instinct. “NVIDIA introduced us to their strategic accounts which are now our customers, which has been very helpful,” said a Deep Instinct representative at GTC 2018.
On this year’s final pitch day, robotic arm company Kinema Systems won the NVIDIA Inception Award and took home US$375,000. The company’s flagship product is the AI-powered industrial robotic vacuum grabber Kinema Pick, which runs on GTX 1060 and NVIDIA embedded AI computing device Jetson TX2.
Kinema Founder and CEO Sachin Chitta says NVIDIA provides portfolio companies with “special treats,” such as a discount on NVIDIA hardware, a training course conducted by NVIDIA experts, and not least a chance to appear at GTC: “This conference provides a great opportunity for exposure. We have met many customers and investors who showed a huge interest in our products.”
GPUs are empowering healthcare
Tech giants believe AI can reimagine conventional diagnostic methodologies in healthcare, increasing accuracy and reducing costs. IBM has been using pathology slides to train deep neural networks to detect tumours since 2016. Google has successfully produced a tumour-probability heat map algorithm whose localisation score reached 89 percent, significantly outperforming pathologists’ average of 73 percent.
The healthcare field was not widely addressed at the last few GTCs, but NVIDIA is now putting more effort into this area.
At GTC 2018 NVIDIA unveiled Project Clara — a medical imaging supercomputer deployed on its cloud platform. Clara is designed to transform standard medical images such as X-rays, ultrasound scans, CTs, MRIs, PETs, and mammograms into high-resolution cinematic renderings. Because deploying GPUs in every clinic and hospital would not be cost-effective, Clara provides its users with high-performance cloud-based services.
NVIDIA added healthcare talks and panels at GTC 2018, inviting renowned professionals to present their GPU-based research achievements.
Thomas Fuchs is the Founder and CEO of New York startup Paige.ai, founded this January to fight cancer with AI. The company has access to a dataset of 25 million pathology images and financial support from Breyer Capital, which led a US$25 million Series A funding round.
In an interview with Synced, Fuchs said he believed the time was right to build Paige.ai because the requirements were all in place: qualified devices, extensive collection of medical images, and full-fledged deep learning algorithms.
Most importantly, GPU advancement allows deep learning to be applied to medical image data at unprecedented scale. Paige.ai has now built a high-performance computing cluster with hundreds of NVIDIA GPUs.
GPU databases are a real thing
GPUs will not replace the central processing unit (CPU), which remains extremely powerful at the arithmetic, logical, control and input/output (I/O) operations required by, for example, personal computers. However, GPUs are better than CPUs at particular tasks: they are unquestionably the best chip for image processing in gaming, and their parallel architecture happens to be very well-suited to deep learning.
A few innovative startups have discovered that GPU acceleration can also deliver better performance than CPU in databases, particularly in repetitive operations on large amounts of data.
Online Transaction Processing (OLTP) databases primarily handle day-to-day transactions, for example for banks. Oracle is the dominant vendor here, accounting for nearly 50 percent of the market. Online Analytical Processing (OLAP) databases, meanwhile, are designed for complex analysis of large volumes of data, and consequently many AI applications now run on OLAP databases. The OLAP database market is estimated to reach US$22.8 billion by 2020, a lucrative emerging market for startups to target.
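A toy illustration (with a hypothetical schema, not any vendor's actual engine) of the OLAP workload GPU databases accelerate: one simple predicate plus one reduction, applied uniformly to every row. Because each row is checked independently, the scan is embarrassingly parallel and maps cleanly onto thousands of GPU cores.

```python
def olap_sum(rows, target_region):
    """Roughly: SELECT SUM(amount) FROM sales WHERE region = target_region.

    Every row is tested and accumulated with the same simple operation,
    which is why GPU databases can parallelize scans like this so well.
    """
    return sum(amount for region, amount in rows if region == target_region)

# (region, amount) pairs -- an illustrative columnar-style table
sales = [(1, 10.0), (2, 5.0), (1, 7.5), (2, 2.5)]
print(olap_sum(sales, 2))  # 7.5
```

OLTP workloads, by contrast, involve many small, dependent read-modify-write transactions, which is why they remain CPU territory.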
GPU database companies first appeared in 2016. Silicon Valley pioneers Kinetica and MapD have raised US$50 million and US$37 million respectively, and Israel’s SQream US$31.5 million. Last year, Chinese database leader Zilliz raised an undisclosed amount (reportedly RMB 100 million, or about US$16 million).
Says Zilliz Founder and CEO Charles Xie, “Using a GPU to run a database can be traced back to 15 years ago in academia. But it didn’t work as the top software has long been constrained by the underlying hardware. Now, NVIDIA and other chipmakers are building an infrastructure for GPUs, helping developers to reduce the threshold of developing GPU-based applications.”
Zilliz’s GTC 2018 booth showcased the company’s cutting-edge GPU database, which it says can increase the performance of data processing up to 100 times over CPUs; reduce hardware cost by 10 times; and lower data centre operation and energy costs by up to 95 percent.
The SQream booth was a few meters from Zilliz’s. Founded in 2010, SQream spent its first six years conducting research and building databases, launching its commercial product in 2016.
“If you have tens of hundreds of terabytes [data], and you try to do some database operation on what we called Massive Parallel Processing (MPP) databases, sometimes the query goes up to 30 minutes to an hour. When you put data into the SQream instead of MPP, the query goes down to five minutes,” says SQream Senior Solutions Architect Arnon Shimoni.
As AI applications thrive with complex neural networks and large datasets, researchers and developers can of course invest in GPU clusters — but another option is to purchase on-demand services on the cloud for GPU-intensive tasks.
This demand is growing the market for GPU-as-a-Service (GaaS) solutions, which is set to exceed US$5 billion by 2024, according to a research report by Global Market Insights.
Major cloud service vendors AWS, Google Cloud, and Microsoft Azure have been hosting NVIDIA GPU-equipped virtual machines for their cloud machine learning services for some time. Google began incorporating NVIDIA GPUs into its cloud computing centres back in November 2016, just months after AWS announced a new Elastic Compute Cloud (EC2) instance type, dubbed P2, which leverages NVIDIA GPUs.
“GaaS will be used for augmented reality, but will also be able to handle massively parallel complex app problems like encryption (or decryption), weather forecasting, business intelligence graphical displays, big data comparisons,” writes Jack Gold, founder and principal analyst at analyst firm J. Gold Associates.
Although a few cloud services giants will dominate the market for GaaS, small and medium enterprises still have a chance to take a piece of the pie.
Cirrascale Cloud Services is a San Diego-based company that enables researchers and data scientists to attach GPU acceleration to a wide range of tasks over the network. Although its service is similar to AWS or Google Cloud, the company appeals to users who train models for weeks or even months at a time by offering a 35 percent lower price point and 35 percent faster speed compared to AWS.
Says Cirrascale Executive Account Manager Andrew Kruszewski, “We realized that there was a market, people [researchers] would come and test on the system, and they wouldn’t want to get off. They also didn’t want to own the equipment because NVIDIA changes their GPU so often.”
Launched just three years ago, Cirrascale’s cloud services have been growing so quickly that the company had to cut other divisions. Last year, its design and manufacturing business was sold to BOXX technologies.
Over the last two years, NVIDIA’s stock price has skyrocketed thanks to the rapidly increasing role AI is playing in the company’s revenue growth. Full-year 2017 revenue was US$9.71 billion, up 41 percent from a year earlier, and the company’s discrete GPU market share rose to 72.8 percent during the third quarter of 2017.
However, rivals Google and Intel are catching up, developing their own AI chips to challenge NVIDIA. This February, Google announced that its Tensor Processing Unit (TPU) — a custom chip that powers neural network computations — would be available in beta to researchers and developers on the Google Cloud Platform.
Last year, Google boasted that its TPUs were 15 to 30 times faster than contemporary GPUs and CPUs at inference, and delivered a 30 to 80 times improvement in TOPS/watt. In machine learning training, the TPU offers more raw performance (180 vs. 120 TFLOPS) and twice the memory capacity (64 GB vs. 32 GB) of NVIDIA’s top GPU, the Tesla V100.
Joe Pelissier, Distinguished Engineer at Cisco Systems, says that a serious challenger may even replace NVIDIA in the next three to five years.
“The type of mathematics you basically need to be able to do for machine learning is multiplication. Everything else is at least two orders of magnitude less significant. So you can imagine Silicon Valley, there are a lot of folks saying ‘hey, if I take out a lot of functionality that the GPU has, and only leave the stuff it needs for machine learning, I can either make it cheaper, or I can put more cores in it, or a combination of both’,” says Pelissier.
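A rough back-of-the-envelope check of Pelissier’s claim (the layer sizes below are illustrative, not from any cited benchmark): for a single dense neural-network layer, the matrix multiply costs about 2·b·n·m multiply-accumulate FLOPs, while the bias add and activation together cost only on the order of b·m.

```python
def layer_flops(batch, n_in, n_out):
    """Approximate FLOP counts for one dense layer.

    matmul: 2*b*n*m multiply-accumulates for the weight multiply.
    other:  roughly 2*b*m for the bias add plus activation.
    """
    matmul = 2 * batch * n_in * n_out
    other = 2 * batch * n_out
    return matmul, other

mm, other = layer_flops(64, 1024, 1024)
print(mm // other)  # 1024 -- multiplication dominates by ~3 orders of magnitude
```

The ratio is simply the layer width, so for realistic layer sizes the multiplies outweigh everything else by two to three orders of magnitude, which is exactly the asymmetry a stripped-down AI chip can exploit.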
Many experts believe that while NVIDIA GPUs were not initially created for AI, they are now so embedded in it that they cannot easily be replaced.
Brett Newman, VP of marketing and customer engagement at compute hardware company Microway, says NVIDIA has done a good job building the ecosystem. “They are making software tools better applied for deep learning training. And they are making developer friendly things like Digits [Deep Learning GPU Training System]. I think that stuff is setting them up for success that will persist into the long-term.”
Huang quipped at GTC 2018 that “NVIDIA is still a small company with only 10,000 employees.” But he is too humble. It’s no small achievement to have built NVIDIA from a birthday present into a US$150 billion chip giant in 25 years. And now, the company’s GPUs have become the muscle powering AI research and innovation. It’s been an incredible journey for NVIDIA, one that will continue to empower AI long into the future.
Journalist: Tony Peng | Editor: Michael Sarazen