Five years ago a University of Toronto team led by Dr. Geoffrey Hinton used two GPUs to train the image recognition model AlexNet in a record time of six days — and GPUs have been powering AI research ever since. However, as researchers take on more challenging tasks, they need more compute power. NVIDIA Founder and CEO Jensen Huang believes he has the answer: “The world needs a gigantic GPU.”
At the GPU Technology Conference in Santa Clara, USA today, Huang unveiled the world’s largest GPU: a binary beast packing 16 Tesla V100 GPUs, each with memory doubled to 32 GB, for a combined 81,920 CUDA cores, 2,000 TFLOPS of Tensor Core performance, and 300 GB/s of bandwidth between GPUs.
It’s like looking under the hood of a muscle car. And it’s an AI researcher’s dream.
Incorporating 16 GPUs in a single machine raised huge technical challenges in GPU interconnectivity. To solve them, NVIDIA developed a new GPU interconnect fabric, NVSwitch, which builds on NVIDIA NVLink and delivers five times the bandwidth of the best PCIe switch, enabling systems with far more tightly interconnected GPUs.
NVIDIA announced that its gigantic GPU is now integrated into the DGX-2, the company’s latest supercomputer for offices and data centers. The DGX-2 is the world’s first system to deliver two PFLOPS of performance. It has 512 GB of HBM2 GPU memory and 1.5 TB of system memory, and consumes 10,000 watts of power. The DGX-2 can train AlexNet in just 18 minutes, roughly 500 times faster than the Hinton team in 2012.
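The headline numbers above can be cross-checked with simple arithmetic. The snippet below is only a back-of-the-envelope sketch using the figures quoted in this article. The constant and function names are illustrative, and the per-GPU values are derived by dividing the totals evenly across 16 GPUs, which is an assumption rather than an NVIDIA specification.

```python
# Figures quoted in the article; names are illustrative, not NVIDIA's.
NUM_GPUS = 16
MEMORY_PER_GPU_GB = 32
TOTAL_CUDA_CORES = 81_920
TOTAL_TENSOR_TFLOPS = 2_000
ALEXNET_2012_DAYS = 6        # Hinton team's training time
ALEXNET_DGX2_MINUTES = 18    # DGX-2 training time


def summarize():
    """Derive totals and per-GPU values, assuming an even split across GPUs."""
    alexnet_2012_minutes = ALEXNET_2012_DAYS * 24 * 60
    return {
        "total_hbm2_gb": NUM_GPUS * MEMORY_PER_GPU_GB,
        "cuda_cores_per_gpu": TOTAL_CUDA_CORES // NUM_GPUS,
        "tensor_tflops_per_gpu": TOTAL_TENSOR_TFLOPS / NUM_GPUS,
        "total_pflops": TOTAL_TENSOR_TFLOPS / 1000,
        "alexnet_speedup": alexnet_2012_minutes / ALEXNET_DGX2_MINUTES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

On these figures, 16 GPUs with 32 GB each account for the 512 GB of HBM2, 2,000 TFLOPS equals two PFLOPS, and six days against 18 minutes works out to a speedup of 480, which the article rounds to 500.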
The DGX-2 is aimed at academic institutions and established enterprises that need substantial computing power for AI research. It will go on sale in the third quarter for US$399,000. Huang joked in his keynote speech: “The more GPUs you buy, the more money you save!”
The release of the new GPU and the DGX-2 is expected to consolidate NVIDIA’s data center business, which doubled to US$2 billion in annual revenue in 2017 to become the company’s second largest revenue source. Last December, NVIDIA controversially prohibited the deployment of its consumer GeForce GPU series in data centers. The move was widely seen as a measure to defend the company’s own data center business.
Also announced today was Clara, a data center medical imaging supercomputer that lets researchers train models for tasks such as 3D image reconstruction, brain tumor detection, and cinematic rendering.
NVIDIA enhanced its cloud platform NVIDIA GPU Cloud with the release of TensorRT 4.0, the company’s latest high-performance deep learning inference optimizer. TensorRT 4.0 can accelerate AI applications such as image recognition, speech synthesis, and natural language processing, and reduce data center power consumption by 70%. It integrates with today’s most widely used open-source AI framework, Google TensorFlow 1.7.
The NVIDIA GPU Cloud also added support for Kubernetes, a portable, extensible open-source platform for managing containerized workloads and services. Launched by Google in 2014, Kubernetes can help the NVIDIA GPU Cloud manage computing resources, particularly cloud data centers, through cluster orchestration. This enables portability across infrastructure providers.
Amazon Web Services, Google Cloud Platform, AliCloud, and Oracle Cloud users can access the NVIDIA GPU Cloud.
NVIDIA is sending a message to the AI community: its “gigantic GPU” will save researchers time training AI models so they can put more time into AI innovation. While Intel and Google have been catching up in the AI computing market in recent years, NVIDIA’s new product announcements are expected to increase its market share in the critical data center business and dramatically expand its influence in the cloud.
Journalist: Tony Peng | Editor: Michael Sarazen