Alibaba is well aware of the growing demand for dedicated compute to power today’s AI applications. Last year, the Hangzhou-based tech giant launched its semiconductor subsidiary Pingtouge (“Honey Badger” in Chinese) to develop embedded chips and neural network accelerators. At the time, Alibaba CTO Jeff Zhang pledged that Pingtouge would produce the world’s most advanced neural network chip by the middle of this year.
Today, Alibaba kept its promise. At the Alibaba Cloud (Aliyun) Apsara Conference 2019, Pingtouge unveiled its first dedicated AI processor for large-scale, cloud-based AI inferencing. The Hanguang 800 is the first semiconductor product in Alibaba’s 20-year history.
Also announced was a new AI cloud service based on the Hanguang 800, which Aliyun says is 100 percent more efficient than traditional GPU-based services.
The 12-nm Hanguang 800 contains 17 billion transistors. In a ResNet-50 image classification inference benchmark, the Hanguang 800’s peak performance is 78,563 images per second (IPS). Zhang says the Hanguang 800 is 15 times more powerful than the NVIDIA T4 GPU and 46 times more powerful than the NVIDIA P4 GPU. The chip’s peak efficiency is 500 IPS/W.
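To put those figures in context, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that the announcement does not confirm: that “times more powerful” refers to ResNet-50 throughput, and that peak throughput and peak efficiency occur at the same operating point. The function names are illustrative only, and the derived numbers are estimates, not published specifications.

```python
# Figures quoted in the announcement.
PEAK_IPS = 78_563            # ResNet-50 images per second
PEAK_EFFICIENCY = 500        # images per second per watt

def implied_power_watts(ips: float, ips_per_watt: float) -> float:
    """Power draw implied by a throughput and an efficiency figure.

    Assumption: both figures describe the same operating point.
    """
    return ips / ips_per_watt

def implied_baseline_ips(ips: float, speedup: float) -> float:
    """Baseline throughput implied by a claimed speedup.

    Assumption: "N times more powerful" means N times the throughput.
    """
    return ips / speedup

power_w = implied_power_watts(PEAK_IPS, PEAK_EFFICIENCY)
t4_ips = implied_baseline_ips(PEAK_IPS, 15)
p4_ips = implied_baseline_ips(PEAK_IPS, 46)

print(f"Implied power at peak: {power_w:.1f} W")
print(f"Implied T4 throughput: {t4_ips:.0f} IPS")
print(f"Implied P4 throughput: {p4_ips:.0f} IPS")
```

If either assumption fails, for example if peak efficiency is measured at a lower clock than peak throughput, the implied power figure would not apply.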
The Pingtouge chip development team took seven months to complete the chip design process and another three months for tape-out.
The Hanguang 800 is being deployed across many application scenarios within Aliyun, ranging from video classification to smart city applications. For example, the company’s popular Pailitao platform applies visual image search to e-commerce, allowing customers to search for items by taking a photo of the query object. Using AI-based image recognition and indexing powered by the new Hanguang 800, Aliyun can increase image processing efficiency 12-fold compared to GPUs.
With regard to smart city tech, Aliyun says it previously used 40 traditional GPUs to process videos of central Hangzhou with a latency of 300 ms. Now the task requires only four Hanguang 800 chips, with a lower latency of 150 ms. In the near future, the chip is also expected to be used for medical imaging and autonomous driving, says Zhang.
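The smart city comparison reduces to two simple ratios, sketched below in Python. The helper name is hypothetical, and the calculation takes Aliyun’s quoted figures at face value without knowing which GPU model or workload details lie behind them.

```python
def reduction_factor(before: float, after: float) -> float:
    """How many times smaller the 'after' value is than the 'before' value."""
    return before / after

# Figures quoted by Aliyun for the central Hangzhou video workload.
device_factor = reduction_factor(40, 4)       # GPUs vs. Hanguang 800 chips
latency_factor = reduction_factor(300, 150)   # milliseconds

print(f"Devices needed: {device_factor:.0f}x fewer")
print(f"Latency: {latency_factor:.0f}x lower")
```

In other words, the quoted setup uses one tenth as many devices while halving latency.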
Alibaba is not directly selling its Hanguang 800 chips to customers at this stage, just as Google is not selling its TPU. Developers can rent Hanguang 800 time on the new AI cloud service Aliyun announced today, which asks developers to describe what they want to do with the service and request a cloud compute quota. Aliyun says the new service is 100 percent more cost-effective than traditional GPUs.
The Hanguang 800 release is the latest in a series of tech announcements from Alibaba. At August’s World Artificial Intelligence Conference (WAIC) in Shanghai, Pingtouge launched “Wujian” (“No Sword” in Chinese), a system-on-a-chip (SoC) design platform for AI and IoT scenarios.
In July, Pingtouge introduced a RISC-V (Reduced Instruction Set Computer) processor. The Xuantie 910 will be used as core IP to produce high-end edge-based microcontrollers (MCUs), CPUs, and SoCs. The processor is tailored for 5G, artificial intelligence, and IoT, and will be open-sourced in the near future.
Chinese tech giants Huawei, Alibaba, and Baidu have all hopped on the AI accelerator bandwagon. While the Hanguang 800 is designed for inferencing only, Huawei’s new Ascend 910 AI computing chip can handle both training and inference for AI models, packing twice the performance of rival NVIDIA’s Tesla V100. Baidu last year unveiled its edge-to-cloud chip Kunlun, which includes the training chip “818–300” and the inference chip “818–100”. The 14-nm Kunlun delivers performance of 260 TOPS while consuming 100 watts of power.
Despite the Hanguang 800’s eye-opening performance, Alibaba Pingtouge remains a fledgling in the semiconductor business that faces a long tech journey before it can hope to join the world’s leading chip producers.
Journalist: Tony Peng | Editor: Michael Sarazen