Google announced this morning that its Tensor Processing Unit (TPU) — a custom chip that powers neural network computations for Google services such as Search, Street View, Google Photos and Google Translate — is now available in beta for researchers and developers on the Google Cloud Platform.
The TPU is a custom application-specific integrated circuit (ASIC) tailored for machine learning workloads on TensorFlow. Google introduced the TPU two years ago and released the second-generation Cloud TPU last year. While the first-generation TPU was used for inference only, the Cloud TPU is suitable for both inference and training. Built with four custom ASICs, each Cloud TPU delivers 64 GB of high-bandwidth memory and up to 180 TFLOPS of performance.
Before it opened its TPUs to the public, Google had deployed them widely in-house. AlphaGo, the Google AI program that beat human champions at the ancient Chinese board game Go, used 48 TPUs for inference.
Cloud TPU is designed to shorten the training time of machine learning models. Google Brain team lead Jeff Dean tweeted that a Cloud TPU can train a ResNet-50 model to 75% accuracy in 24 hours.
When Cloud TPU was announced, Google offered 1,000 free devices to machine learning researchers. Lyft, the second-largest ride-hailing company in the US, has been using Cloud TPUs in its self-driving systems since last year. Anantha Kancherla, the company’s Head of Software, Self-Driving Level 5, says, “Since working with Google Cloud TPUs, we’ve been extremely impressed with their speed — what could normally take days can now take hours.”
Alfred Spector, CTO of New York City-based hedge fund Two Sigma, says, “we found that moving TensorFlow workloads to TPUs has boosted our productivity by greatly reducing both the complexity of programming new models and the time required to train them.”
Google’s Cloud TPU is currently available only in beta, in limited quantities. Developers can rent Cloud TPUs at US$6.50 per Cloud TPU per hour, which seems a reasonable price given their compute capability.
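As a rough illustration of what that rate means in practice, the Python sketch below multiplies the quoted US$6.50 hourly price by a run length. The 24-hour figure is borrowed from the ResNet-50 training time cited earlier in this article. The function name is hypothetical, and the estimate assumes a flat hourly rate with no other charges (storage, networking and the host virtual machine are ignored).

```python
# Back-of-the-envelope rental cost for Cloud TPUs at the quoted beta price.
HOURLY_RATE_USD = 6.50  # US$ per Cloud TPU per hour, as quoted in the article


def cloud_tpu_cost(hours, num_tpus=1, rate=HOURLY_RATE_USD):
    """Return the estimated rental cost in US dollars (flat hourly billing assumed)."""
    if hours < 0 or num_tpus < 0:
        raise ValueError("hours and num_tpus must be non-negative")
    return hours * num_tpus * rate


if __name__ == "__main__":
    # 24 hours is the ResNet-50 training time attributed to Jeff Dean's tweet.
    print(f"24-hour run on one Cloud TPU: ${cloud_tpu_cost(24):.2f}")
```

Under these assumptions, a 24-hour run on a single Cloud TPU comes to US$156.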
Google also released several reference model implementations to save developers time and effort when writing programs for Cloud TPUs, including ResNet-50 and other popular image classification models, a Transformer for machine translation and language modeling, and RetinaNet for object detection.
While Google is not directly selling its TPU chips to customers at this stage, their availability represents a challenge to Nvidia, whose GPUs are currently the world’s most-used AI accelerators. Even Google has used large numbers of Nvidia GPUs to provide accelerated cloud computing services. However, if researchers switch from GPUs to TPUs as expected, Google’s dependency on Nvidia will decrease.
Last year, Google boasted that its TPUs were 15 to 30 times faster than contemporary GPUs and CPUs at inference, and delivered a 30 to 80 times improvement in TOPS/Watt. For machine learning training, the Cloud TPU offers higher peak performance (180 vs. 120 TFLOPS) and four times the memory capacity (64 GB vs. 16 GB) of Nvidia’s best GPU, the Tesla V100.
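The training comparison reduces to two ratios, which the short Python sketch below works out from the figures quoted above. The dictionary layout and the helper name are illustrative only, and the numbers are peak specifications as reported in this article, not measured benchmark results.

```python
# Peak specifications as quoted in the article (not measured results).
specs = {
    "Cloud TPU": {"tflops": 180, "memory_gb": 64},
    "Tesla V100": {"tflops": 120, "memory_gb": 16},
}


def ratio(metric, a="Cloud TPU", b="Tesla V100"):
    """Return how many times larger device a's value is than device b's."""
    return specs[a][metric] / specs[b][metric]


if __name__ == "__main__":
    print(f"Peak TFLOPS ratio: {ratio('tflops'):.1f}x")         # 1.5x
    print(f"Memory capacity ratio: {ratio('memory_gb'):.1f}x")  # 4.0x
```

On paper, then, the Cloud TPU has 1.5 times the peak throughput and four times the memory of the Tesla V100.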
Although it’s too early to crown the Cloud TPU as the AI chip champion, the announcement of its availability has sparked excitement among researchers, and marks the beginning of Google’s ambitious move into the space of AI accelerators.
Journalist: Tony Peng | Editor: Michael Sarazen