Amazon Web Services has unveiled two chips and 13 machine learning capabilities and services at its AWS re:Invent conference in Las Vegas. The releases reflect Amazon’s determination to attract more developers to AWS by broadening its range of tools and services. The stock market reacted favourably, with Amazon shares rising six percent after the announcement.
Amazon’s industry-leading cloud business has faced heated competition from Google Cloud Platform and Microsoft Azure for years. Google’s homegrown AI chip — the Tensor Processing Unit (TPU) — was introduced in 2016 and is already in its third generation. Although Microsoft Azure has yet to release its own custom chips, its rapid expansion and strong growth illustrate just how competitive the cloud business has become, and how important innovation is for players who want to stay in the race.
Catching up with other public cloud vendors in the AI chip market, AWS has finally launched its own machine learning inference chip, the AWS Inferentia. “Inference” is the process of applying an already-trained machine learning model to new data to produce predictions. The Inferentia chip is designed for inference but can also handle larger workloads, and is compatible with popular frameworks including TensorFlow, Apache MXNet, and PyTorch.
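To make the training/inference distinction concrete, here is a minimal, framework-agnostic sketch in NumPy. The weights and input below are hypothetical, not drawn from any AWS example; the point is only that inference is a forward pass through fixed, already-learned parameters, with no further training.

```python
import numpy as np

# Hypothetical parameters, assumed to have been learned earlier
# during a training phase (not shown).
weights = np.array([0.4, -0.2, 0.1])
bias = 0.5

def predict(x):
    """One inference step: a forward pass through a linear model.

    No parameters are updated here -- that is what distinguishes
    inference from training.
    """
    return float(weights @ x + bias)

# A new, previously unseen input; inference yields a prediction.
sample = np.array([1.0, 2.0, 3.0])
print(predict(sample))  # 0.4*1 - 0.2*2 + 0.1*3 + 0.5 = 0.8
```

Dedicated inference chips such as Inferentia accelerate exactly this kind of forward-pass computation, which dominates production workloads once a model is deployed.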
Inferentia delivers hundreds of teraflops per chip and thousands of teraflops per Amazon EC2 instance, and supports multiple data types including INT8, mixed-precision FP16, and bfloat16. The chip achieves improved performance while lowering power consumption and costs for both training and inference.
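The reduced-precision types named above trade numerical accuracy for throughput and memory savings. The sketch below illustrates this with NumPy's native float16 and int8 types (NumPy has no built-in bfloat16, so it is omitted); the quantization scale factor is a made-up value for illustration, not anything AWS-specific.

```python
import numpy as np

x = np.float32(3.14159265)

# FP16 keeps only 10 mantissa bits, so the value is rounded.
half = np.float16(x)
print(half)  # 3.140625 -- close to, but not equal to, the FP32 value

# INT8 quantization sketch: scale a float into 8-bit integer range,
# then dequantize. The scale of 0.05 is a hypothetical choice.
scale = 0.05
q = np.int8(round(float(x) / scale))  # quantized representation
print(q * scale)                      # approximate reconstruction
```

Inference workloads tolerate this rounding well, which is why inference chips lean on low-precision arithmetic to raise throughput per watt.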