Embedded AI can transform a tabletop speaker into a personal assistant; give a robot brains and dexterity; and turn a smartphone into a smart camera, music player, or game console. Traditional processors, however, lack the computational power to support many of these intelligent features. Chipmakers, startups, and investors are seizing this market opportunity.
According to a Gartner report, the chip market’s total revenue hit US$400 billion in 2017, and the figure is expected to exceed US$459 billion in 2018. Traditional chip makers are putting an increasing focus on AI chip development, venture capital is pumping significant investments into the market, and AI chip startups are emerging.
Major Processor Types in the AI Industry
CPUs (Central Processing Units) are chips designed for general-purpose computing, with an emphasis on calculation and logic control. They are strong at processing complex sequential tasks one at a time, but weak at large-scale data computation.
GPUs (Graphics Processing Units) were originally designed for image processing but have been successfully adopted for AI. A GPU contains thousands of cores and can process thousands of threads simultaneously. This parallel computing design makes GPUs extremely powerful in large-scale data computation.
FPGAs (Field Programmable Gate Arrays) are programmable logic chips. This type of processor is powerful at handling small-scale but intensive data access. In addition, FPGA chips allow users to program the circuit paths through their tiny logic blocks to implement any kind of digital function.
ASICs (Application-Specific Integrated Circuits) are highly customized chips tailored to provide superior performance in specific applications. However, a customized ASIC cannot be altered once it has been put into production.
Other chip types, such as Neuromorphic Processing Units (NPUs), whose architecture mimics that of the human brain, have the potential to become mainstream in the future but are still at an early stage of development.
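The contrast between sequential and parallel processing described above can be illustrated with a small sketch. It assumes a toy matrix-vector product, with Python's standard thread pool standing in for many hardware cores; the function names, the worker count, and the data are hypothetical, and the example makes no attempt to model real CPU or GPU performance.

```python
from concurrent.futures import ThreadPoolExecutor

def dot(row, vec):
    """One independent unit of work: a single output element."""
    return sum(a * b for a, b in zip(row, vec))

def matvec_sequential(matrix, vec):
    # Sequential style: compute each output element one after another.
    return [dot(row, vec) for row in matrix]

def matvec_parallel(matrix, vec, workers=4):
    # Parallel style: every output element is independent of the others,
    # so the rows can be handed to several workers at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: dot(row, vec), matrix))

# Hypothetical toy data.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
vec = [1, 0, -1]

seq_result = matvec_sequential(matrix, vec)
par_result = matvec_parallel(matrix, vec)
```

Both functions return the same values; the point is only that the work splits into pieces that do not depend on one another, which is the property that parallel hardware exploits.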
What is an AI chip?
AI chips, also known as AI accelerators, are processors designed for AI-related computing tasks. Machine learning places great demands on computing power for training algorithms and running applications, demands that traditional computing hardware cannot meet. As a result, demand for specialized AI chips is growing rapidly. AI chips can be divided into three major application areas: training, inference on the cloud, and inference on edge devices.
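The difference between training and inference can be sketched in a few lines. The example below is a minimal illustration, assuming a one-variable linear model fitted by plain gradient descent; the function names, learning rate, epoch count, and data are all hypothetical and are not taken from any vendor's toolchain.

```python
def predict(weight, bias, x):
    """Inference: a single forward pass with fixed parameters."""
    return weight * x + bias

def train(samples, epochs=1000, lr=0.05):
    """Training: repeatedly adjust parameters to reduce error on the data."""
    weight, bias = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = predict(weight, bias, x) - y
            # Accumulate the gradient of the mean squared error.
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        weight -= lr * grad_w
        bias -= lr * grad_b
    return weight, bias

# Hypothetical data generated from y = 2x + 1.
samples = [(float(x), 2.0 * x + 1.0) for x in range(5)]

weight, bias = train(samples)             # many passes over the data
prediction = predict(weight, bias, 10.0)  # one pass for one input
```

Training here runs the forward pass thousands of times and also computes gradients, while inference runs it once per input. Real models are far larger, but the same asymmetry helps explain why training and inference place different demands on hardware.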
Major Application Areas
Training is the process in which algorithms analyze data, learn from it, and ultimately acquire the intelligence to respond to real-world events. Trillions of data samples are analyzed by the algorithm during this process. Chip makers must not only ramp up processor performance but also provide an entire ecosystem (including hardware, frameworks, and other supporting tools) that enables developers to shorten their AI development cycles. Given these challenges, it is major companies such as NVIDIA and Google that are thriving in this space.
NVIDIA is the leader in training. Developers' discovery that the GPU's parallel computing architecture could accelerate deep learning training gave GPU giant NVIDIA a significant advantage. Seizing the opportunity, NVIDIA transformed itself into an AI computing company and developed a new GPU architecture, Volta, which emphasizes deep learning acceleration. NVIDIA's GPUs have been widely adopted for training machine learning algorithms, and the company now holds a virtual monopoly in the training hardware market.
Google is another big player in this market. Given the achievements of AlphaGo and the millions of users on its cloud service, Google has strong potential in the training market. The company has developed its own TPUs (Tensor Processing Units) to compete with NVIDIA. TPUs are a type of ASIC designed exclusively for deep learning and Google's TensorFlow framework. Google says its TPU can provide 180 teraflops of floating-point performance, six times that of NVIDIA's latest data center GPU, the Tesla V100.
Inference on cloud
A trained machine learning model for applications such as image recognition or machine translation is usually highly complex, and its inference is often too compute-intensive to run on edge devices. Inference on the cloud is therefore necessary for deploying many AI applications. When an app is used by thousands of people simultaneously, the cloud server also needs robust capacity to meet inference demand. In such cases, FPGAs are the top choice for cloud companies. This type of processor is good at low-latency streaming and compute-intensive tasks. In addition, FPGAs offer flexibility, allowing cloud companies to reprogram the chips. Traditional chip makers, cloud service providers, and startups are all developing FPGA solutions.
Intel is one of the major players developing heterogeneous computing technology. By acquiring chip maker Altera, Intel boosted its FPGA technology expertise and developed a CPU+FPGA hybrid chip for deep learning inference on cloud. By utilizing the advantages of both processor types, this hybrid chip provides computing power, high memory bandwidth, and low latency. This technology has been adopted by Microsoft to accelerate its Azure Cloud Service.
Chinese tech giant Tencent is an example of a cloud service provider developing FPGA solutions to support inference on the cloud. Tencent developed China's first "FPGA Cloud Computing" service for its Cloud Virtual Machine (CVM) offering. Compared to a CPU-based cloud server, the FPGA-integrated CVM provides more computing power to support HPC applications and deep learning development. Accessing FPGAs on the cloud also eliminates the need to purchase hardware, reducing the cost of developing AI applications. Tencent also supports third-party AI application development for commercial use.
DeePhi Tech is a startup focused on inference on the cloud. The company garnered US$40 million in funding to develop its DPU (Deep-Learning Processing Unit, an FPGA-based ASIC) platform. With the DNNDK (Deep Neural Network Development Kit), DeePhi Tech aims to provide a one-stop service for the development and deployment of deep learning technologies. DeePhi co-founder Dr. Song Han is a respected AI researcher who proposed a methodology called "Deep Compression", which reduces model size, workload, and power consumption in order to improve deep learning efficiency. This methodology has been adopted by chip giants such as Intel and NVIDIA.
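The idea of reducing model scale can be made concrete with a small sketch. Deep Compression as published combines weight pruning, quantization with shared weights, and Huffman coding; the example below shows only heavily simplified versions of the first two steps, with no retraining. The weight values, the pruning threshold, and the fixed set of shared levels are hypothetical (the published method learns its shared values rather than fixing them), and this is not DeePhi's implementation.

```python
def prune(weights, threshold):
    """Magnitude pruning: zero out weights whose absolute value is small."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize(weights, levels):
    """Weight sharing: snap each nonzero weight to the nearest shared value."""
    return [min(levels, key=lambda c: abs(c - w)) if w != 0.0 else 0.0
            for w in weights]

# Hypothetical weights from one layer of a model.
weights = [0.91, -0.03, 0.48, 0.02, -0.52, 1.05, -0.01, -0.97]

pruned = prune(weights, threshold=0.1)
compressed = quantize(pruned, levels=[-1.0, -0.5, 0.5, 1.0])

# Fraction of weights that no longer need to be stored or multiplied.
sparsity = compressed.count(0.0) / len(compressed)
```

After these two steps, the layer holds only four distinct nonzero values, so each surviving weight can be stored as a short index into a small table rather than as a full floating-point number, and the zeroed weights can be skipped entirely.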
Inference on edge
Internet connections may not always be stable, and the cloud cannot accommodate all computing loads for AI innovations. Future edge devices will therefore need to perform more of their inference independently. Smartphones, drones, robots, VR and AR devices, self-driving cars, and similar products all require specific AI hardware support. Moreover, breakthroughs in recent years have reduced chip size, allowing chips to be embedded in almost any device and making inference on the edge more viable. To meet the demands of different devices, numerous startups are producing their own ASICs. Large chip makers are also adding AI-supporting features to their processors.
Leading Chinese phone and processor producer Huawei is boosting the performance of its SoCs by integrating AI chips. In collaboration with chip startup Cambricon, Huawei adopted an NPU (Neural Processing Unit, a type of ASIC from Cambricon) to advance the Kirin 970 SoC in its flagship Mate 10 smartphone. This integration enhances the image processing features of the phone's camera.
Chinese startup WestWell Lab's DeepSouth neural processors are ASICs that simulate human brain neurons. The company created a DeepSouth-based brain simulator that can be used to accelerate medical devices supporting research in Parkinson's, Alzheimer's, and neural impairment.
Horizon Robotics is another startup concentrating on embedded artificial intelligence. The company has developed two types of ASICs to support different AI applications. Sunrise series processors are for face recognition and video analytics solutions in smart cameras. Journey series processors are for self-driving cars, and provide real-time detection and recognition across eight categories.
AI is far from mature, and the chip market will fluctuate as the AI innovation ecosystem continues to develop. With new frameworks for algorithm development possibly emerging, current leaders in the training hardware market may face new competition. The cloud inference market is also still growing, and competition between cloud service providers will intensify as more AI applications are developed. The edge inference market, meanwhile, is an arena contested by both big companies and startups.
Given the ever-increasing demands of AI applications, we can expect to see more collaborations between chipmakers and developers. Artificial intelligence has already had a significant impact on the chip market, a trend that will continue into the foreseeable future.
 Chip market to top $400 billion in 2017, says Gartner: http://www.eenewseurope.com/news/chip-market-top-400-billion-2017-says-gartner-0
 详细分析人工智能芯片 CPU/GPU/FPGA有何差异? [A detailed analysis of AI chips: how do CPUs, GPUs, and FPGAs differ?]: http://www.sohu.com/a/131606094_470053
 What’s the Difference Between a CPU and a GPU?: https://blogs.nvidia.com/blog/2009/12/16/whats-the-difference-between-a-cpu-and-a-gpu/
 Difference Between FPGA and CPLD: http://www.differencebetween.net/technology/difference-between-fpga-and-cpld/
 ASIC and SoC: https://www.eetimes.com/author.asp?doc_id=1285201
 一文看懂人工智能芯片的产业生态及竞争格局 [Understanding the AI chip industry ecosystem and competitive landscape in one article]: https://www.leiphone.com/news/201709/uuJFzAxdoBY7bzEL.html
 CPUs, GPUs, and Now AI Chips: http://www.electronicdesign.com/industrial/cpus-gpus-and-now-ai-chips
 Quantifying the performance of the TPU, our first machine learning chip: https://cloudplatform.googleblog.com/2017/04/quantifying-the-performance-of-the-TPU-our-first-machine-learning-chip.html
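 深度学习的三种硬件方案：ASIC，FPGA，GPU；你更看好？ [Three hardware options for deep learning: ASIC, FPGA, GPU; which do you favor?]: http://www.sohu.com/a/123176776_463982
 斯坦福博士韩松毕业论文：面向深度学习的高效方法与硬件 [Stanford PhD Song Han's dissertation: Efficient Methods and Hardware for Deep Learning]: https://zhuanlan.zhihu.com/p/30211134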
 A brief guide to mobile AI chips: https://www.theverge.com/2017/10/19/16502538/mobile-ai-chips-apple-google-huawei-qualcomm
 Huawei unveils Kirin 970 chipset with AI: http://www.zdnet.com/article/huawei-unveils-kirin-970-chipset-with-ai/
 想成为AI领域的英特尔，地平线发布两款终端视觉芯片 [Aiming to become the Intel of AI, Horizon Robotics releases two edge vision chips]: https://www.jiqizhixin.com/articles/2017-12-21-4
 NVIDIA launched Volta GPU computing architecture to bring speed in AI inference and training, as well as for accelerating HPC and graphics workloads.: https://nvidianews.nvidia.com/news/nvidia-launches-revolutionary-volta-gpu-platform-fueling-next-era-of-ai-and-high-performance-computing
 Google introduced its TPU (Tensor Processing Units) that accelerates the TensorFlow framework in machine learning.: https://cloud.google.com/tpu/
 IBM and U.S. AFRL announced the collaboration on a brain-inspired supercomputing system.: https://www-03.ibm.com/press/us/en/pressrelease/52657.wss
 Microsoft is working on AI chips across its different devices, top exec says: https://www.cnbc.com/2017/11/01/microsoft-working-on-ai-chips-across-different-devices-top-exec-says.html
 Huawei launched Kirin 970 – the new flagship SoC with AI capabilities: https://www.androidauthority.com/huawei-announces-kirin-970-797788/
 Intel is buying Movidius, a startup that makes vision chips for drones and virtual reality: https://www.recode.net/2016/9/6/12810246/intel-buying-movidius
 Report: Amazon working on its own AI chips for Echo devices: https://mashable.com/2018/02/12/amazon-echo-ai-chip/#VFOj98mEfqqy
 CB Insights: www.cbinsights.com
 Crunchbase: www.crunchbase.com
 Hupogu: http://www.hupogu.com
Analyst: Victor Lu | Editor: Michael Sarazen