A new paper from Julia Computing Co-Founder and CTO Keno Fischer and Senior Research Engineer Elliot Saba introduces a method and implementation for offloading sections of machine learning models written in the Julia programming language to TPUs.
Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. Google Cloud TPUs are a cutting-edge hardware architecture for training today’s computationally demanding deep learning and machine learning models. Google first made a Cloud TPU machine learning accelerator available to the public in 2017. Fischer and Saba’s method works by leveraging the lower-level XLA (Accelerated Linear Algebra) compiler that Google released in August 2018.
Mapping Julia Semantics to XLA
To offload Julia code to a TPU, the code must first be compiled to XLA code. Julia’s standard compiler already has to bridge the gap between the dynamic semantics of the language and the static semantics of the LLVM (Low Level Virtual Machine) representation it normally targets. If Julia code can instead be converted to XLA’s “High Level Optimizer” (HLO) input language, then Julia programs can run on TPUs.
Julia programs are written in terms of functions and abstractions provided by Julia’s base library, and the language uses multiple dispatch, which makes it possible to express these operations in terms of HLO operations. The paper gives a few examples of this.
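To make the dispatch idea concrete, here is a minimal sketch rather than the paper's code. The names HloOp, TracedArray, param, affine and render are hypothetical stand-ins invented for illustration, and the "HLO operations" are only recorded as symbols instead of being sent to XLA. The point is that ordinary generic Julia code, once called with a special array type, is routed by multiple dispatch to methods that emit HLO-style operations.

```julia
# Hypothetical stand-ins for illustration; these are not the paper's types.
struct HloOp
    name::Symbol        # operation name, e.g. :add, :dot, :transpose
    args::Vector{Any}   # operands (other HloOp nodes) or a parameter name
end

# An array that records HLO-style operations instead of holding values.
struct TracedArray{N}
    shape::NTuple{N,Int}
    op::HloOp
end

# Leaf node standing for an input of the compiled program.
param(name::Symbol, shape::Int...) = TracedArray(shape, HloOp(:parameter, Any[name]))

# Elementwise addition of two arrays with the same number of dimensions.
function Base.:+(a::TracedArray{N}, b::TracedArray{N}) where {N}
    a.shape == b.shape || throw(DimensionMismatch("shapes differ"))
    return TracedArray(a.shape, HloOp(:add, Any[a.op, b.op]))
end

# Matrix-matrix product.
function Base.:*(a::TracedArray{2}, b::TracedArray{2})
    a.shape[2] == b.shape[1] || throw(DimensionMismatch("inner dimensions differ"))
    return TracedArray((a.shape[1], b.shape[2]), HloOp(:dot, Any[a.op, b.op]))
end

# Matrix-vector product: a separate method chosen from the argument types.
function Base.:*(a::TracedArray{2}, b::TracedArray{1})
    a.shape[2] == b.shape[1] || throw(DimensionMismatch("inner dimensions differ"))
    return TracedArray((a.shape[1],), HloOp(:dot, Any[a.op, b.op]))
end

# Matrix transpose swaps the two dimensions.
Base.transpose(a::TracedArray{2}) =
    TracedArray(reverse(a.shape), HloOp(:transpose, Any[a.op]))

# Render a recorded operation tree as text.
function render(op::HloOp)
    if op.name === :parameter
        return string(op.args[1])
    end
    return string(op.name, "(", join(map(render, op.args), ", "), ")")
end

# Ordinary generic Julia code; it knows nothing about HLO.
affine(W, x, b) = W * x + b

W = param(:W, 4, 3)
x = param(:x, 3)
b = param(:b, 4)
y = affine(W, x, b)

println(render(y.op))   # prints add(dot(W, x), b)
```

The same affine function still works on ordinary arrays, for example affine([1 2; 3 4], [1, 1], [0, 1]). Only the argument types decide whether values are computed or operations are recorded.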
The paper also provides implementations of the higher-level array abstractions, in particular mapreduce and broadcast. The broadcast implementation is around 20 lines of code and is omitted for space, while the mapreduce implementation is short enough for the paper to show in full.
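As a rough illustration of why mapreduce can be compact, the sketch below records it as a map followed by a reduce. It is an assumption-laden toy and not the paper's implementation: HloOp and TracedArray are hypothetical types, the :map and :reduce nodes are symbolic placeholders, and nothing is sent to XLA.

```julia
# Hypothetical stand-ins for illustration; these are not the paper's types.
struct HloOp
    name::Symbol        # operation name, e.g. :map, :reduce
    args::Vector{Any}   # operands, functions, or a parameter name
end

# An array that records HLO-style operations instead of holding values.
struct TracedArray{N}
    shape::NTuple{N,Int}
    op::HloOp
end

# mapreduce recorded as two steps: apply f elementwise, then reduce with op.
# By default every dimension is reduced.
function Base.mapreduce(f, op, A::TracedArray;
                        dims = ntuple(identity, length(A.shape)))
    mapped = HloOp(:map, Any[f, A.op])
    # In this sketch the reduced dimensions are dropped from the result shape.
    # Base.mapreduce on an ordinary Array keeps them with size 1 instead.
    kept = Tuple(s for (i, s) in enumerate(A.shape) if !(i in dims))
    return TracedArray{length(kept)}(kept, HloOp(:reduce, Any[op, mapped, dims]))
end

A = TracedArray((4, 3), HloOp(:parameter, Any[:A]))

total = mapreduce(abs2, +, A)           # reduce over all dimensions
cols  = mapreduce(abs2, +, A; dims = 1) # reduce over the first dimension only

println(total.shape)   # prints ()
println(cols.shape)    # prints (3,)
```

For comparison, on an ordinary array mapreduce(abs2, +, [1 2; 3 4]) computes the sum of squares, 30. The traced version records the same computation symbolically so that a compiler backend could lower it later.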
Evaluation on TPUs
To demonstrate that the Julia compiler is able to work on TPUs with no major issues, the paper includes examples such as the forward pass and the backward pass of VGG19 (the Visual Geometry Group's 19-layer convolutional neural network architecture). The paper reports results for these examples along with explanatory notes.
The new method has been welcomed by ML researchers and garnered praise from Google AI Lead Jeff Dean, who tweeted “Julia + TPUs = fast and easily expressible ML computations!”
The paper Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs is on arXiv.
Author: Robert Tian | Editor: Michael Sarazen