
TinySpeech: Novel Attention Condensers Enable Deep Recognition Networks on Edge Devices

Novel attention condensers designed to enable the building of low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge.

Advances in natural language processing (NLP) driven by the BERT language model and other transformer models have produced state-of-the-art performance in tasks such as speech recognition and have powered a range of applications, including voice assistants and real-time closed captioning. The widespread deployment of deep neural networks for on-device speech recognition, however, remains a challenge, particularly on edge devices such as mobile phones.

In a new paper, researchers from the University of Waterloo and DarwinAI propose novel attention condensers designed to enable the building of low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge. The team demonstrates low-precision “TinySpeech” deep neural networks comprising such attention condensers and tailored specifically for limited-vocabulary speech recognition.


While there has been increased research focus on efficient network design in recent years, there are still limitations to what can be achieved using existing deep convolutional neural network design patterns, says DarwinAI Chief Research Scientist and the paper’s first author Alexander Wong. Motivated to move beyond these limitations, the researchers developed attention condensers as a new building block for such networks.

Attention condensers are stand-alone architectures that allow a deep learning model to focus on what’s important, facilitating more effective and trustworthy decisions, DarwinAI CEO Sheldon Fernandez told Synced. The proposed attention condensers are self-attention mechanisms that learn and produce a condensed embedding characterizing joint local and cross-channel activation relationships and perform selective attention accordingly.

Unlike self-attention mechanisms designed for deep convolutional neural networks that depend heavily on existing convolution modules, these attention condensers act as self-contained, stand-alone modules enabling more efficient deep neural networks.

Each attention condenser consists of a condenser layer C, which condenses input activations V to a lower dimension; an embedding structure E, for characterizing cross-dimensional activation relationships; an expansion layer X, which expands the embedding to a higher dimension; a scale S for controlling the contribution of self-attention; and a selective attention mechanism F for imposing selective attention on V. By learning such embeddings with significantly reduced dimensions, the attention condensers strike a balance between modelling efficiency and computational efficiency.

Architecture of an attention condenser
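
To make the data flow through C, E, X, S and F more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the choice of max pooling for the condenser layer, a per-position channel-mixing matrix for the embedding structure, nearest-neighbour upsampling with a sigmoid for the expansion layer, and an element-wise product V * A * S for the selective attention step are all simplifications picked for readability. The function names, the `weights` matrix and the `factor` parameter are hypothetical.

```python
import math


def condense(v, factor=2):
    """C: reduce the time dimension by max pooling (assumed operator)."""
    channels = len(v[0])
    return [
        [max(row[c] for row in v[t:t + factor]) for c in range(channels)]
        for t in range(0, len(v), factor)
    ]


def embed(q, weights):
    """E: mix channels at each condensed position.

    `weights` is a hypothetical channels x channels matrix standing in
    for the learned embedding structure.
    """
    return [
        [sum(w * x for w, x in zip(w_row, row)) for w_row in weights]
        for row in q
    ]


def expand(k, factor, length):
    """X: upsample back to the input length and squash to (0, 1).

    Nearest-neighbour upsampling and the sigmoid are assumptions.
    """
    return [
        [1.0 / (1.0 + math.exp(-x)) for x in k[min(t // factor, len(k) - 1)]]
        for t in range(length)
    ]


def attention_condenser(v, weights, scale=1.0, factor=2):
    """F: apply selective attention to V as V * A * S (assumed form)."""
    q = condense(v, factor)          # condensed activations
    k = embed(q, weights)            # low-dimensional embedding
    a = expand(k, factor, len(v))    # attention values, same shape as V
    return [
        [x * att * scale for x, att in zip(v_row, a_row)]
        for v_row, a_row in zip(v, a)
    ]


# V has 4 time steps and 2 channels.
v = [[1.0, 2.0], [3.0, 0.0], [0.5, 0.5], [0.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out = attention_condenser(v, identity, scale=1.0, factor=2)
```

The output has the same shape as the input, while the embedding is computed on only half as many positions. That is the trade-off the paragraph above describes: attention is learned in a reduced space and then expanded back to modulate the full-resolution activations.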


The researchers leveraged their attention condensers within a machine-driven design exploration strategy to create what they call “TinySpeech networks.” In experiments using the Google Speech Commands benchmark dataset for limited-vocabulary speech recognition, these TinySpeech networks demonstrated significantly lower architectural and computational complexity compared to previously proposed deep neural networks designed for limited-vocabulary speech recognition.

TinySpeech architectures for limited-vocabulary speech recognition


“The key takeaways from this research is that not only can self-attention be leveraged to significantly improve the accuracy of deep neural networks, it can also have great ramifications for greatly improving efficiency and robustness of deep neural networks,” Wong told Synced.

Given the promising early results, the researchers say they will explore using attention condensers to build highly efficient deep neural networks for other NLP tasks and for areas such as visual perception and drug discovery, as well as to power various ultra-low-power TinyML technologies.

The paper TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices is on arXiv.


Reporter: Yuan Yuan | Editor: Michael Sarazen


