Tag: Attention Mechanisms

AI | Machine Learning & Data Science | Popular | Research

Tsinghua U & Microsoft Propose Fastformer: An Additive Attention Based Transformer With Linear Complexity

A team from Tsinghua University and Microsoft Research Asia proposes Fastformer, an efficient Transformer variant based on additive attention that achieves effective context modelling with linear complexity.
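Fastformer's additive attention summarizes all queries into a single global vector instead of computing pairwise query-key scores, which is what brings the cost down from quadratic to linear. The sketch below is a minimal NumPy illustration in that spirit; the weight vectors `w_q` and `w_k` and the function name are assumptions for the example, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(Q, K, V, w_q, w_k):
    """Linear-complexity additive attention, loosely following the
    Fastformer recipe (a sketch, not the paper's code)."""
    n, d = Q.shape
    # Collapse all n queries into one global query vector: O(n*d)
    alpha = softmax(Q @ w_q / np.sqrt(d))   # (n,) attention weights
    q_global = alpha @ Q                    # (d,) global query
    # Mix the global query into each key element-wise
    P = K * q_global                        # (n, d)
    beta = softmax(P @ w_k / np.sqrt(d))
    k_global = beta @ P                     # (d,) global key
    # Mix the global key into each value; total cost stays O(n*d)
    return V * k_global                     # (n, d)

rng = np.random.default_rng(0)
n, d = 6, 4
Q, K, V = rng.normal(size=(3, n, d))
out = additive_attention(Q, K, V, rng.normal(size=d), rng.normal(size=d))
print(out.shape)  # (6, 4)
```

Because every step is a weighted sum or an element-wise product over the sequence, no n-by-n attention matrix is ever materialized.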

AI | Machine Learning & Data Science | Natural Language Tech | Research

Google’s H-Transformer-1D: Fast One-Dimensional Hierarchical Attention With Linear Complexity for Long Sequence Processing

A Google Research team draws inspiration from two numerical analysis methods — Hierarchical Matrix (H-Matrix) and Multigrid — to address the quadratic complexity of attention mechanisms in transformer architectures, proposing a hierarchical attention scheme with linear run-time and memory complexity.
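The H-Matrix intuition is that attention far from the diagonal can be approximated at coarse resolution while nearby tokens are attended to exactly. The NumPy sketch below is a toy two-level illustration of that idea, assumed for the example and not Google's algorithm: each block of queries attends to its own block exactly and to mean-pooled summaries of all blocks. (The full method recurses over roughly log n levels to reach linear cost; this two-level toy does not.)

```python
import numpy as np

def softmax_rows(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def hierarchical_attention(Q, K, V, block=4):
    """Two-level sketch of hierarchical attention: fine (exact) attention
    inside each block, coarse attention to pooled block summaries."""
    n, d = Q.shape
    nb = n // block                         # assumes block divides n
    Kb = K.reshape(nb, block, d)
    Vb = V.reshape(nb, block, d)
    K_coarse = Kb.mean(axis=1)              # (nb, d) pooled keys
    V_coarse = Vb.mean(axis=1)              # (nb, d) pooled values
    out = np.empty_like(Q)
    for b in range(nb):
        q = Q[b * block:(b + 1) * block]    # (block, d)
        # exact keys within the block + coarse summaries of all blocks
        k = np.concatenate([Kb[b], K_coarse])
        v = np.concatenate([Vb[b], V_coarse])
        attn = softmax_rows(q @ k.T / np.sqrt(d))
        out[b * block:(b + 1) * block] = attn @ v
    return out

rng = np.random.default_rng(1)
n, d = 8, 4
Q, K, V = rng.normal(size=(3, n, d))
out = hierarchical_attention(Q, K, V, block=4)
print(out.shape)  # (8, 4)
```

Each query row sees only `block + nb` keys rather than all n, which is the hierarchical low-rank structure the H-Matrix analogy exploits.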