long sequence processing

by Synced 2024-04-27 4

CMU & Meta’s TriForce: Turbocharging Long Sequence Generation with 2.31× Speed Boost on A100 GPU

In a new paper TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding, a research team from CMU and Meta introduces TriForce—a hierarchical speculative decoding system tailored for scalable long sequence generation, reaching up to 2.31× on an A100 GPU.

by Synced 2022-05-10 4

AI Computer Vision & Graphics Machine Learning & Data Science Research

LSTM Is Back! A Deep Implementation of the Decades-old Architecture Challenges ViTs on Long Sequence Modelling

A research team from Rikkyo University and AnyTech Co., Ltd. examines the suitability of different inductive biases for computer vision and proposes Sequencer, an architectural alternative to ViTs that leverages long short-term memory (LSTM) rather than self-attention layers to achieve ViT-competitive performance on long sequence modelling.