A research team from Facebook AI has proposed Unified Transformer (UniT), an encoder-decoder model that is jointly trained on multiple tasks across different modalities and achieves strong performance on seven tasks with a single, shared set of model parameters.
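For intuition, here is a minimal PyTorch sketch of the shared encoder-decoder idea: one transformer trunk serves every task, with only small task-specific query embeddings and output heads. The `MiniUniT` class, its toy tasks, and all dimensions are illustrative assumptions, not UniT's actual configuration (which, among other things, uses modality-specific encoders).

```python
import torch
import torch.nn as nn

class MiniUniT(nn.Module):
    """Toy shared encoder-decoder with per-task output heads.

    Illustrative sketch only: the real UniT has modality-specific
    encoders and more elaborate task heads.
    """
    def __init__(self, task_dims, d_model=256, n_heads=8, n_layers=2):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            num_layers=n_layers)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True),
            num_layers=n_layers)
        # One learned query set and one linear head per task.
        self.task_queries = nn.ParameterDict(
            {t: nn.Parameter(torch.randn(1, 4, d_model)) for t in task_dims})
        self.task_heads = nn.ModuleDict(
            {t: nn.Linear(d_model, d) for t, d in task_dims.items()})

    def forward(self, features, task):
        memory = self.encoder(features)              # shared encoding
        queries = self.task_queries[task].expand(features.size(0), -1, -1)
        decoded = self.decoder(queries, memory)      # shared decoding
        return self.task_heads[task](decoded)        # task-specific head

# Hypothetical tasks and dimensions, for illustration only.
model = MiniUniT(task_dims={"vqa": 10, "detection": 91})
out = model(torch.randn(2, 16, 256), task="vqa")     # -> (2, 4, 10)
```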
Researchers from UC Berkeley, Facebook AI Research and New York University have introduced the MSA (Multiple Sequence Alignment) Transformer, which surpasses current state-of-the-art unsupervised protein structure learning methods by a wide margin.
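The model's key ingredient is interleaved attention over the two axes of an alignment: along each sequence (rows) and across sequences at each aligned position (columns). Below is a minimal sketch of that pattern; `MSAAttentionBlock` is a hypothetical name, and the real model additionally ties row-attention maps across sequences and interleaves feed-forward sublayers.

```python
import torch
import torch.nn as nn

class MSAAttentionBlock(nn.Module):
    """Toy row-then-column attention over an MSA tensor.

    Input x: (batch, rows, cols, d), where rows are aligned sequences.
    Illustrative sketch; not the paper's exact tied-attention layer.
    """
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        b, r, c, d = x.shape
        # Row attention: attend across alignment columns within each sequence.
        rows = x.reshape(b * r, c, d)
        rows, _ = self.row_attn(rows, rows, rows)
        x = x + rows.reshape(b, r, c, d)
        # Column attention: attend across sequences at each aligned position.
        cols = x.transpose(1, 2).reshape(b * c, r, d)
        cols, _ = self.col_attn(cols, cols, cols)
        x = x + cols.reshape(b, c, r, d).transpose(1, 2)
        return x

block = MSAAttentionBlock()
msa = torch.randn(1, 8, 32, 64)   # 8 aligned sequences of length 32
print(block(msa).shape)           # torch.Size([1, 8, 32, 64])
```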
Google Brain’s Switch Transformer language model packs a whopping 1.6 trillion parameters while keeping computational cost in check by activating only one expert feed-forward network per token. The model achieved a 4x pretraining speedup over a strongly tuned T5-XXL baseline.
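That top-1 mixture-of-experts routing is what decouples parameter count from per-token compute: a learned router sends each token to a single expert, so adding experts adds parameters but not FLOPs. Here is a minimal sketch under that assumption; `SwitchFFN` is a hypothetical name, and the paper's auxiliary load-balancing loss and expert-capacity logic are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchFFN(nn.Module):
    """Toy top-1 mixture-of-experts layer in the spirit of Switch routing."""
    def __init__(self, d_model=64, d_ff=256, num_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts))

    def forward(self, x):                     # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        gate, idx = probs.max(dim=-1)         # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                # Scale by the gate so the router stays differentiable.
                out[mask] = gate[mask, None] * expert(x[mask])
        return out

layer = SwitchFFN()
print(layer(torch.randn(10, 64)).shape)       # torch.Size([10, 64])
```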
A team from Google, the University of Cambridge, DeepMind and the Alan Turing Institute has proposed a new type of Transformer dubbed Performer, built on a Fast Attention Via positive Orthogonal Random features (FAVOR+) mechanism that approximates full softmax attention in linear time and space.
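FAVOR+ avoids forming the quadratic attention matrix by mapping queries and keys through a positive random feature map whose inner products approximate the softmax kernel. A minimal sketch of the idea, assuming plain Gaussian features (the paper additionally orthogonalizes them) and omitting numerical-stability details:

```python
import torch

def favor_plus_attention(q, k, v, m=256):
    """Linear-time attention with positive random features (FAVOR+ style).

    Approximates softmax(q k^T / sqrt(d)) v without the full (L x L)
    attention matrix. q, k, v: (L, d) tensors; m random features.
    """
    d = q.shape[-1]
    q, k = q / d ** 0.25, k / d ** 0.25           # fold in 1/sqrt(d) scaling
    w = torch.randn(m, d)                          # random projection matrix
    def phi(x):                                    # positive feature map
        return torch.exp(x @ w.T - (x ** 2).sum(-1, keepdim=True) / 2) / m ** 0.5
    q_p, k_p = phi(q), phi(k)                      # (L, m)
    kv = k_p.T @ v                                 # (m, d): linear in L
    z = q_p @ k_p.sum(0)                           # normalizer per query
    return (q_p @ kv) / z[:, None]

q, k, v = (torch.randn(128, 32) for _ in range(3))
approx = favor_plus_attention(q, k, v)
exact = torch.softmax(q @ k.T / 32 ** 0.5, dim=-1) @ v
print((approx - exact).abs().mean())               # shrinks as m grows
```

The feature map works because for Gaussian w, exp(w·q - |q|²/2) * exp(w·k - |k|²/2) has expectation exp(q·k), the softmax kernel, so averaging over m random projections gives an unbiased estimate.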