Tag: multi-layer perceptron

AI Machine Learning & Data Science Research

Meta AI’s Sparse All-MLP Model Doubles Training Efficiency Compared to Transformers

Researchers from Meta AI and the State University of New York at Buffalo propose sMLP, a sparsely-activated all-MLP architecture that improves training efficiency by up to 2x compared to transformer-based mixture-of-experts (MoE) models, dense transformers, and gMLP.
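
To make "sparsely-activated" concrete, here is a minimal, illustrative sketch of the general idea: each token is routed by a learned top-1 gate to one of several expert MLPs, so only a fraction of the model's parameters is active per token. This is not Meta AI's implementation; class names such as `SparseMLPBlock` and parameters such as `num_experts` are hypothetical choices for illustration.

```python
# Minimal sketch of a sparsely-activated MLP layer with top-1 gating.
# Assumptions: PyTorch, a standard two-layer feed-forward expert, and
# softmax gating; the actual sMLP design in the paper differs in detail.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExpertMLP(nn.Module):
    """A standard two-layer feed-forward expert."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(F.gelu(self.fc1(x)))


class SparseMLPBlock(nn.Module):
    """Routes each token to a single expert MLP (top-1 gating)."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            ExpertMLP(d_model, d_hidden) for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten tokens for routing.
        batch, seq_len, d_model = x.shape
        tokens = x.reshape(-1, d_model)
        gate_logits = self.gate(tokens)             # (tokens, experts)
        gate_probs = F.softmax(gate_logits, dim=-1)
        expert_idx = gate_probs.argmax(dim=-1)      # top-1 expert per token
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale by the gate probability so routing stays differentiable.
                out[mask] = expert(tokens[mask]) * gate_probs[mask, i : i + 1]
        return out.reshape(batch, seq_len, d_model)


# Quick smoke test on random data.
block = SparseMLPBlock(d_model=64, d_hidden=256, num_experts=4)
y = block(torch.randn(2, 16, 64))
print(y.shape)  # torch.Size([2, 16, 64])
```

Because each token activates only one expert, total parameter count can grow with the number of experts while per-token compute stays roughly constant, which is the source of the training-efficiency gains that sparse models report over their dense counterparts.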