Google’s Transformer-Based LongT5 Achieves Performance Gains by Scaling Both Input Length and Model Size
A Google Research team explores the effects of simultaneously scaling both input length and model size with LongT5, a novel transformer architecture that achieves state-of-the-art performance on long-sequence tasks.
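To make the idea concrete, here is a minimal sketch of feeding a long input to LongT5, assuming the Hugging Face `transformers` port of the model and the publicly released `google/long-t5-tglobal-base` checkpoint (neither is described in this article); the `max_length=4096` input window and the summarization use case are illustrative choices, not the paper's exact setup.

```python
# Minimal sketch: long-input summarization with LongT5 via Hugging Face transformers.
# Assumes transformers >= 4.20 and the google/long-t5-tglobal-base checkpoint.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")

long_document = "..."  # a document far longer than a standard T5 512-token input

# LongT5's sparse attention lets the encoder take much longer inputs;
# 4096 tokens here is an illustrative length, not a hard architectural limit.
inputs = tokenizer(long_document, return_tensors="pt",
                   truncation=True, max_length=4096)

summary_ids = model.generate(inputs.input_ids, max_length=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Scaling the other axis the article mentions, model size, would amount to swapping in a larger checkpoint (e.g. the `large` or `xl` variants of the same model family) without changing the calling code.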