Meta AI’s MegaByte Scalable Architecture for Long Sequence Modelling Outperforms Existing Byte-Level Models
In the new paper MegaByte: Predicting Million-Byte Sequences with Multiscale Transformers, a Meta AI research team presents MegaByte, a multiscale decoder architecture that segments long byte sequences into fixed-size patches and combines a large global model over patches with a small local model within each patch, enabling end-to-end modelling of sequences exceeding one million bytes.
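The patch-based decomposition at the heart of MegaByte can be illustrated with a minimal sketch. This is not the paper's implementation: the patch size, embedding widths, and the simple causal-averaging "global model" below are illustrative stand-ins for the actual transformer components.

```python
import numpy as np

PATCH_SIZE = 8   # P: bytes per patch (illustrative choice)
D_GLOBAL = 16    # toy embedding width, chosen for illustration
rng = np.random.default_rng(0)

def patchify(seq: bytes, patch_size: int = PATCH_SIZE) -> np.ndarray:
    """Pad the byte sequence and split it into fixed-size patches."""
    pad = (-len(seq)) % patch_size
    arr = np.frombuffer(seq + b"\x00" * pad, dtype=np.uint8)
    return arr.reshape(-1, patch_size)

# Toy "global model": mixes information across patch embeddings.
W_embed = rng.standard_normal((256, D_GLOBAL)) * 0.02
W_mix = rng.standard_normal((D_GLOBAL, D_GLOBAL)) * 0.02

def global_context(patches: np.ndarray) -> np.ndarray:
    """One context vector per patch, attending only to earlier patches."""
    # One embedding per patch = sum of its byte embeddings.
    patch_emb = W_embed[patches].sum(axis=1)            # (K, D_GLOBAL)
    # Causal averaging as a stand-in for masked self-attention.
    causal = np.tril(np.ones((len(patch_emb),) * 2))
    causal /= causal.sum(axis=1, keepdims=True)
    return (causal @ patch_emb) @ W_mix                 # (K, D_GLOBAL)

seq = b"MegaByte models million-byte sequences."
patches = patchify(seq)
ctx = global_context(patches)
# In the full architecture, a small local model would next predict the
# P bytes inside each patch, conditioned on that patch's context vector.
print(patches.shape, ctx.shape)
```

Because the global model runs over K = N / P patches rather than N bytes, the quadratic self-attention cost drops by a factor of P squared, which is what makes million-byte contexts tractable.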