Google & Columbia U’s Mnemosyne: Learning to Train Transformers With Transformers
In the new paper Mnemosyne: Learning to Train Transformers with Transformers, a research team from Google and Columbia University presents the Mnemosyne optimizer, a learning-to-learn system that can train entire neural network architectures without any task-specific optimizer tuning.