
Google’s Mu2SLAM: Toward a Single Model For All Speech and Text Understanding Tasks

In the new paper Mu2SLAM: Multitask, Multilingual Speech and Language Models, a Google Research team presents Mu2SLAM, a multilingual sequence-to-sequence pretraining method for speech and text models that covers arbitrary tasks in over 100 languages.

Allen AI & UW Propose Unified-IO: A High-Performance, Task-Agnostic Model for CV, NLP, and Multi-Modal Tasks

In the new paper Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks, a research team from the Allen Institute for AI and the University of Washington introduces Unified-IO, a neural model that achieves strong performance across a wide variety of vision, language, and multi-modal tasks without task- or modality-specific branches or fine-tuning.