Multilingual Language Model

by Synced 2024-12-17

From Token to Conceptual: Meta introduces Large Concept Models in Multilingual AI

A research team at Meta introduces the Large Concept Model (LCM), a novel architecture that processes input at a higher semantic level. This shift allows the LCM to achieve remarkable zero-shot generalization across languages, outperforming existing LLMs of comparable size.

by Synced 2024-04-08

AI Machine Learning & Data Science Research

AURORA-M: A Global Symphony of Innovation as 33 Prestigious Institutions Unify for Open-Source Multilingual Mastery

A collaborative effort involving researchers from 33 institutions presents AURORA-M, an open-source model that not only excels in multilingual understanding and coding tasks but also underscores the collaborative ethos of the open-source community, promoting transparency and accessibility in AI development.

by Synced 2023-05-25

AI Machine Learning & Data Science Research

Meta AI’s Massively Multilingual Speech Project Scales Speech Technology to 1000+ Languages

In the new paper Scaling Speech Technology to 1,000+ Languages, a Meta AI research team launches the company’s Massively Multilingual Speech (MMS) project, which aims to dramatically expand access to speech technologies.

by Synced 2022-12-22

AI Machine Learning & Data Science Research

Google’s Mu2SLAM: Toward a Single Model For All Speech and Text Understanding Tasks

In the new paper Mu2SLAM: Multitask, Multilingual Speech and Language Models, a Google Research team presents Mu2SLAM, a multilingual sequence-to-sequence pretraining method for speech and text models that covers arbitrary tasks in over 100 languages.

by Synced 2022-03-30

AI Machine Learning & Data Science Research

CMU & Google Extend Pretrained Models to Thousands of Underrepresented Languages Without Using Monolingual Data

A research team from Carnegie Mellon University and Google systematically explores strategies for leveraging the relatively under-studied resource of bilingual lexicons to adapt pretrained multilingual models to low-resource languages. Their resulting Lexicon-based Adaptation approach produces consistent performance improvements without requiring additional monolingual text.
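One way a bilingual lexicon can stand in for missing monolingual text is to generate pseudo target-language text by word-for-word substitution. The sketch below illustrates that idea only; the toy lexicon, the function name `synthesize_with_lexicon`, and the choice to copy out-of-lexicon words unchanged are assumptions made for illustration, not details confirmed by this summary.

```python
def synthesize_with_lexicon(sentence, lexicon):
    """Translate a source sentence word by word using a bilingual lexicon.

    Words missing from the lexicon are copied unchanged (an assumption).
    """
    out = []
    for word in sentence.split():
        # Look up the lowercased word; fall back to the original token.
        out.append(lexicon.get(word.lower(), word))
    return " ".join(out)


# Hypothetical toy English -> Spanish lexicon, for illustration only.
toy_lexicon = {"the": "el", "cat": "gato", "sleeps": "duerme"}

pseudo_text = synthesize_with_lexicon("The cat sleeps soundly", toy_lexicon)
print(pseudo_text)
```

Text produced this way is noisy, since it ignores word order and morphology, but it needs nothing beyond the lexicon and existing source-language data.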

by Synced 2021-08-25

AI Machine Learning & Data Science Natural Language Tech Research

Apple Neural TTS System Study: Combining Speakers of Multiple Languages to Improve Synthetic Voice Quality

An Apple research team explores multiple architectures and training procedures to develop a novel multi-speaker and multi-lingual neural TTS system. The study combines speech from 30 speakers from 15 locales in 8 languages, and demonstrates that for the vast majority of voices, such multi-lingual and multi-speaker models can yield better quality than single speaker models.

by Synced 2021-06-14

AI Machine Learning & Data Science Natural Language Tech Research

Google Researchers Merge Pretrained Teacher LMs Into a Single Multilingual Student LM Via Knowledge Distillation

A Google Research team proposes MergeDistill, a framework for merging multiple pretrained monolingual and multilingual teacher LMs into a single multilingual, task-agnostic student LM. The student leverages the capabilities of powerful language-specific LMs while remaining multilingual and enabling positive language transfer.
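For readers unfamiliar with knowledge distillation, the sketch below shows a generic single-teacher distillation loss: the KL divergence between temperature-softened teacher and student distributions. It is a minimal illustration of the general technique, not MergeDistill itself; the function names, the temperature value, and the example logits are assumptions, and anything specific to merging several teachers is not shown.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 0.5, -1.0]
loss_same = distillation_loss(teacher, teacher)          # student matches teacher
loss_diff = distillation_loss([0.0, 0.0, 0.0], teacher)  # uniform student
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what pushes the student toward the teacher's behavior during training.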

by Synced 2021-04-22

AI Natural Language Tech Research

Are Multilingual Language Models Fragile? IBM Adversarial Attack Strategies Cut MBERT QA Performance by 85%

An IBM research team proposes four multilingual adversarial attack strategies and attacks seven languages in a zero-shot setting on large multilingual pretrained language models (e.g. MBERT), reducing average performance by up to 85.6 percent.