Tag: Knowledge Distillation

AI, Machine Learning & Data Science, Research

Does Knowledge Distillation Really Work? NYU & Google Study Provides Insights on Student Model Fidelity

A research team from New York University and Google Research explores whether knowledge distillation really works, showing that a surprisingly large discrepancy often remains between the predictive distributions of the teacher and student models, even when the student has the capacity to perfectly match the teacher.
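As a rough illustration only (not the paper's code), the sketch below shows one way to quantify distillation fidelity: compare a student's predictive distribution with its teacher's on held-out data via top-1 agreement and average KL divergence. The `teacher`, `student`, and `loader` objects are assumed PyTorch placeholders.

```python
# Illustrative sketch (not the study's code): measuring how closely a distilled
# student's predictive distribution matches its teacher's on held-out data.
import torch
import torch.nn.functional as F

def distillation_fidelity(teacher, student, loader, device="cpu"):
    """Return (top-1 agreement, mean KL(teacher || student)) over a dataset."""
    teacher.eval()
    student.eval()
    agree, total, kl_sum = 0, 0, 0.0
    with torch.no_grad():
        for x, _ in loader:                      # labels unused: fidelity is teacher-vs-student
            x = x.to(device)
            p_t = F.softmax(teacher(x), dim=-1)  # teacher predictive distribution
            log_p_s = F.log_softmax(student(x), dim=-1)
            agree += (p_t.argmax(-1) == log_p_s.argmax(-1)).sum().item()
            total += x.size(0)
            # KL(teacher || student), averaged per example within the batch
            kl_sum += F.kl_div(log_p_s, p_t, reduction="batchmean").item() * x.size(0)
    return agree / total, kl_sum / total
```

High agreement and low KL would indicate a high-fidelity student; the study's point is that these gaps often stay surprisingly large in practice.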

AI, Machine Learning & Data Science, Nature Language Tech, Research

Google Researchers Merge Pretrained Teacher LMs Into a Single Multilingual Student LM Via Knowledge Distillation

A Google Research team proposes MergeDistill, a framework that merges multiple pretrained monolingual and multilingual teacher LMs into a single multilingual, task-agnostic student LM via knowledge distillation, leveraging the capabilities of powerful language-specific LMs while remaining multilingual and enabling positive language transfer.
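For intuition only (this is not the MergeDistill implementation), the sketch below shows a generic multi-teacher distillation step in PyTorch: each batch is routed to the teacher LM covering its language, and the student is trained to reproduce that teacher's temperature-scaled output distribution. All names (`student`, `teachers`, `batch`, `temperature`) are hypothetical placeholders.

```python
# Illustrative sketch (not MergeDistill itself): distilling several language-specific
# teacher LMs into one multilingual student LM with a soft-target KL loss.
import torch
import torch.nn.functional as F

def multi_teacher_distill_step(student, teachers, batch, optimizer, temperature=2.0):
    """One training step. `teachers` maps a language code to a frozen teacher LM;
    `batch` is (input_ids, lang) with inputs already in the student's vocabulary."""
    input_ids, lang = batch
    teacher = teachers[lang]
    with torch.no_grad():
        t_logits = teacher(input_ids)            # assumed shape: [batch, seq, vocab]
    s_logits = student(input_ids)
    # Soft-target distillation: KL between temperature-scaled distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across temperatures.
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Routing each batch to the teacher that covers its language is one simple way to combine language-specific teachers; the actual framework additionally handles details such as vocabulary differences between teachers and student, which this sketch omits.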