AI Machine Learning & Data Science Research

BERT-Style Pretraining on Convnets? Peking U, ByteDance & Oxford U’s Sparse Masked Modelling With Hierarchy Leads the Way

In the new paper Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling, a research team from Peking University, ByteDance, and the University of Oxford presents Sparse Masked Modelling with Hierarchy (SparK), the first BERT-style pretraining approach that can be used on convolutional models without any backbone modifications.

AI Machine Learning & Data Science Research

CMU & Google Extend Pretrained Models to Thousands of Underrepresented Languages Without Using Monolingual Data

A research team from Carnegie Mellon University and Google systematically explores strategies for leveraging the relatively under-studied resource of bilingual lexicons to adapt pretrained multilingual models to low-resource languages. Their resulting Lexicon-based Adaptation approach produces consistent performance improvements without requiring additional monolingual text.

AI Machine Learning & Data Science Nature Language Tech Research

Google, NYU & Maryland U’s Token-Dropping Approach Reduces BERT Pretraining Time by 25%

In the new paper Token Dropping for Efficient BERT Pretraining, a research team from Google, New York University, and the University of Maryland proposes a simple but effective “token dropping” technique that significantly reduces the pretraining cost of transformer models such as BERT without hurting performance on downstream fine-tuning tasks.

AI Machine Learning & Data Science Research

Is BERT the Future of Image Pretraining? ByteDance Team’s BERT-like Pretrained Vision Transformer iBOT Achieves New SOTAs

A research team from ByteDance, Johns Hopkins University, Shanghai Jiao Tong University and UC Santa Cruz seeks to apply the proven technique of masked language modelling to the training of better vision transformers, presenting iBOT (image BERT pretraining with Online Tokenizer), a self-supervised framework that performs masked prediction with an online tokenizer.

AI Machine Learning & Data Science Research Share My Research

UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus

UmlsBERT is a deep Transformer network architecture that incorporates clinical domain knowledge from a clinical Metathesaurus in order to build ‘semantically enriched’ contextual representations that will benefit from both the contextual learning and domain knowledge.