BERT | Synced

by Synced 2023-11-24

ETH Zurich’s UltraFastBERT Realizes 78x Speedup for Language Models

In the new paper Exponentially Faster Language Modelling, an ETH Zurich research team introduces UltraFastBERT, a BERT variant that replaces feedforward layers with fast feedforward networks, achieving a 78x speedup over the optimized baseline feedforward implementation.

by Synced 2023-01-19

AI Machine Learning & Data Science Research

BERT-Style Pretraining on Convnets? Peking U, ByteDance & Oxford U’s Sparse Masked Modelling With Hierarchy Leads the Way

In the new paper Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling, a research team from Peking University, ByteDance, and the University of Oxford presents Sparse Masked Modelling with Hierarchy (SparK), the first BERT-style pretraining approach that can be used on convolutional models without any backbone modifications.

by Synced 2022-03-30

AI Machine Learning & Data Science Research

CMU & Google Extend Pretrained Models to Thousands of Underrepresented Languages Without Using Monolingual Data

A research team from Carnegie Mellon University and Google systematically explores strategies for leveraging the relatively under-studied resource of bilingual lexicons to adapt pretrained multilingual models to low-resource languages. Their resulting Lexicon-based Adaptation approach produces consistent performance improvements without requiring additional monolingual text.

by Synced 2022-03-29

AI Machine Learning & Data Science Nature Language Tech Research

Google, NYU & Maryland U’s Token-Dropping Approach Reduces BERT Pretraining Time by 25%

In the new paper Token Dropping for Efficient BERT Pretraining, a research team from Google, New York University, and the University of Maryland proposes a simple but effective “token dropping” technique that significantly reduces the pretraining cost of transformer models such as BERT without hurting performance on downstream fine-tuning tasks.

by Synced 2021-11-23

AI Machine Learning & Data Science Research

Microsoft’s DeBERTaV3 Uses ELECTRA-Style Pretraining With Gradient-Disentangled Embedding Sharing to Boost DeBERTa Performance on NLU Tasks

Microsoft releases DeBERTaV3, improving the original DeBERTa model using ELECTRA-style pretraining with gradient-disentangled embedding sharing to achieve better pretraining efficiency and a significant performance jump.

by Synced 2021-11-18

AI Machine Learning & Data Science Research

Intel’s Prune Once for All Compression Method Achieves SOTA Compression-to-Accuracy Results on BERT

An Intel research team presents Prune Once for All (Prune OFA), a training method that leverages weight pruning and model distillation to produce pretrained transformer-based language models with high sparsity ratios. Applied to BERT, the approach achieves state-of-the-art results in compression-to-accuracy ratio.

by Synced 2021-11-17

AI Machine Learning & Data Science Research

Is BERT the Future of Image Pretraining? ByteDance Team’s BERT-like Pretrained Vision Transformer iBOT Achieves New SOTAs

A research team from ByteDance, Johns Hopkins University, Shanghai Jiao Tong University and UC Santa Cruz seeks to apply the proven technique of masked language modelling to the training of better vision transformers, presenting iBOT (image BERT pretraining with Online Tokenizer), a self-supervised framework that performs masked prediction with an online tokenizer.

by Synced 2021-08-24

AI Machine Learning & Data Science Nature Language Tech Research

Huawei Proposes Topic-Based Personalized Web Search Ranking, Integrating User Interests and Semantic Matching

A Huawei research team proposes a topic-based personalized ranking model (TPRM) that integrates pretrained contextualized term representations and user profiles constructed by a topic model to tailor the output ranking list.

by Synced 2021-05-14

AI Machine Learning & Data Science Popular Research

Google Replaces BERT Self-Attention with Fourier Transform: 92% Accuracy, 7 Times Faster on GPUs

A research team from Google shows that replacing transformers’ self-attention sublayers with Fourier transforms achieves 92 percent of BERT’s accuracy on the GLUE benchmark while training seven times faster on GPUs and twice as fast on TPUs.

by Synced 2021-05-04

AI Machine Learning & Data Science Research

Huawei & Tsinghua U Method Boosts Task-Agnostic BERT Distillation Efficiency by Reusing Teacher Model Parameters

A research team from Huawei Noah’s Ark Lab and Tsinghua University proposes Extract Then Distill (ETD), a generic and flexible strategy for reusing teacher model parameters for efficient and effective task-agnostic distillation that can be applied to student models of any size.

by Synced 2021-04-28

AI Machine Learning & Data Science Research

Google’s 1.3 MiB On-Device Model Brings High-Performance Disfluency Detection Down to Size

A research team from Google Research proposes small, fast, on-device disfluency detection models based on the BERT architecture. The smallest model is only 1.3 MiB, two orders of magnitude smaller than state-of-the-art BERT-based models, with an eightfold reduction in inference latency.

by Synced 2021-04-22

AI Nature Language Tech Research

Are Multilingual Language Models Fragile? IBM Adversarial Attack Strategies Cut MBERT QA Performance by 85%

An IBM research team proposes four multilingual adversarial attack strategies and attacks seven languages in a zero-shot setting on large multilingual pretrained language models (e.g. MBERT), reducing average performance by up to 85.6 percent.

by Synced 2021-01-29

AI Machine Learning & Data Science Research Share My Research

UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus

UmlsBERT is a deep Transformer network architecture that incorporates clinical domain knowledge from a clinical Metathesaurus in order to build ‘semantically enriched’ contextual representations that will benefit from both the contextual learning and domain knowledge.

by Synced 2020-10-28

Machine Learning & Data Science Nature Language Tech Popular

Amazon’s BERT Optimal Subset: 7.9x Faster & 6.3x Smaller Than BERT

Amazon extracts an optimal subset of architectural parameters for the BERT architecture by applying recent breakthroughs in neural architecture search algorithms.

by Synced 2020-09-01

Machine Learning & Data Science Nature Language Tech

AMBERT: BERT with Multi-Grained Tokenization Achieves SOTA Results on English and Chinese NLU Tasks

AMBERT (A Multigrained BERT) leverages both fine-grained and coarse-grained tokenizations to achieve SOTA performance on English and Chinese language tasks.

by Synced 2020-08-12

AI Hot Machine Learning & Data Science Nature Language Tech

How Smart is BERT? Evaluating the Language Model’s Commonsense Knowledge

Researchers dive deep into the large language model to discover how it encodes the structured commonsense knowledge it leverages on downstream commonsense tasks.

by Synced 2020-07-14

AI Machine Learning & Data Science Nature Language Tech Research

Facebook & CMU Introduce TaBERT for Understanding Tabular Data Queries

TaBERT-powered neural semantic parsers showed performance improvements on the challenging benchmark WikiTableQuestions and demonstrated competitive performance on the text-to-SQL dataset Spider.

by Synced 2020-05-29

AI Machine Learning & Data Science Nature Language Tech Research

DeepMind Says Syntactic Biases ‘Helped BERT Do Better’

Researchers add syntactic biases to determine whether and where they can help BERT achieve better understanding.

by Synced 2020-05-28

AI Machine Learning & Data Science Nature Language Tech Research

Google Introduces BLEURT – a BERT-Based NLG Evaluation Metric

A Google Research team proposes BLEURT, an automatic evaluation metric based on Google’s highly successful language model BERT.

by Synced 2020-03-13

AI Machine Learning & Data Science Nature Language Tech Research

BERTLang Helps Researchers Choose Between BERT Models

Researchers from Bocconi University have prepared an online overview of the commonalities and differences between language-specific BERT models and mBERT.

by Synced 2020-02-29

AI Machine Learning & Data Science Nature Language Tech Research

BERT-of-Theseus: Compressing BERT by Progressive Module Replacing

Researchers propose a novel approach that compresses BERT by progressive module replacing.

by Synced 2020-02-18

AI Machine Learning & Data Science Nature Language Tech Research

Up Close and Personal With BERT – Google’s Epoch-Making Language Model

A recent Google Brain paper looks into Google’s hugely successful transformer network — BERT — and how it represents linguistic information internally.

by Synced 2020-01-29

AI Machine Learning & Data Science Nature Language Tech Research

Facebook AI mBART: The Tower of Babel’s Silicon Solution

Facebook AI researchers have further developed the BART model with the introduction of mBART.

by Synced 2020-01-23

AI Machine Learning & Data Science Nature Language Tech Research

Hallo! Hallo! KU Leuven & TU Berlin Introduce ‘RobBERT,’ a SOTA Dutch BERT

A group of researchers from the Katholieke Universiteit Leuven and the Technical University of Berlin recently introduced RobBERT, a Dutch RoBERTa-based language model.

by Synced 2020-01-03

AI Machine Learning & Data Science Nature Language Tech Research

Google Releases ALBERT V2 & Chinese-Language Models

Google has now released a major V2 ALBERT update and open-sourced Chinese ALBERT models.

by Synced 2019-11-21

AI Research

Voila! SOTA French Language Model ‘CamemBERT’ Debuts

A team from Facebook AI Research, Inria, and Sorbonne Université has released CamemBERT, essentially a French version of Google AI’s game-changing pretrained language model BERT.

by Synced 2019-10-27

AI AI Weekly Industry Research

Google’s Big Week: Quantum Supremacy, Boosting Search With BERT, Exploring Transfer Learning

Synced Global AI Weekly October 27th

by Synced 2019-10-25

AI Industry Research United States

Milestone: BERT Boosts Google Search

In what the company calls “the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search,” Google today announced that it has leveraged its pretrained language model BERT to dramatically improve the understanding of search queries.

by Synced 2019-10-07

AI Asia China Research

Huawei’s TinyBERT Is 7X Smaller and 9X Faster Than BERT

Researchers from the Huazhong University of Science and Technology and Huawei Noah’s Ark Lab have introduced TinyBERT, a smaller and faster version of Google’s popular large-scale pre-trained language processing model BERT.

by Synced 2019-07-24

AI Research

Has BERT Been Cheating? Researchers Say it Exploits ‘Spurious Statistical Cues’

Since Google Research introduced its Bidirectional Transformer (BERT) in 2018, the model has gained unprecedented popularity among researchers. Now, a group of researchers from National Cheng Kung University in Tainan, Taiwan, is challenging BERT’s efficacy.

by Synced 2019-06-27

AI Research

The Staggering Cost of Training SOTA AI Models

While it is exhilarating to see AI researchers pushing the performance of cutting-edge models to new heights, the costs of such processes are also rising at a dizzying rate.

by Synced 2019-02-20

AI Research

PyTorch Reimplementation of OpenAI GPT-2 Small Model Released

GitHub developer Hugging Face has updated its repository with a PyTorch reimplementation of the small version of the GPT-2 language model that OpenAI open-sourced last week, along with pretrained models and fine-tuning examples.