Tag: Natural Language Processing

Training Compute-Optimal Large Language Models: DeepMind’s 70B Parameter Chinchilla Outperforms 530B Parameter Megatron-Turing

In the new paper Training Compute-Optimal Large Language Models, a DeepMind research team posits that current large language models are significantly undertrained and, based on empirical outcomes of over 400 training runs, proposes three predictive approaches for optimally setting model size and training duration.
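A minimal back-of-the-envelope sketch of the paper's takeaway, assuming the standard training-cost approximation C ≈ 6·N·D (N parameters, D training tokens) and the roughly 20-tokens-per-parameter ratio commonly quoted from the paper; the function and the numbers below are illustrative, not the paper's fitted scaling laws.

```python
# Illustrative only: given a FLOPs budget C and the approximation C ~= 6 * N * D,
# scaling model size and data in equal proportion implies N grows like sqrt(C).
# The ~20 tokens-per-parameter ratio is a commonly quoted rule of thumb.
import math

def compute_optimal_split(flops_budget: float, tokens_per_param: float = 20.0):
    """Return an approximate compute-optimal (parameters, tokens) pair."""
    # C = 6 * N * D with D = r * N  =>  N = sqrt(C / (6 * r)),  D = r * N
    n_params = math.sqrt(flops_budget / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

params, tokens = compute_optimal_split(5.76e23)   # roughly Chinchilla's budget
print(f"params ~ {params:.2e}, tokens ~ {tokens:.2e}")
```

Under these assumptions, a budget of about 5.76e23 FLOPs lands near the 70B-parameter, 1.4T-token operating point used for Chinchilla, which is the sense in which earlier, larger models trained on less data were undertrained.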

Google, NYU & Maryland U’s Token-Dropping Approach Reduces BERT Pretraining Time by 25%

In the new paper Token Dropping for Efficient BERT Pretraining, a research team from Google, New York University, and the University of Maryland proposes a simple but effective “token dropping” technique that significantly reduces the pretraining cost of transformer models such as BERT without hurting performance on downstream fine-tuning tasks.
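A rough sketch of the token-dropping idea, assuming PyTorch; the random importance score stands in for the paper's cumulative masked-LM loss, and the 50% keep ratio and single lower/middle/upper layer split are placeholders rather than the paper's configuration.

```python
# Sketch: route only "important" tokens through the middle encoder layers and
# merge the dropped tokens back before the top layers. Scores and split are toy.
import torch
import torch.nn as nn

d_model, n_heads, seq_len, batch = 64, 4, 16, 2
make_layer = lambda: nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
lower, middle, upper = make_layer(), make_layer(), make_layer()

x = torch.randn(batch, seq_len, d_model)     # token representations
importance = torch.rand(batch, seq_len)      # stand-in for cumulative MLM loss

h = lower(x)                                 # all tokens pass through the lower layers

k = seq_len // 2                             # keep the "important" half
keep_idx = importance.topk(k, dim=1).indices.sort(dim=1).values
gather_idx = keep_idx.unsqueeze(-1).expand(-1, -1, d_model)
kept = middle(torch.gather(h, 1, gather_idx))   # middle layers see only kept tokens

merged = h.clone()                           # dropped tokens bypass the middle layers
merged.scatter_(1, gather_idx, kept)
out = upper(merged)                          # full sequence again at the top
print(out.shape)                             # torch.Size([2, 16, 64])
```

The saving comes from the middle layers operating on a shorter sequence, while the lower and upper layers keep every token so downstream fine-tuning still sees full-length representations.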

Google & IDSIA’s Block-Recurrent Transformer Dramatically Outperforms Transformers Over Very Long Sequences

A team from Google Research and the Swiss AI Lab IDSIA proposes the Block-Recurrent Transformer, a novel long-sequence processing approach that has the same computation time and parameter count costs as a conventional transformer layer but achieves significant perplexity improvements in language modelling tasks over very long sequences.
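A simplified sketch of the block-wise dataflow, assuming PyTorch: the sequence is processed one fixed-size block at a time while a small set of recurrent state vectors is carried forward, so the per-block cost stays constant regardless of total length. The gating and the exact attention pattern of the actual Block-Recurrent Transformer layer are omitted.

```python
# Toy illustration of "transformer within a block, recurrence across blocks".
import torch
import torch.nn as nn

d_model, n_heads, block_len, n_state = 64, 4, 32, 8

self_attn   = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
read_state  = nn.MultiheadAttention(d_model, n_heads, batch_first=True)  # tokens read the state
write_state = nn.MultiheadAttention(d_model, n_heads, batch_first=True)  # state reads the tokens

def process(sequence: torch.Tensor) -> torch.Tensor:
    """sequence: (batch, seq_len, d_model), seq_len a multiple of block_len."""
    batch, seq_len, _ = sequence.shape
    state = torch.zeros(batch, n_state, d_model)          # fixed-size recurrent state
    outputs = []
    for start in range(0, seq_len, block_len):
        block = self_attn(sequence[:, start:start + block_len])   # local self-attention
        block = block + read_state(block, state, state)[0]        # incorporate carried state
        state = state + write_state(state, block, block)[0]       # update state from this block
        outputs.append(block)
    return torch.cat(outputs, dim=1)

print(process(torch.randn(2, 128, d_model)).shape)   # torch.Size([2, 128, 64])
```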

Sapienza U & OpenAI Propose Explanatory Learning to Enable Machines to Understand and Create Explanations

A research team from Sapienza University and OpenAI introduces an explanatory learning procedure that enables machines to understand existing explanations from symbolic sequences and create new explanations for unexplained phenomena, and further proposes Critical Rationalist Network (CRN) models for discovering explanations for novel phenomena.

MIT’s Automatic Data-Driven Media Bias Measurement Method Achieves Human-Level Results

MIT researchers present an automated, objective and transparent data-driven method for measuring media bias. The study analyses roughly a million articles from about a hundred newspapers for bias on various news topics, maps the newspapers into a two-dimensional media bias landscape, and shows that the data-driven results agree well with human-judgement classifications.
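Purely as an illustration of the mapping step, the sketch below projects outlets onto a two-dimensional plane from phrase-usage statistics with a truncated SVD; the outlets, phrases and counts are invented, and the paper's actual bias-measurement pipeline may differ in its choice of features and projection.

```python
# Hypothetical data and a generic 2-D projection; not the paper's pipeline.
import numpy as np

papers  = ["Outlet A", "Outlet B", "Outlet C", "Outlet D"]
phrases = ["undocumented immigrants", "illegal aliens", "estate tax", "death tax"]

# Invented counts of how often each outlet uses each phrase.
counts = np.array([[120,   5,  80,   2],
                   [ 10, 140,   6,  90],
                   [ 60,  20,  50,  15],
                   [ 15,  70,  10,  60]], dtype=float)

usage = counts / counts.sum(axis=1, keepdims=True)   # compare usage proportions
usage -= usage.mean(axis=0)                          # centre across outlets

# Project each outlet onto the top two singular directions (a 2-D "landscape").
u, s, vt = np.linalg.svd(usage, full_matrices=False)
coords = u[:, :2] * s[:2]

for name, (x, y) in zip(papers, coords):
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```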

Google Researchers Enable Transformers to Solve Compositional NLP Tasks

A Google Research team explores the design space of Transformer models in an effort to enable deep learning architectures to solve compositional tasks. The proposed approach provides models with inductive biases via design decisions that significantly impact compositional generalization, and achieves state-of-the-art results on the COGS and PCFG composition benchmarks.

Google’s H-Transformer-1D: Fast One-Dimensional Hierarchical Attention With Linear Complexity for Long Sequence Processing

A Google Research team draws inspiration from two numerical analysis methods — Hierarchical Matrix (H-Matrix) and Multigrid — to address the quadratic complexity problem of attention mechanisms in transformer architectures, proposing a hierarchical attention scheme that has linear complexity in run time and memory.
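A one-level coarsening sketch of the hierarchical idea in NumPy: each query block attends at full resolution within its own block and only to mean-pooled summaries of the other blocks, so per-query cost grows with the block size plus the number of blocks rather than with the squared sequence length. The real H-Transformer-1D uses a multi-level hierarchy with learned components, so treat this as an illustration of the structure rather than the method itself.

```python
# Single-level "fine nearby, coarse far away" attention; illustrative only.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hierarchical_attention(q, k, v, block=16):
    n, d = q.shape
    n_blocks = n // block
    # Coarse keys/values: one mean-pooled summary vector per block.
    k_coarse = k.reshape(n_blocks, block, d).mean(axis=1)
    v_coarse = v.reshape(n_blocks, block, d).mean(axis=1)
    out = np.zeros_like(v)
    for b in range(n_blocks):
        sl = slice(b * block, (b + 1) * block)
        other = np.arange(n_blocks) != b
        keys = np.concatenate([k[sl], k_coarse[other]])   # fine local + coarse distant
        vals = np.concatenate([v[sl], v_coarse[other]])
        out[sl] = softmax(q[sl] @ keys.T / np.sqrt(d)) @ vals
    return out

n, d = 256, 32
q, k, v = (np.random.randn(n, d) for _ in range(3))
print(hierarchical_attention(q, k, v).shape)   # (256, 32)
```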

Melbourne U, Facebook & Twitter Expose Novel Numerical Errors in NMT Systems

A research team from the University of Melbourne, Facebook AI, and Twitter Cortex proposes a black-box test method for assessing and debugging the numerical translation of neural machine translation systems in a systematic manner. The approach reveals novel types of errors that are general across multiple state-of-the-art translation systems.
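A hedged sketch of what such a black-box numerical test can look like: generate template sentences with controlled numbers, translate them, and flag outputs in which the source number no longer appears. The `translate` stub, the templates and the number formats are placeholders for a real MT system and for the much broader error categories covered in the paper.

```python
# Illustrative black-box check for numerical translation errors.
import random
import re

def translate(sentence: str) -> str:
    """Stand-in for a real NMT system; replace with an actual API call."""
    return sentence   # identity "translation" so the example runs end to end

TEMPLATES = [
    "The company reported a profit of {num} million dollars.",
    "The bridge is {num} metres long.",
    "About {num} people attended the event.",
]

def check_numbers(n_cases: int = 20) -> list:
    failures = []
    for _ in range(n_cases):
        num = str(random.choice([random.randint(1, 999),
                                 round(random.uniform(0, 100), 2)]))
        src = random.choice(TEMPLATES).format(num=num)
        hyp = translate(src)
        if num not in re.findall(r"\d+(?:\.\d+)?", hyp):   # number lost or altered
            failures.append((src, hyp))
    return failures

print(f"{len(check_numbers())} numerical mismatches found")
```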

Baidu’s Knowledge-Enhanced ERNIE 3.0 Pretraining Framework Delivers SOTA NLP Results, Surpasses Human Performance on the SuperGLUE Benchmark

A research team from Baidu proposes ERNIE 3.0, a unified framework for pretraining large-scale, knowledge-enhanced models that can easily be tailored for both natural language understanding and generation tasks with zero-shot learning, few-shot learning or fine-tuning, and achieves state-of-the-art results on NLP tasks.

Study Shows Transformers Possess the Compositionality Power for Mathematical Reasoning

A research team from UC Davis, Microsoft Research and Johns Hopkins University extends prior work, which trained models on massive amounts of linguistic data and probed the grammatical structures captured in their representations, to the domain of mathematical reasoning, showing that both the standard transformer and the TP-Transformer can compose the meanings of mathematical symbols based on their structured relationships.

UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus

UmlsBERT is a deep Transformer architecture that incorporates clinical domain knowledge from the Unified Medical Language System (UMLS) Metathesaurus to build semantically enriched contextual representations that benefit from both contextual learning and domain knowledge.
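A minimal sketch of the embedding-enrichment idea, assuming PyTorch; the tiny vocabulary, the semantic-group table and the simple additive combination are illustrative stand-ins, and UmlsBERT itself also adapts the masked-LM objective using UMLS concept identifiers.

```python
# Toy illustration: add a clinical semantic-group embedding to each word embedding.
import torch
import torch.nn as nn

vocab = {"[PAD]": 0, "patient": 1, "fever": 2, "pyrexia": 3, "aspirin": 4}
# Map tokens to a UMLS-style semantic group (0 = no associated concept).
semantic_group = {"fever": 1, "pyrexia": 1, "aspirin": 2}   # 1: Finding, 2: Chemical/Drug

d_model = 32
word_emb  = nn.Embedding(len(vocab), d_model)
group_emb = nn.Embedding(3, d_model, padding_idx=0)         # index 0 reserved for "no concept"

tokens    = ["patient", "fever", "aspirin"]
word_ids  = torch.tensor([[vocab[t] for t in tokens]])
group_ids = torch.tensor([[semantic_group.get(t, 0) for t in tokens]])

# Semantically enriched input: word embedding plus its clinical-group embedding,
# so "fever" and "pyrexia" share a domain-knowledge component.
enriched = word_emb(word_ids) + group_emb(group_ids)
print(enriched.shape)   # torch.Size([1, 3, 32])
```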