Google Brain Sets New Semi-Supervised Learning SOTA in Speech Recognition
Google Brain has improved the SOTA on the LibriSpeech automatic speech recognition task, achieving word error rates of 1.4 percent and 2.6 percent.
Researchers have proposed Augmented Temporal Contrast (ATC), a new unsupervised learning (UL) task for learning visual representations that is agnostic to rewards and does not degrade the control policy.
DeepMind unveiled a partnership with Google Maps that has leveraged advanced GNNs to improve ETA accuracy.
AMBERT (A Multigrained BERT) leverages both fine-grained and coarse-grained tokenizations to achieve SOTA performance on English and Chinese language tasks.
BigBird is shown to dramatically improve performance across long-context NLP tasks, producing SOTA results in question answering and summarization.
The researchers say the approach produces motions that are visually and physically much more plausible than state-of-the-art methods.
Following the February release of its contrastive learning framework SimCLR, the same team of Google Brain researchers guided by Turing Award laureate Dr. Geoffrey Hinton has presented SimCLRv2, an upgraded approach that boosts the SOTA results by 21.6 percent.
Researchers from Katholieke Universiteit Leuven in Belgium and ETH Zürich in a recent paper propose a two-step approach for unsupervised classification.
OpenAI announced the upgraded GPT-3 with a whopping 175 billion parameters.
Enter Plan2Explore — a self-supervised RL agent designed to quickly generalize to unseen tasks in a zero or few-shot manner.
A Google-led research team has introduced a new method for optimizing neural network parameters that is faster than all common first-order methods on complex problems.
Researchers have introduced MonoLayout, a practical deep neural architecture that takes just a single image of a road scene as input and outputs an amodal scene layout in bird’s-eye view.
Researchers have proposed a simple but powerful “SimCLR” framework for contrastive learning of visual representations.
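At the core of SimCLR is a contrastive objective (NT-Xent) that pulls two augmented views of the same image together in embedding space while pushing apart views of different images. Below is a minimal PyTorch sketch of that loss; the batch layout and temperature value are illustrative, not the authors’ reference implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss from SimCLR.
    z1, z2: [N, D] projections of two augmented views of the same N images."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # [2N, D], unit-length rows
    sim = z @ z.t() / temperature                       # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-similarity
    # For example i, its positive pair sits at index (i + N) mod 2N.
    targets = (torch.arange(2 * n) + n) % (2 * n)
    return F.cross_entropy(sim, targets)
```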
The tool enables researchers to try, compare, and evaluate models to decide which work best on their datasets or for their research purposes.
A recent paper published by Microsoft researchers proposes ImageBERT, a new vision-language pretrained model for image-text joint embedding that achieves SOTA performance on both the MSCOCO and Flickr30k datasets.
Researchers from Beijing’s National Laboratory of Pattern Recognition (NLPR), SenseTime Research, and Nanyang Technological University have taken the tech one step further with a new framework that enables totally arbitrary audio-video translation.
A group of researchers from Katholieke Universiteit Leuven and the Technical University of Berlin recently introduced a Dutch RoBERTa-based language model, RobBERT.
A team from Google Research introduced FixMatch, an algorithm that combines two common SSL methods for deep networks: pseudo-labeling (aka self-training) and consistency regularization.
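The combination is simple: generate a hard pseudo-label from a weakly augmented view of an unlabeled image, keep it only when the model is confident, and train the strongly augmented view against it. The PyTorch sketch below illustrates that unlabeled-data loss; `model`, the augmentation pair, and the 0.95 confidence threshold stand in for the paper’s full training setup.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, x_weak, x_strong, threshold=0.95):
    """Sketch of FixMatch's unlabeled objective: pseudo-labeling on the
    weakly augmented view plus consistency on the strongly augmented view."""
    with torch.no_grad():
        probs = F.softmax(model(x_weak), dim=1)  # predictions on the weak view
        conf, pseudo = probs.max(dim=1)          # confidence and hard pseudo-label
        mask = (conf >= threshold).float()       # keep only confident examples
    logits_strong = model(x_strong)              # predictions on the strong view
    per_example = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (per_example * mask).mean()           # masked consistency loss
```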
The Facebook AI Research team has introduced a new “point-based rendering” neural network module with an iterative subdivision algorithm that can be integrated into SOTA image segmentation models.
Every Friday Synced selects seven recent studies that present topical, innovative or otherwise interesting or important research we believe may be of interest to our readers.
ERNIE has achieved new state-of-the-art performance on GLUE, becoming the world’s first model to surpass 90 on the macro-average score (90.1).
Now, a team from Facebook AI Research, Inria, and Sorbonne Université has released CamemBERT, essentially a French version of Google AI’s game-changing pretrained language model BERT.
It’s not as easy as one might imagine to train an AI model to accurately predict what a human will do next, even when they are interacting with a relatively simple object like a ball.
Synced surveyed last week’s crop of machine learning papers and identified seven that we believe may be of special interest to our readers.
NLP-focused startup Hugging Face recently released a major update to their popular “PyTorch Transformers” library which establishes compatibility between PyTorch and TensorFlow 2.0.
Hugging Face, a startup specializing in natural language processing, today released a landmark update to their popular Transformers library, offering unprecedented compatibility between two major deep learning frameworks, PyTorch and TensorFlow 2.0.
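In practice, the interoperability means the same pretrained checkpoint can be loaded into either framework. A minimal sketch, assuming the `bert-base-uncased` checkpoint and with both frameworks installed (the exact API surface has shifted across library versions):

```python
import torch
import tensorflow as tf
from transformers import BertTokenizer, BertModel, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
ids = tokenizer.encode("Hello, world!")  # token IDs, framework-agnostic

# The same checkpoint, loaded into each framework.
pt_model = BertModel.from_pretrained("bert-base-uncased")    # PyTorch
tf_model = TFBertModel.from_pretrained("bert-base-uncased")  # TensorFlow 2.0

pt_out = pt_model(torch.tensor([ids]))   # PyTorch forward pass
tf_out = tf_model(tf.constant([ids]))    # TensorFlow 2.0 forward pass
```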
A group of Chinese researchers has proposed a novel method for identifying mirrors in images that outperforms state-of-the-art detection and segmentation methods on targeted baselines.
A group of MIT researchers (Han Cai, Chuang Gan and Song Han) have introduced a “Once for All” (OFA) network that achieves the same or better accuracy than state-of-the-art AutoML methods on ImageNet, with a significant speedup in training time.
Researchers from Lomonosov Moscow State University and Huawei Moscow Research Center have introduced a wearable card designed to perform the opposite function — concealing a person’s identity from facial recognition systems.
Current state-of-the-art convolutional architectures for object detection tasks are human-designed. In a recent paper, Google Brain researchers leveraged the advantages of Neural Architecture Search (NAS) to propose NAS-FPN, a new automatic search method for feature pyramid architecture.
In its new paper Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search, Xiaomi’s research team introduces a deep convolution neural network (CNN) model using a neural architecture search (NAS) approach. Performance is comparable to cutting-edge models such as CARN and CARN-M.