Category: AI

Global machine intelligence updates.

AI Machine Learning & Data Science Research

Facebook Transfer Learning Method Boosts Code Autocompletion Accuracy by Over 50%

A research team from Facebook shows how transfer learning enables pretraining on non-IDE, non-autocompletion and different-language code sequences before fine-tuning on the autocompletion prediction task, improving model accuracy by over 50 percent on very small fine-tuning datasets and by over 10 percent with 50k labelled examples.
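To make the recipe concrete, here is a minimal PyTorch sketch of the pretrain-then-fine-tune workflow the summary describes; the tiny model, random-token data iterators and step counts are illustrative placeholders, not Facebook's actual pipeline.

```python
import torch
import torch.nn as nn

class TokenLM(nn.Module):
    """Tiny next-token language model standing in for the autocompletion model."""
    def __init__(self, vocab_size=1000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):                      # tokens: (batch, seq)
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                         # logits: (batch, seq, vocab)

def random_batches(vocab_size=1000, batch=8, seq=32):
    """Placeholder corpus: random token ids standing in for real code sequences."""
    while True:
        yield torch.randint(0, vocab_size, (batch, seq))

def train(model, batches, lr, steps):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _, tokens in zip(range(steps), batches):
        logits = model(tokens[:, :-1])              # predict token t+1 from prefix
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       tokens[:, 1:].reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()

model = TokenLM()
# 1) Pretrain on the plentiful non-IDE / other-language code corpus.
train(model, random_batches(), lr=1e-3, steps=50)   # real runs use far more steps
# 2) Fine-tune on the small labelled autocompletion dataset.
train(model, random_batches(), lr=1e-4, steps=10)
```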

AI Machine Learning & Data Science Research

Google Presents New Parallelization Paradigm GSPMD for Common ML Computation Graphs: Constant Compilation Time With Increasing Devices

A research team from Google proposes GSPMD, an automatic parallelism system for ML computation graphs that uses simple tensor sharding annotations to achieve different parallelism paradigms in a unified way, including data parallelism, within-layer model parallelism, spatial partitioning, weight-update sharding, optimizer-state sharding and pipeline parallelism.
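GSPMD lives inside XLA, which is also the compiler behind JAX's sharding annotations, so a hedged JAX sketch can illustrate the one-annotation-per-tensor style; the mesh shape and axis names below are illustrative.

```python
# GSPMD-style sharding annotations via JAX (JAX lowers jax.sharding annotations
# to XLA's SPMD partitioner). Fake 8 CPU devices so the sketch runs on one host.
import os
os.environ["XLA_FLAGS"] = "--xla_force_host_platform_device_count=8"

import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

mesh = Mesh(np.array(jax.devices()).reshape(2, 4), axis_names=("data", "model"))

x = jnp.ones((32, 512))    # activations
w = jnp.ones((512, 2048))  # layer weights

# One sharding annotation per tensor; the partitioner propagates the rest.
x = jax.device_put(x, NamedSharding(mesh, P("data", None)))   # data parallelism
w = jax.device_put(w, NamedSharding(mesh, P(None, "model")))  # within-layer model parallelism

@jax.jit
def layer(x, w):
    return jnp.dot(x, w)  # output arrives sharded over both mesh axes

y = layer(x, w)
print(y.sharding)
```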

AI Machine Learning & Data Science Research

Facebook AI Conducts Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

A research team from Facebook AI conducts a large-scale study on unsupervised spatiotemporal representation learning from videos. The work takes a unified perspective on four recent image-based frameworks (MoCo, SimCLR, BYOL, SwAV) and investigates a simple objective that can easily generalize unsupervised representation learning methodologies to space-time.
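The objective family in question can be sketched in a few lines: two clips sampled from the same video form a positive pair, and an InfoNCE-style contrastive loss (as in MoCo and SimCLR) pulls their embeddings together. The sketch below uses random vectors in place of a video backbone's clip embeddings.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two clips from the same batch of videos."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))   # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: in the paper the encoder is a video backbone over (C, T, H, W)
# clips; here random vectors stand in for its outputs.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
loss = info_nce(z1, z2)
```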

AI Machine Learning & Data Science Popular Research

Bronstein, Bruna, Cohen and Veličković Leverage the Erlangen Programme to Establish the Geometric Foundations of Deep Learning

Twitter Chief Scientist Michael Bronstein, Joan Bruna from New York University, Taco Cohen from Qualcomm AI and Petar Veličković from DeepMind publish a paper that aims to geometrically unify typical architectures such as CNNs, GNNs, LSTMs and Transformers from the perspective of symmetry and invariance, building an "Erlangen Programme" for deep neural networks.
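The symmetry-and-invariance viewpoint has a compact concrete instance: a sum readout over node features is invariant to node permutations, the symmetry GNNs are organized around (much as translation is for CNNs). A tiny numerical check:

```python
import torch

nodes = torch.randn(5, 16)   # 5 nodes, 16-dim features
perm = torch.randperm(5)

readout = nodes.sum(dim=0)                         # permutation-invariant readout
readout_permuted = nodes[perm].sum(dim=0)
assert torch.allclose(readout, readout_permuted)   # same output for any node order
```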

AI Machine Learning & Data Science Research

CMU, UT Austin & Facebook’s CNN Layer Width Optimization Strategies Achieve 320x Overhead Reduction

Researchers from Carnegie Mellon University, the University of Texas at Austin and Facebook AI propose a novel paradigm for optimizing the width of each CNN layer. The method is compatible with various width optimization algorithms and network architectures, and achieves up to a 320x reduction in width optimization overhead without compromising top-1 accuracy on ImageNet.
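As a hedged illustration of the search space such methods operate over (not the paper's specific algorithm), the sketch below builds a CNN whose per-layer channel widths are free variables a width optimizer would score; evaluating many such candidates is the overhead the paper attacks.

```python
import torch
import torch.nn as nn

def make_cnn(widths, in_ch=3, num_classes=10):
    """Build a CNN with one conv block per entry in `widths`."""
    layers, ch = [], in_ch
    for w in widths:
        layers += [nn.Conv2d(ch, w, 3, padding=1), nn.BatchNorm2d(w), nn.ReLU()]
        ch = w
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(ch, num_classes)]
    return nn.Sequential(*layers)

# Each candidate width vector is one point a width optimizer would evaluate.
candidate = make_cnn(widths=[32, 64, 128])
n_params = sum(p.numel() for p in candidate.parameters())
```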

AI Machine Learning & Data Science Popular Research

Toward a New Generation of Neuromorphic Computing: IBM & ETH Zurich’s Biologically Inspired Optimizer Boosts FCNN and SNN Training

IBM and ETH Zurich researchers make progress in reconciling neurophysiological insights with machine intelligence, proposing a novel biologically inspired optimizer for artificial neural networks (ANNs) and spiking neural networks (SNNs) that incorporates synaptic integration principles from biology. GRAPES (Group Responsibility for Adjusting the Propagation of Error Signals) improves the convergence time, accuracy and scalability of ANN and SNN training.
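The following is a loose, illustrative stand-in for the idea rather than the exact GRAPES rule: each unit's backpropagated error is modulated by a per-unit factor derived from its incoming synaptic weights, echoing the "group responsibility" weighting of error propagation described in the paper.

```python
import torch

# Illustrative stand-in, not the published GRAPES update rule.
W = torch.randn(128, 64)      # incoming weights of one layer (128 units, 64 inputs)
delta = torch.randn(32, 128)  # backpropagated error at those units, batch of 32

importance = W.abs().sum(dim=1)                  # one scalar per unit
responsibility = importance / importance.mean()  # normalize around 1
delta_modulated = delta * responsibility         # broadcast over the batch
```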

AI AIoT Machine Learning & Data Science Research

ETH Zurich Leverages Spiking Neural Networks To Build Ultra-Low-Power Neuromorphic Processors

A research team from ETH Zurich leverages existing spike-based learning circuits to propose a biologically plausible architecture that is highly successful in classifying distinct and complex spatio-temporal spike patterns. The work contributes to the design of ultra-low-power mixed-signal neuromorphic processing systems capable of distinguishing spatio-temporal patterns in spiking activity.
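A minimal leaky integrate-and-fire (LIF) neuron, the basic building block of such spiking architectures, can be simulated in a few lines; the constants below are illustrative, not the paper's circuit parameters.

```python
import numpy as np

def lif(currents, tau=20.0, v_th=1.0, dt=1.0):
    """currents: (timesteps,) input current per step; returns output spike train."""
    v, out = 0.0, []
    for i_t in currents:
        v += dt / tau * (-v) + i_t   # leaky integration of the input current
        if v >= v_th:                # threshold crossing emits a spike
            out.append(1)
            v = 0.0                  # reset membrane potential
        else:
            out.append(0)
    return np.array(out)

spike_train = lif(np.random.rand(100) * 0.3)
```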

AI Machine Learning & Data Science Popular Research

NVIDIA, Stanford & Microsoft Propose Efficient Trillion-Parameter Language Model Training on GPU Clusters

A research team from NVIDIA, Stanford University and Microsoft Research proposes a novel pipeline parallelism approach that improves throughput by more than 10 percent with a comparable memory footprint, showing that such strategies can achieve high aggregate throughput while training models with up to a trillion parameters.
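The core idea of pipeline parallelism can be sketched compactly: split the layer stack into stages and stream micro-batches through them so the stages can work concurrently. The single-process sketch below shows only the split and micro-batching; the paper's contribution is an interleaved schedule that overlaps these calls across GPUs.

```python
import torch
import torch.nn as nn

stage0 = nn.Sequential(nn.Linear(512, 512), nn.ReLU())  # would live on GPU 0
stage1 = nn.Sequential(nn.Linear(512, 512), nn.ReLU())  # would live on GPU 1

batch = torch.randn(64, 512)
micro_batches = batch.chunk(8)  # smaller micro-batches keep both stages busy

outputs = []
for mb in micro_batches:
    outputs.append(stage1(stage0(mb)))  # in a real pipeline these overlap across devices
y = torch.cat(outputs)
```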

AI Machine Learning & Data Science Research

TUM, Google, Nvidia & LMU München’s CodeTrans Pretrained Models Crack Source Code Tasks With SOTA Performance

A research team from the Technical University of Munich, Google, Nvidia and LMU München proposes CodeTrans, an encoder-decoder transformer model that achieves state-of-the-art performance on six tasks in the software engineering domain, including Code Documentation Generation, Source Code Summarization and Code Comment Generation.
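CodeTrans checkpoints were released publicly, so usage reduces to a standard Hugging Face seq2seq call; the model ID below follows the published naming pattern but is an assumption to verify on the Hub.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed checkpoint ID following the released naming scheme; verify on the Hub.
model_id = "SEBIS/code_trans_t5_base_code_documentation_generation_python"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=48)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```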