Artificial Intelligence | Synced

by Synced 2021-06-14 1

Google Researchers Merge Pretrained Teacher LMs Into a Single Multilingual Student LM Via Knowledge Distillation

A Google Research team proposes MergeDistill, a framework that merges multiple pretrained monolingual and multilingual teacher LMs into a single task-agnostic multilingual student LM, leveraging the capabilities of powerful language-specific LMs while remaining multilingual and enabling positive language transfer.

by Synced 2021-06-11 3

AI Machine Learning & Data Science Popular Research

Yoshua Bengio Team Designs Consciousness-Inspired Planning Agent for Model-Based RL

A research team from McGill University, Université de Montréal, DeepMind and Mila presents an end-to-end, model-based deep reinforcement learning (RL) agent that dynamically attends to relevant parts of its environments to facilitate out-of-distribution (OOD) and systematic generalization.

by Synced 2021-06-10 2

AI Machine Learning & Data Science Research

IEEE Publishes Comprehensive Survey of Bottom-Up and Top-Down Neural Processing System Design

An IEEE team provides a comprehensive overview of the bottom-up and top-down design approaches toward neuromorphic intelligence, highlighting the different levels of granularity present in existing silicon implementations and assessing the benefits of the different circuit design styles in neural processing systems.

by Synced 2021-06-09 2

AI Machine Learning & Data Science Research

Pieter Abbeel Team’s Decision Transformer Abstracts RL as Sequence Modelling

A research team from UC Berkeley, Facebook AI Research and Google Brain abstracts Reinforcement Learning (RL) as a sequence modelling problem. Their proposed Decision Transformer simply outputs optimal actions by leveraging a causally masked transformer, yet matches or exceeds state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.

by Synced 2021-06-08 2

AI Machine Learning & Data Science Research

What Matters in Adversarial Imitation Learning? Google Brain Study Reveals Valuable Insights

A research team from Google Brain conducts a comprehensive empirical study on more than fifty choices in a generic adversarial imitation learning framework and explores their impacts on large-scale (>500k trained agents) continuous-control tasks to provide practical insights and recommendations for designing novel and effective AIL algorithms.

by Synced 2021-06-07 2

AI Machine Learning & Data Science Popular Research

Google Proposes Efficient and Modular Implicit Differentiation for Optimization Problems

A research team from Google Research combines the benefits of implicit differentiation and autodiff and proposes a unified, efficient and modular approach for implicit differentiation of optimization problems.

by Synced 2021-06-04 5

AI Machine Learning & Data Science Research

Microsoft & OneFlow Leverage the Efficient Coding Principle to Design Unsupervised DNN Structure-Learning That Outperforms Human-Designed Structures

A research team from OneFlow and Microsoft takes a step toward automatic deep neural network structure design, exploring unsupervised structure-learning and leveraging the efficient coding principle, information theory and computational neuroscience to design structure learning without label information.

by Synced 2021-06-03 2

AI Machine Learning & Data Science Natural Language Tech Research

Towards a Token-Free Future: Google Proposes Pretrained Byte-to-Byte Transformers for NLP

A research team from Google proposes ByT5, a competitive token-free pretrained byte-to-byte transformer architecture that can be straightforwardly adapted to process byte sequences without adding excessive computational cost.

by Synced 2021-06-02 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google & Rutgers’ Aggregating Nested Transformers Yield Better Accuracy, Data Efficiency and Convergence

A research team from Google Cloud AI, Google Research and Rutgers University simplifies vision transformers’ complex design, proposing nested transformers (NesT) that simply stack basic transformer layers to process non-overlapping image blocks individually. The approach achieves superior ImageNet classification accuracy and improves model training efficiency.

by Synced 2021-06-01 1

AI Machine Learning & Data Science Research

Georgia Tech & Microsoft Reveal ‘Super Tickets’ in Pretrained Language Models: Improving Model Compression and Generalization

A research team from Georgia Tech, Microsoft Research and Microsoft Azure AI studies the collections of “lottery tickets” in extremely over-parametrized models, revealing the generalization performance pattern of winning tickets and proving the existence of “super tickets.”

by Synced 2021-05-31 1

AI Machine Learning & Data Science Research

NYU, Facebook & CIFAR Present ‘True Few-Shot Learning’ for Language Models Whose Few-Shot Ability They Say Is Overestimated

A research team from New York University, Facebook AI, and a CIFAR Fellow in Learning in Machines & Brains raises doubts about large-scale pretrained language models’ few-shot learning abilities. The researchers re-evaluate these abilities in a setting where no held-out examples are available, which they propose constitutes “true few-shot learning.”

by Synced 2021-05-28 2

AI Machine Learning & Data Science Research

New IEEE Research Equips Gradient Descent with Angular Information to Boost DNN Training

An IEEE team proposes AngularGrad — a novel optimization algorithm that takes both gradient direction and angular information into consideration. The method successfully reduces the zig-zag effect in the optimization trajectory and speeds up convergence.

by Synced 2021-05-27 5

AI Machine Learning & Data Science Popular Research

Cornell & NTT’s Physical Neural Networks: a “Radical Alternative for Implementing Deep Neural Networks” That Enables Arbitrary Physical Systems Training

A team from Cornell University and NTT Research proposes Physical Neural Networks (PNNs), a universal framework that leverages a backpropagation algorithm to train arbitrary, real physical systems to execute deep neural networks.

by Synced 2021-05-26 2

AI Machine Learning & Data Science Natural Language Tech Research

Study Shows Transformers Possess the Compositionality Power for Mathematical Reasoning

A research team from UC Davis, Microsoft Research and Johns Hopkins University takes prior work, which showed that models trained on massive amounts of linguistic data reveal grammatical structures in their representations, and extends it to the domain of mathematical reasoning, showing that both the standard transformer and the TP-Transformer can compose the meanings of mathematical symbols based on their structured relationships.

by Synced 2021-05-25 2

AI Machine Learning & Data Science Research

Yoshua Bengio Team’s Recurrent Independent Mechanisms Endow RL Agents With Out-of-Distribution Adaptation and Generalization Abilities

A research team from the University of Montreal and Max Planck Institute for Intelligent Systems constructs a reinforcement learning agent whose knowledge and reward function can be reused across tasks, along with an attention mechanism that dynamically selects unchangeable knowledge pieces to enable out-of-distribution adaptation and generalization.

by Synced 2021-05-24 3

AI Asia Global News

IoT Cloud Company Tuya Smart Holds Meeting on Fast-tracked Connectivity and Innovation Amid COVID-19 Pandemic

On May 21, leading IoT cloud platform Tuya Smart hosted a panel to discuss resilience and innovation for the IoT industry in North America during the COVID-19 outbreak.

by Synced 2021-05-21 4

AI Machine Learning & Data Science Research

ETH Zürich & Microsoft Study: Demystifying Serverless ML Training

A research team from ETH Zürich and Microsoft presents a systematic, comparative study of distributed ML training over serverless infrastructures (FaaS) and “serverful” infrastructures (IaaS), aiming to understand the system tradeoffs of distributed ML training with serverless infrastructures.

by Synced 2021-05-20 2

AI Machine Learning & Data Science Popular Research

ETH Zürich Identifies Priors That Boost Bayesian Deep Learning Models

A research team from ETH Zürich presents an overview of priors for (deep) Gaussian processes, variational autoencoders and Bayesian neural networks. The researchers propose that well-chosen priors can achieve theoretical and empirical properties such as uncertainty estimation, model selection and optimal decision support; and provide guidance on how to choose them.

by Synced 2021-05-19 7

AI Computer Vision & Graphics Research

Intelligent Graphic Design: Adobe’s Directional GAN Automates Image Content Generation for Marketing Campaigns

A research team from Adobe proposes Directional GAN (DGAN), a novel and simple approach for generating high-resolution images conditioned on expected semantic attributes, greatly simplifying the image content generating process for marketing campaigns, websites and banners.

by Synced 2021-05-18 2

AI Machine Learning & Data Science Research

Facebook Transfer Learning Method Boosts Code Autocompletion Accuracy by Over 50%

A research team from Facebook shows how the power of transfer learning can enable pretraining on non-IDE, non-autocompletion and different-language example code sequences before fine-tuning on the autocompletion prediction task to improve model accuracy by over 50 percent on very small fine-tuning datasets and over 10 percent on 50k labelled examples.

by Synced 2021-05-17 0

AI Machine Learning & Data Science Research

Google Presents New Parallelization Paradigm GSPMD for Common ML Computation Graphs: Constant Compilation Time with Increasing Devices

A research team from Google proposes GSPMD, an automatic parallelism system for ML computation graphs that uses simple tensor sharding annotations to achieve different parallelism paradigms in a unified way, including data parallelism, within-layer model parallelism, spatial partitioning, weight-update sharding, optimizer-state sharding and pipeline parallelism.

by Synced 2021-05-14 9

AI Machine Learning & Data Science Popular Research

Google Replaces BERT Self-Attention with Fourier Transform: 92% Accuracy, 7 Times Faster on GPUs

A research team from Google shows that replacing transformers’ self-attention sublayers with Fourier transforms achieves 92 percent of BERT’s accuracy on the GLUE benchmark while training seven times faster on GPUs and twice as fast on TPUs.

by Synced 2021-05-13 3

AI Machine Learning & Data Science Research

DeepMind Presents Neural Algorithmic Reasoning: The Art of Fusing Neural Networks With Algorithmic Computation

A research team from DeepMind explores how neural networks can be fused with algorithmic computation and demonstrates an elegant neural end-to-end pipeline that goes straight from raw inputs to general outputs while emulating an algorithm internally.

by Synced 2021-05-12 1

AI Machine Learning & Data Science Research

DeepMind & Onshape Leverage Transformer to Automatize Effective CAD Sketches

A research team from DeepMind and Onshape combines a general-purpose language modelling technique and an off-the-shelf data serialization protocol to propose a machine learning model that can automatically generate high-quality sketches for Computer-Aided Design.

by Synced 2021-05-11 3

AI Machine Learning & Data Science Research

ETH Zurich Proposes a Robotic System Capable of Self-Improving Its Semantic Perception

A research team from ETH Zurich combines continual learning and self-supervision to propose a novel robot system that enables online life-long self-supervised learning of semantic scene understanding.

by Synced 2021-05-10 2

AI Machine Learning & Data Science Research

Imperial College London Proposes Optimal Training of Variational Quantum Algorithms Without Barren Plateaus

Imperial College London researchers show how to optimally train a variational quantum algorithm to represent quantum states and propose a stable variant of the quantum natural gradient, a generalized quantum natural gradient that can be trained free of barren plateaus.

by Synced 2021-05-07 4

AI Machine Learning & Data Science Research

MIT & IBM ‘Curiosity’ Framework Explores Embodied Environments to Learn Task-Agnostic Visual Representations

A research team from MIT and MIT-IBM Watson AI Lab proposes Curious Representation Learning (CRL), a framework that learns to understand the surrounding environment by training a reinforcement learning (RL) agent to maximize the error of a representation learner to gain an incentive to explore the environment.

by Synced 2021-05-06 3

AI Machine Learning & Data Science Research

Facebook AI Conducts Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

A research team from Facebook AI conducts a large-scale study on unsupervised spatiotemporal representation learning from videos. The work takes a unified perspective on four recent image-based frameworks (MoCo, SimCLR, BYOL, SwAV) and investigates a simple objective that can easily generalize unsupervised representation learning methodologies to space-time.

by Synced 2021-05-05 3

AI Machine Learning & Data Science Popular Research

Bronstein, Bruna, Cohen and Velickovic Leverage the Erlangen Programme to Establish the Geometric Foundations of Deep Learning

Twitter Chief Scientist Michael Bronstein, Joan Bruna from New York University, Taco Cohen from Qualcomm AI and Petar Veličković from DeepMind publish a paper that aims to geometrically unify the typical architectures of CNNs, GNNs, LSTMs, Transformers, etc. from the perspective of symmetry and invariance to build an “Erlangen Programme” for deep neural networks.

by Synced 2021-05-04 2

AI Machine Learning & Data Science Research

Huawei & Tsinghua U Method Boosts Task-Agnostic BERT Distillation Efficiency by Reusing Teacher Model Parameters

A research team from Huawei Noah’s Ark Lab and Tsinghua University proposes Extract Then Distill (ETD), a generic and flexible strategy for reusing teacher model parameters for efficient and effective task-agnostic distillation that can be applied to student models of any size.

by Synced 2021-05-03 4

AI Machine Learning & Data Science Research

CMU, UT Austin & Facebook’s CNN Layer Width Optimization Strategies Achieve 320x Overhead Reduction

Researchers from Carnegie Mellon University, the University of Texas at Austin and Facebook AI propose a novel paradigm to optimize widths for each CNN layer. The method is compatible across various width optimization algorithms and networks and achieves up to a 320x reduction in width optimization overhead without compromising top-1 accuracy on ImageNet.

by Synced 2021-04-30 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Yann LeCun Team’s Novel End-to-End Modulated Detector Captures Visual Concepts in Free-Form Text

A research team from NYU and Facebook proposes MDETR, an end-to-end modulated detector that identifies objects in images conditioned on a raw text query and is able to capture a long tail of visual concepts expressed in free-form text.

by Synced 2021-04-29 5

AI Machine Learning & Data Science Popular Research

Toward a New Generation of Neuromorphic Computing: IBM & ETH Zurich’s Biologically Inspired Optimizer Boosts FCNN and SNN Training

IBM and ETH Zurich researchers make progress in reconciling neurophysiological insights with machine intelligence, proposing a novel biologically inspired optimizer for artificial neural networks (ANNs) and spiking neural networks (SNNs) that incorporates synaptic integration principles from biology. GRAPES (Group Responsibility for Adjusting the Propagation of Error Signals) leads to improvements in the training-time convergence, accuracy and scalability of ANNs and SNNs.

by Synced 2021-04-28 3

AI Machine Learning & Data Science Research

Google’s 1.3 MiB On-Device Model Brings High-Performance Disfluency Detection Down to Size

A research team from Google Research proposes small, fast, on-device disfluency detection models based on the BERT architecture. The smallest model is only 1.3 MiB, a size reduction of two orders of magnitude and an eightfold reduction in inference latency compared to state-of-the-art BERT-based models.

by Synced 2021-04-27 2

AI Machine Learning & Data Science Natural Language Tech Research

Microsoft & Peking U Researchers Identify ‘Knowledge Neurons’ in Pretrained Transformers, Enabling Fact Editing

A research team from Microsoft Research and Peking University looks inside pretrained transformers to investigate how factual knowledge is stored, proposing a method to identify “knowledge neurons,” which can be used to explicitly update and erase facts.

by Synced 2021-04-26 2

AI Machine Learning & Data Science Research

Google and UC Berkeley Propose Green Strategies for Large Neural Network Training

A research team from Google and the University of California, Berkeley calculates the energy use and carbon footprint of large-scale models T5, Meena, GShard, Switch Transformer and GPT-3, and identifies methods and publication guidelines that could help reduce their CO2e footprint.

by Synced 2021-04-23 2

AI Machine Learning & Data Science Natural Language Tech Research

Facebook AI, McGill U & Mila Promote ‘Translationese’ to Boost NMT System Faithfulness

A research team from McGill University, Mila – Quebec AI Institute and Facebook AI proposes novel metrics and perturbation functions to detect, quantify and compare trade-offs between robustness and faithfulness in NMT systems, both on the corpus level and with particular examples.

by Synced 2021-04-22 2

AI Natural Language Tech Research

Are Multilingual Language Models Fragile? IBM Adversarial Attack Strategies Cut MBERT QA Performance by 85%

An IBM research team proposes four multilingual adversarial attack strategies and applies them across seven languages in a zero-shot setting to large multilingual pretrained language models (e.g. MBERT), reducing average performance by up to 85.6 percent.

by Synced 2021-04-21 3

AI Machine Learning & Data Science Popular Research

Pieter Abbeel Team Proposes Task-Agnostic RL Method to Auto-Tune Simulations to the Real World

A research team from UC Berkeley and Carnegie Mellon University proposes a task-agnostic reinforcement learning method that reduces the task-specific engineering required for domain randomization of both visual and dynamics parameters.

by Synced 2021-04-20 4

AI Machine Learning & Data Science Research

Rice University, IBM & USC Study Pushes Quantum State Tomography Beyond Current Computation Capabilities

A research team from Rice University, IBM and USC combines compressed sensing, non-convex optimization and acceleration techniques to introduce a new algorithm, Momentum Inspired Factored Gradient Descent (MiFGD), that pushes quantum state tomography (QST) beyond current capabilities.