A research team from Google and the University of California, Berkeley calculates the energy use and carbon footprint of large-scale models T5, Meena, GShard, Switch Transformer and GPT-3, and identifies methods and publication guidelines that could help reduce their CO2e footprint.
A research team from McGill University, Mila – Quebec AI Institute and Facebook AI proposes novel metrics and perturbation functions to detect, quantify and compare trade-offs between robustness and faithfulness in NMT systems, both on the corpus level and with particular examples.
A research team from ETH Zurich leverages existing spike-based learning circuits to propose a biologically plausible architecture that is highly successful in classifying distinct and complex spatio-temporal spike patterns. The work contributes to the design of ultra-low-power mixed-signal neuromorphic processing systems capable of distinguishing spatio-temporal patterns in spiking activity.
A research team from NVIDIA, Stanford University and Microsoft Research propose a novel pipeline parallelism approach that improves throughput by more than 10 percent with a comparable memory footprint, showing such strategies can achieve high aggregate throughput while training models with up to a trillion parameters.
A research team from ETH and UC Berkeley proposes a Deep Reward Learning by Simulating the Past (Deep RLSP) algorithm that represents rewards directly as a linear combination of features learned through self-supervised representation learning and enables agents to simulate human actions backwards in time to infer what they must have done.
A research team from IBM introduces two systems for predicting information type: The TypeSuggest module, an unsupervised system designed to generate types for a set of seed query terms input by the user; and an Answer Type prediction module for predicting the correct answer type for user-provided questions.
A research team from Technical University of Munich, Google, Nvidia and LMU München proposes CodeTrans, an encoder-decoder transformer model which achieves state-of-the-art performance on six tasks in the software engineering domain, including Code Documentation Generation, Source Code Summarization, Code Comment Generation, etc.
A team from University of Michigan, MIT-IBM Watson AI Lab and ShanghaiTech University publishes two papers on individual fairness for ML models, introducing a scale-free and interpretable statistically principled approach for assessing individual fairness and a method for enforcing individual fairness in gradient boosting suitable for non-smooth ML models.
A research team from Princeton University and Microsoft Research discover autonomous language-understanding agents are capable of achieving high scores even in the complete absence of language semantics, indicating that current RL agents for text-based games might not be sufficiently leveraging the semantic structure of game texts.
A research team from DeepMind and Alberta University proposes Policy-guided Heuristic Search (PHS), a novel search algorithm that uses both a heuristic function and a policy while offering guarantees on the search loss that relate to both the quality of the heuristic and the policy.
A team from Max-Planck Institute for Intelligent Systems, ETH Zurich, Google Research Amsterdam, Mila and the University of Montreal make an effort to bring together causality and machine learning research programs, delineate implications of causality for machine learning and propose critical areas for future research.
University of Toronto researchers propose a BERT-inspired training approach as a self-supervised pretraining step to enable deep neural networks to leverage newly and publicly available massive EEG (electroencephalography) datasets for downstream brain-computer-interface (BCI) applications.
Stanford researchers’ DERL (Deep Evolutionary Reinforcement Learning) is a novel computational framework that enables AI agents to evolve morphologies and learn challenging locomotion and manipulation tasks in complex environments using only low level egocentric sensory information.
Researchers from the University of Sheffield, Beihang University, and Open University’s Knowledge Media Institute have proposed a transfer learning approach that can automatically process historical texts at a semantic level to generate modern language summaries.