Tag: Research

AI Machine Learning & Data Science Research

Microsoft & Xiamen U’s Progressive Distillation Method Sets New SOTA for Dense Retrieval

In the new paper Progressive Distillation for Dense Retrieval, a research team from Xiamen U and Microsoft Research presents PROD, a progressive distillation method for dense retrieval that achieves state-of-the-art performance on five widely used benchmarks.

AI Machine Learning & Data Science Research

DeepMind, Oxford U, IDSIA, Mila & Purdue U’s General Neural Algorithmic Learner Matches Task-Specific Expert Performance

In the new paper A Generalist Neural Algorithmic Learner, a research team from DeepMind, University of Oxford, IDSIA, Mila, and Purdue University presents a novel generalist neural algorithmic learner — a single graph neural network (GNN) capable of solving various classical algorithms at single-task expert level.

AI Machine Learning & Data Science Research

Transformers on Edge Devices? Monash U’s Energy-Saving Attention With Linear Complexity Reduces Compute Cost by 73%

In the new paper EcoFormer: Energy-Saving Attention with Linear Complexity, a Monash University research team presents EcoFormer, an attention mechanism with linear complexity that replaces expensive multiply-accumulate operations with simple accumulations and achieves a 73 percent energy footprint reduction on ImageNet.

AI Machine Learning & Data Science Nature Language Tech Research

Google Brain’s Vec2Text Models for Sentence Generation Excel in Universality, Diversity, Fluency & Semantic Structure

In the new paper Vec2text With Round-Trip Translations, Google Brain researchers explore large language models’ capabilities for generating arbitrary natural language text from inputs of fixed-size vectors — a vec2text setting — and propose a simple data augmentation approach based on round-trip translations to improve vec2text model performance.

AI Machine Learning & Data Science Research

DeepMind’s ‘Expert-Aware’ Data Augmentation Technique Enables Data-Efficient Learning from Parametric Experts

The new DeepMind paper Data Augmentation for Efficient Learning from Parametric Experts proposes Augmented Policy Cloning (APC), a simple yet effective data-augmentation approach designed to support data-efficient learning from parametric experts. The method significantly improves data efficiency across various control and reinforcement learning settings.

AI Machine Learning & Data Science Nature Language Tech Research

Peking U & Microsoft’s Knowledge Attribution Method Enables Editing Factual Knowledge in Pretrained Transformers Without Fine-Tuning

In the new paper Knowledge Neurons in Pretrained Transformers, a research team from Peking University and Microsoft Research introduces a knowledge attribution method that identifies the neurons that store factual knowledge in pretrained transformers and leverages these neurons to edit factual knowledge in transformers without any fine-tuning.

AI Machine Learning & Data Science Research

DeepMind’s Model-Based Offline Options Framework Supports Automatic Skill & Behaviour Discovery, Boosts Transfer Capabilities

In the new paper MO2: Model-Based Offline Options, a DeepMind research team introduces Model-Based Offline Options (MO2), an offline hindsight bottleneck options framework that supports sample-efficient option discovery over continuous state-action spaces for efficient skill transfer to new tasks.

AI Machine Learning & Data Science Research

Toward a Turing Machine? Microsoft & Harvard Propose Neural Networks That Discover Learning Algorithms Themselves

A research team from Microsoft and Harvard University demonstrates that neural networks can discover succinct learning algorithms on their own in polynomial time and presents an architecture that combines recurrent weight-sharing between layers and convolutional weight-sharing to reduce parameter size from even trillions of nodes down to a constant.

AI Machine Learning & Data Science Research

Meta AI & Inria Saclay Advance BCIs to Enable Natural Speech Decoding From Non-Invasive Brain Recordings

In the new paper Decoding Speech From Non-Invasive Brain Recordings, a research team from Meta AI and the Inria Saclay Centre presents a single end-to-end architecture for decoding natural speech processing from non-invasive magnetoencephalography (MEG) or electroencephalography (EEG) brain recordings that can detect macroscopic brain signals in real-time.

AI Machine Learning & Data Science Nature Language Tech Research

Plan, Edit, Explain and Repeat: The PEER Collaborative Language Model Brings a Humanlike Process to Text Generation

In the new paper PEER: A Collaborative Language Model, a research team from Meta AI, Carnegie Mellon University, PSL University, and University College London presents PEER, a collaborative language model that performs a humanlike writing process — composing drafts, adding suggestions, proposing edits and providing explanations for its actions.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Princeton U & Adobe’s 3D-FM GAN Enables Precise 3D-Controllable Face Manipulation

In the new paper 3D-FM GAN: Towards 3D-Controllable Face Manipulation, a team from Princeton University and Adobe Research presents 3D-FM GAN, a novel conditional GAN framework that enables precise 3D-controllable face manipulation with high photorealism and strong identity preservation without requiring any manual tuning or optimizations.

AI Computer Vision & Graphics Machine Learning & Data Science Popular Research

Microsoft’s BEiT-3 Foundation Model: A ‘Big Convergence of Language, Vision, and Multimodal Pretraining’ That Achieves SOTA Results on Popular Benchmarks

In the new paper Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks, a Microsoft research team presents BEiT-3, a general-purpose state-of-the-art multimodal foundation model for both vision and vision-language tasks that advances the big convergence of backbone architectures, pretraining tasks, and model scaling.

AI Machine Learning & Data Science Nature Language Tech Research

CMU Details 6 Years of Contributions to the National Science Foundation- Funded DialPort Project for Dialog Research

Carnegie Mellon University researchers provide background information and details on contributions to the DialPort project over the last six years in their new paper The DialPort Tools. These tools — such as the DialPort Portal and DialCrowd — will be demoed at the SIGDIAL 2022 conference next month in Edinburgh.

AI Machine Learning & Data Science Nature Language Tech Research

Microsoft’s Parameter-Efficient Z-Code++ Language Model Beats the 200x Larger GPT3-175B on Abstractive Text Summarization

In the new paper Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization, a research team from Microsoft Azure AI and Microsoft Research presents Z-Code++, a novel encoder-decoder pretrained language model optimized for abstractive summarization that significantly improves performance on low-resource summarization tasks.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Adobe and ANU’s Paint2Pix: Intent-Accurate Image Synthesis from Simple Brushstroke Inputs

In the new paper Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing, a research team from Adobe Research and Australian National University presents paint2pix, a novel model that learns to predict users’ intentions and produce photorealistic images from primitive and coarse human brushstroke inputs.

AI Machine Learning & Data Science Research

Microsoft, Penn U & UC San Diego’s TiCoder Framework Generates Code With 90.4% Consistency to User Intent

In the new paper Interactive Code Generation via Test-Driven User-Intent Formalization, a team from Microsoft Research, the University of Pennsylvania, and the University of California, San Diego proposes a workflow for test-driven user-intent formalization that leverages user feedback to generate code that is 90.40 percent consistent with user intent.

AI Machine Learning & Data Science Research

Georgia Tech & Google Propose a Novel Discrete Variational Autoencoder for Automatically Improving Code Efficiency

In the new paper Learning to Improve Code Efficiency, a research team from the Georgia Institute of Technology and Google Research presents a novel discrete generative latent-variable model designed to help programmers identify more computationally efficient code variants, taking a step toward automating the process of code performance optimization.

AI Machine Learning & Data Science Nature Language Tech Research

Meet Atlas: A Pretrained Retrieval Augmented Language Model That Outperforms a 540B Parameter Model But Requires 50x Fewer Parameters

In the new paper Few-shot Learning With Retrieval Augmented Language Models, a research team from Meta AI, PSL University, Inria, and University College London presents Atlas, a pretrained retrieval augmented language model that effectively learns new knowledge-intensive tasks under few-shot settings. Atlas outperforms the 540B parameter PaLM model on QA tasks while using 50x fewer parameters.

AI Machine Learning & Data Science Research

Meta AI & Mila Publicly Release BlenderBot 3: A 175B SOTA Chatbot That Continually Improves via Human Interactions

In the new paper BlenderBot 3: A Deployed Conversational Agent That Continually Learns to Responsibly Engage, researchers from Meta AI and Mila/McGill University release BlenderBot 3, a 175B parameter state-of-the-art open-domain dialogue model deployed on a public website. BlenderBot 3 is designed for continual learning via its user interactions.

AI Machine Learning & Data Science Research

Microsoft & Arizona U’s TextWorldExpress Simulates Text Games at 1M SPS, a Speedup of 3 Orders of Magnitude

In the new paper TextWorldExpress: Simulating Text Games at One Million Steps Per Second, a research team from the University of Arizona and Microsoft Research Montréal presents TextWorldExpress, a high-performance text-game simulator that boosts throughput by approximately three orders of magnitude, reaching one million steps per second.

AI Machine Learning & Data Science Research

OpenAI Presents a Simple and Efficient Training Strategy to Boost Language Models’ Text-Infilling Capabilities

In the new paper Efficient Training of Language Models to Fill in the Middle, an OpenAI research team shows that causal decoder-based autoregressive (AR) language models can learn to infill texts via a very simple and straightforward transformation to the training data and without any architectural modifications.

AI Computer Vision & Graphics Machine Learning & Data Science Research

IITM & UT Austin’s Generalizable NeRF Transformer Demonstrates Transformers’ Capabilities for Graphical Rendering

In the new paper Is Attention All NeRF Needs?, a research team from the Indian Institute of Technology Madras and the University of Texas at Austin proposes Generalizable NeRF Transformer (GNT), a pure and universal transformer-based architecture for efficient on-the-fly reconstruction of NeRFs. The work demonstrates that a pure attention mechanism suffices for learning a physically-grounded rendering process.

AI Machine Learning & Data Science Nature Language Tech Research

Fancy a Friendly Chat? Stanford NLP’s Chirpy Cardinal Enables Open-Domain and Humanlike Conversations

In the new paper Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent, a Stanford NLP research team presents Chirpy Cardinal, an open-domain conversational social chatbot with emotional and social intelligence that enables authentic and engaging interactions with real people.

AI Machine Learning & Data Science Research

Google & DeepMind Study the Interactions Between Scaling Laws and Neural Network Architectures

In the new paper Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?, a research team from Google and DeepMind posits that understanding the connections between neural network architectures and scaling laws is essential for designing and evaluating new models. The team pretrains and finetunes over 100 models to reveal useful insights on the scaling behaviours of ten diverse model architectures.

AI Machine Learning & Data Science Research

DeepMind & UCL’s Stochastic MuZero Achieves SOTA Results in Complex Stochastic Environments

In the new paper Planning in Stochastic Environments with a Learned Model, a research team from DeepMind and University College London extends the deterministic MuZero model to Stochastic MuZero for stochastic model learning, achieving performance comparable or superior to state-of-the-art methods in complex single- and multi-agent environments.

AI Machine Learning & Data Science Research

SYSU and UBTECH Propose Big Learning for Justifying, Analyzing, and Improving Foundation Models

A research team from Sun Yat-sen University and UBTECH proposes a unified approach for justifying, analyzing, and improving foundation models in the new paper Big Learning: A Universal Machine Learning Paradigm? The team’s big learning framework can model many-to-all joint/conditional/marginal data distributions and delivers extraordinary data and task flexibilities.