Reinforcement Learning

by Synced 2024-10-09 3

Scaling Multi-Objective Optimization: Meta & FAIR’s CGPO Advances General-purpose LLMs

In a new paper The Perfect Blend: Redefining RLHF with Mixture of Judges, a research team from Meta GenAI and FAIR developed Constrained Generative Policy Optimization (CGPO), which offers a more structured approach to RLHF, advancing the performance of general-purpose LLMs.

by Synced 2024-07-21 2

AI Machine Learning & Data Science Research

Stanford’s Hypothetical Minds: Revolutionizing Multi-Agent AI with Theory of Mind and Large Language Models

A Stanford University research team proposes Hypothetical Minds, which builds on recent advancements in LLM-based agents designed for multi-agent environments, aiming to enhance adaptability in competitive, cooperative, and mixed-motive scenarios with concealed information.

by Synced 2024-06-25 4

AI Machine Learning & Data Science Research

Oxford U & DeepMind Harness Cultural Accumulation in Reinforcement Learning

In a new paper Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning, a research team from the University of Oxford and Google DeepMind introduces methods to achieve cultural accumulation in Reinforcement Learning (RL) agents. This research opens new pathways for modeling human culture through artificial systems.

by Synced 2023-08-13 1

AI Machine Learning & Data Science Research

DeepMind’s AlphaStar Benchmark Improves RL Offline Agent With 90% Win Rate Against SOTA AlphaStar Supervised Agent

In a new paper AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning, a DeepMind research team presents AlphaStar Unplugged, an unprecedentedly challenging large-scale offline reinforcement learning benchmark that leverages an offline dataset from StarCraft II for training RL agents.

by Synced 2023-06-06 2

AI Machine Learning & Data Science Research

DeepMind, Mila & Montreal U’s Bigger, Better, Faster RL Agent Achieves Super-human Performance on Atari 100K

In a new paper Bigger, Better, Faster: Human-level Atari with human-level efficiency, a research team from Google DeepMind, Mila and Université de Montréal presents a value-based RL agent, which they call Bigger, Better, Faster (BBF), that achieves super-human performance on the Atari 100K benchmark on a single GPU.

by Synced 2023-01-25 9

AI Machine Learning & Data Science Research

Oxford U’s Deep Double Duelling Q-Learning Translates Trading Signals Into SOTA Trading Strategies

In the new paper Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets, an Oxford University research team introduces Deep Duelling Double Q-learning with the APEX architecture to train a trading agent to translate predictive signals into optimal limit order trading strategies.
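
The two ingredients named in the method, the duelling value/advantage split and the double Q-learning target, can be sketched in a few lines. This is a minimal illustration of the standard textbook forms, not the paper's trading agent; the function names and the plain-list inputs are assumptions made for the example.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-subtracted duelling form."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_q_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: List[float],
    next_q_target: List[float],
) -> float:
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the two functions.
    print(dueling_q_values(1.0, [0.5, -0.5, 0.0]))
    print(double_q_target(1.0, 0.5, False, [0.2, 0.8, 0.1], [1.0, 2.0, 3.0]))
```

Separating action selection from action evaluation is what distinguishes the double Q target from the plain maximum over the target network's values.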

by Synced 2022-11-02 2

AI Machine Learning & Data Science Research

Befuddling AI Go Systems: MIT, UC Berkeley & FAR AI’s Adversarial Policy Achieves a >99% Win Rate Against KataGo

In the new paper Adversarial Policies Beat Professional-Level Go AIs, a research team from MIT, UC Berkeley, and FAR AI employs a novel adversarial policy to attack the state-of-the-art AI Go system KataGo. The team believes theirs is the first successful end-to-end attack against an AI Go system playing at the level of a human professional.

by Synced 2022-09-21 127

AI Machine Learning & Data Science Research

DeepMind’s MEME Agent Achieves Human-level Atari Game Performance 200x Faster Than Agent57

In the new paper Human-level Atari 200x Faster, a DeepMind research team applies a set of diverse strategies to Agent57, with their resulting MEME (Efficient Memory-based Exploration) agent surpassing the human baseline on all 57 Atari games in just 390 million frames — two orders of magnitude faster than Agent57.

by Synced 2022-09-19 6

AI Machine Learning & Data Science Research

DeepMind’s ‘Expert-Aware’ Data Augmentation Technique Enables Data-Efficient Learning from Parametric Experts

The new DeepMind paper Data Augmentation for Efficient Learning from Parametric Experts proposes Augmented Policy Cloning (APC), a simple yet effective data-augmentation approach designed to support data-efficient learning from parametric experts. The method significantly improves data efficiency across various control and reinforcement learning settings.

by Synced 2022-09-14 1

AI Machine Learning & Data Science Research

DeepMind’s Model-Based Offline Options Framework Supports Automatic Skill & Behaviour Discovery, Boosts Transfer Capabilities

In the new paper MO2: Model-Based Offline Options, a DeepMind research team introduces Model-Based Offline Options (MO2), an offline hindsight bottleneck options framework that supports sample-efficient option discovery over continuous state-action spaces for efficient skill transfer to new tasks.

by Synced 2022-07-21 1

AI Machine Learning & Data Science Research

DeepMind & UCL’s Stochastic MuZero Achieves SOTA Results in Complex Stochastic Environments

In the new paper Planning in Stochastic Environments with a Learned Model, a research team from DeepMind and University College London extends the deterministic MuZero model to Stochastic MuZero for stochastic model learning, achieving performance comparable or superior to state-of-the-art methods in complex single- and multi-agent environments.

by Synced 2022-07-04 2

AI Machine Learning & Data Science Research

Learning Without Simulations? UC Berkeley’s DayDreamer Establishes a Strong Baseline for Real-World Robotic Training

In the new paper DayDreamer: World Models for Physical Robot Learning, researchers from the University of California, Berkeley leverage recent advances in the Dreamer world model to enable online reinforcement learning for robot training without simulators or demonstrations, establishing a strong baseline for efficient real-world robotic learning.

by Synced 2022-06-21 2

AI Machine Learning & Data Science Research

DeepMind Boosts RL Agents’ Retrieval Capability to Tens of Millions of Pieces of Information

In the new paper Large-Scale Retrieval for Reinforcement Learning, a DeepMind research team dramatically expands the information accessible to reinforcement learning (RL) agents, enabling them to attend to tens of millions of information pieces, incorporate new information without retraining, and learn decision making in an end-to-end manner.

by Synced 2022-06-03 1

AI Machine Learning & Data Science Research

NVIDIA & UW Introduce Factory: A Set of Physics Simulation Methods and Learning Tools for Contact-Rich Robotic Assembly

In the new paper Factory: Fast Contact for Robotic Assembly, a research team from NVIDIA and the University of Washington introduces Factory, a set of physics simulation methods and robot learning tools for simulating contact-rich interactions in assembly with high accuracy, efficiency, and robustness.

by Synced 2022-05-20 0

AI Machine Learning & Data Science Research

Huawei Rethinks Logic Synthesis, Proposing a Practical RL-based Approach That Achieves High Efficiency

In the new paper Rethinking Reinforcement Learning Based Logic Synthesis, a research team from Huawei Noah’s Ark Lab develops a novel reinforcement learning-based logic synthesis method to automatically recognize critical operators and produce common operator sequences that are generalizable to unseen circuits.

by Synced 2022-03-08 1

AI Machine Learning & Data Science Research

OpenAI’s AutoDIME: Automating Multi-Agent Environment Design for RL Agents

In the new paper AutoDIME: Automatic Design of Interesting Multi-Agent Environments, an OpenAI research team explores automatic environment design for multi-agent environments using an RL-trained teacher that samples environments to maximize student learning. The work demonstrates that intrinsic teacher rewards are a promising approach for automating both single and multi-agent environment design.

by Synced 2022-02-22 9

AI Machine Learning & Data Science Research

DeepMind Trains Agents to Control Computers as Humans Do to Solve Everyday Tasks

DeepMind trains agents to use keyboard and mouse commands with pixel and Document Object Model (DOM) observations to control computers, achieving state-of-the-art and human-level mean performance across all tasks on the MiniWoB++ benchmark.

by Synced 2022-02-17 7

AI Machine Learning & Data Science Research

DeepMind & UCL Propose Neural Population Learning: An Efficient and General Framework That Learns Strategically Diverse Policies for Real-World Games

A research team from DeepMind and University College London proposes Neural Population Learning (NeuPL), an efficient and general framework that learns and represents diverse policies in symmetric zero-sum games within a single conditional network.

by Synced 2022-01-28 1

AI Machine Learning & Data Science Research

OpenAI’s InstructGPT Leverages RL From Human Feedback to Better Align Language Models With User Intent

An OpenAI research team leverages reinforcement learning from human feedback (RLHF) to make significant progress on aligning language models with users’ intentions. The proposed InstructGPT models are better at following instructions than GPT-3 while also being more truthful and less toxic.

by Synced 2022-01-21 3

AI Machine Learning & Data Science Research

UC Irvine & DeepMind’s Anytime Optimal PSRO: Guaranteed Convergence to a Nash Equilibrium With Decreased Exploitability in Two-Player Zero-Sum Games

A research team from the University of California Irvine and DeepMind proposes Anytime Optimal PSRO, a new PSRO variant for two-player zero-sum games that is guaranteed to converge to a Nash equilibrium while decreasing exploitability from iteration to iteration.
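
To make "exploitability" concrete, here is a small sketch for a two-player zero-sum matrix game. It is not the Anytime Optimal PSRO algorithm itself; it only measures how far a strategy pair is from a Nash equilibrium. The definition used (the sum of both players' best-response gains, which some authors halve) and the function name are assumptions for this example.

```python
from typing import List


def exploitability(
    payoff: List[List[float]],
    row_strategy: List[float],
    col_strategy: List[float],
) -> float:
    """payoff[i][j] is the row player's payoff; the column player receives its negative.

    Returns the row player's best-response value against col_strategy minus the
    lowest value the column player can hold the row player to against row_strategy.
    The result is zero exactly when the pair is a Nash equilibrium.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Expected payoff of each pure row action against the column mixture.
    row_values = [
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    # Expected row payoff under each pure column action against the row mixture.
    col_values = [
        sum(row_strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    return max(row_values) - min(col_values)


if __name__ == "__main__":
    # Rock-paper-scissors from the row player's point of view.
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    uniform = [1 / 3, 1 / 3, 1 / 3]
    print(exploitability(rps, uniform, uniform))          # uniform play is the equilibrium
    print(exploitability(rps, [1.0, 0.0, 0.0], uniform))  # always playing rock can be exploited
```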

by Synced 2021-12-08 1

AI Machine Learning & Data Science Research

DeepMind’s PoG Excels in Perfect and Imperfect Information Games, Advancing Research on General Algorithms for Arbitrary Environments

DeepMind researchers introduce Player of Games (PoG), a general-purpose algorithm that applies self-play learning, search, and game-theoretic reasoning to perfect and imperfect information games, taking an important step toward truly general algorithms for arbitrary environments.

by Synced 2021-12-07 2

AI Machine Learning & Data Science Research

UC Berkeley’s Sergey Levine Says Combining Self-Supervised and Offline RL Could Enable Algorithms That Understand the World Through Actions

In the new paper Understanding the World Through Action, UC Berkeley assistant professor in the department of electrical engineering and computer sciences Sergey Levine argues that a general, principled, and powerful framework for utilizing unlabelled data can be derived from reinforcement learning to enable machine learning systems leveraging large datasets to understand the real world.

by Synced 2021-11-24 0

AI Machine Learning & Data Science Research

DeepMind, Google Brain & World Chess Champion Explore How AlphaZero Learns Chess Knowledge

DeepMind and Google Brain researchers and former World Chess Champion Vladimir Kramnik explore how human knowledge is acquired and how chess concepts are represented in the AlphaZero neural network via concept probing, behavioural analysis, and an examination of its activations.

by Synced 2021-10-25 1

AI Machine Learning & Data Science Research

Facebook AI Releases SaLinA: A Flexible and Simple Library for Learning Sequential Agents

A Facebook AI research team releases SaLinA, a reinforcement learning (RL) library for model-based RL, differentiable environments and multi-agent RL that simplifies the implementation of complex sequential learning models.

by Synced 2021-10-21 0

AI Machine Learning & Data Science Research

DeepMind’s Fictitious Co-Play Trains RL Agents to Collaborate with Novel Humans Without Using Human Data

A DeepMind research team explores the problem of how to train agents to collaborate well with novel human partners without using human data and presents Fictitious Co-Play (FCP), a surprisingly simple approach designed to address this challenge.

by Synced 2021-09-28 5

AI Machine Learning & Data Science Research

DeepMind & IDSIA Introduce Symmetries to Black-Box MetaRL to Improve Its Generalization Ability

In the paper Introducing Symmetries to Black Box Meta Reinforcement Learning, a research team from DeepMind and The Swiss AI Lab IDSIA explores the role of symmetries in meta generalization and shows that introducing more symmetries to black-box meta-learners can improve their ability to generalize to unseen action and observation spaces, tasks, and environments.

by Synced 2021-09-15 1

AI Machine Learning & Data Science Research

CMU, Google & UC Berkeley Propose Robust Predictable Control Policies for RL Agents

A research team from Carnegie Mellon University, Google Brain and UC Berkeley proposes a robust predictable control (RPC) method for learning reinforcement learning policies that use fewer bits of information. This simple and theoretically-justified algorithm achieves much tighter compression, is more robust, and generalizes better than prior methods, achieving up to 5× higher rewards than a standard information bottleneck.

by Synced 2021-09-10 2

AI Machine Learning & Data Science Research

Google Study Uses Implicit Policies to Achieve Remarkable Improvements in Robot Behavioural Cloning

Researchers from Robotics at Google propose reformulating behavioural cloning using implicit models and show that this simple change can lead to remarkable improvements in performance across a wide range of contact-rich robot policy learning.

by Synced 2021-09-08 28

AI Machine Learning & Data Science Research

IBM Leverages Reinforcement Learning to Achieve SOTA Performance on Text and Knowledge Base Generation

In the paper ReGen: Reinforcement Learning for Text and Knowledge Base Generation Using Pretrained Language Models, IBM researchers present ReGen, a bidirectional generation of text and graph that leverages reinforcement learning to push the performance of text-to-graph and graph-to-text generation tasks to a higher level.

by Synced 2021-09-03 2

AI Machine Learning & Data Science Research

Stanford’s BEHAVIOR Benchmarks 100 Activities From Everyday Life for Embodied AI

A research team from Stanford University introduces BEHAVIOR, a benchmark for embodied AI with 100 realistic, diverse and complex everyday household activities in simulation. BEHAVIOR addresses challenges such as definition, instantiation in a simulator, and evaluation; and pushes the state-of-the-art by adding new types of state changes.

by Synced 2021-08-31 3

AI Machine Learning & Data Science Research

DeepMind’s Collect & Infer: A Fresh Look at Data-Efficient Reinforcement Learning

A DeepMind research team proposes Collect and Infer, a novel paradigm that explicitly models Reinforcement Learning (RL) as data collection and knowledge inference to dramatically boost RL data efficiency.

by Synced 2021-07-30 2

AI Machine Learning & Data Science Research

Accelerating Quadratic Optimization Up to 3x With Reinforcement Learning

A research team from the University of California, Berkeley, Princeton University and ETH Zurich proposes RLQP, an accelerated QP solver based on operator-splitting QP (OSQP) that uses deep reinforcement learning (RL) to speed up the solver’s convergence rate.

by Synced 2021-06-29 2

AI Machine Learning & Data Science Research

DeepMind & Amii Extend Emphatic Algorithms for Deep RL, Improving Performance on Atari Games

A research team from DeepMind and Amii extends the emphatic method to multi-step deep reinforcement learning (RL) targets, and demonstrates that combining emphatic trace with deep neural networks can improve performance on classic Atari video games.

by Synced 2021-06-16 1

AI Machine Learning & Data Science Research

Bengio Team Proposes Flow Network-Based Generative Models That Learn a Stochastic Policy From a Sequence of Actions

A research team from Mila, McGill University, Université de Montréal, DeepMind and Microsoft proposes GFlowNet, a novel flow network-based generative method that can turn a given positive reward into a generative policy that samples with a probability proportional to the return.
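
The idea of sampling in proportion to reward can be illustrated on a toy tree where the flows are computed exactly instead of being learned. This is only a sketch of the flow intuition, not the GFlowNet training procedure; the tree, the rewards, and the function names are hypothetical.

```python
from fractions import Fraction
from typing import Dict

# Hypothetical toy tree: each action sequence ends in a terminal state with a positive reward.
CHILDREN = {"root": ["a", "b"], "a": ["aa", "ab"], "b": ["ba"]}
REWARD = {"aa": 1, "ab": 3, "ba": 4}


def flow(node: str) -> Fraction:
    """Flow through a node: its reward if terminal, else the sum of its children's flows."""
    if node in REWARD:
        return Fraction(REWARD[node])
    return sum((flow(child) for child in CHILDREN[node]), Fraction(0))


def terminal_probabilities(node: str = "root", prob: Fraction = Fraction(1)) -> Dict[str, Fraction]:
    """Follow the policy P(child | parent) = flow(child) / flow(parent) down the tree
    and return the probability of ending in each terminal state."""
    if node in REWARD:
        return {node: prob}
    total = flow(node)
    result: Dict[str, Fraction] = {}
    for child in CHILDREN[node]:
        result.update(terminal_probabilities(child, prob * flow(child) / total))
    return result


if __name__ == "__main__":
    for state, p in terminal_probabilities().items():
        print(state, p)
```

Because each step splits probability in proportion to downstream flow, every terminal state is reached with probability equal to its reward divided by the total reward.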

by Synced 2021-06-11 3

AI Machine Learning & Data Science Popular Research

Yoshua Bengio Team Designs Consciousness-Inspired Planning Agent for Model-Based RL

A research team from McGill University, Université de Montréal, DeepMind and Mila presents an end-to-end, model-based deep reinforcement learning (RL) agent that dynamically attends to relevant parts of its environments to facilitate out-of-distribution (OOD) and systematic generalization.

by Synced 2021-06-09 2

AI Machine Learning & Data Science Research

Pieter Abbeel Team’s Decision Transformer Abstracts RL as Sequence Modelling

A research team from UC Berkeley, Facebook AI Research and Google Brain abstracts Reinforcement Learning (RL) as a sequence modelling problem. Their proposed Decision Transformer simply outputs optimal actions by leveraging a causally masked transformer, yet matches or exceeds state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
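
Treating RL as sequence modelling means turning a trajectory into a token sequence. In the Decision Transformer setup, each timestep is conditioned on a return-to-go (the sum of future rewards) alongside the state and action. The sketch below shows only that data preparation step; the function names and tuple layout are assumptions for illustration, and no transformer is involved.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """For each timestep t, the undiscounted sum of rewards from t to the end of the episode."""
    result: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running += reward
        result.append(running)
    return result[::-1]


def interleave_trajectory(
    rewards: Sequence[float], states: Sequence[Any], actions: Sequence[Any]
) -> List[Tuple[str, Any]]:
    """Build the (return-to-go, state, action) token order that a causally masked
    sequence model would read, one triple per timestep."""
    sequence: List[Tuple[str, Any]] = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("return_to_go", rtg), ("state", state), ("action", action)])
    return sequence


if __name__ == "__main__":
    # Hypothetical three-step episode.
    print(returns_to_go([1.0, 0.0, 2.0]))
    print(interleave_trajectory([1.0, 0.0, 2.0], ["s0", "s1", "s2"], ["left", "right", "left"]))
```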

by Synced 2021-06-08 2

AI Machine Learning & Data Science Research

What Matters in Adversarial Imitation Learning? Google Brain Study Reveals Valuable Insights

A research team from Google Brain conducts a comprehensive empirical study on more than fifty choices in a generic adversarial imitation learning framework and explores their impacts on large-scale (>500k trained agents) continuous-control tasks to provide practical insights and recommendations for designing novel and effective AIL algorithms.

by Synced 2021-05-25 2

AI Machine Learning & Data Science Research

Yoshua Bengio Team’s Recurrent Independent Mechanisms Endow RL Agents With Out-of-Distribution Adaptation and Generalization Abilities

A research team from the University of Montreal and Max Planck Institute for Intelligent Systems constructs a reinforcement learning agent whose knowledge and reward function can be reused across tasks, along with an attention mechanism that dynamically selects unchangeable knowledge pieces to enable out-of-distribution adaptation and generalization.

by Synced 2021-05-07 4

AI Machine Learning & Data Science Research

MIT & IBM ‘Curiosity’ Framework Explores Embodied Environments to Learn Task-Agnostic Visual Representations

A research team from MIT and MIT-IBM Watson AI Lab proposes Curious Representation Learning (CRL), a framework that learns to understand the surrounding environment by training a reinforcement learning (RL) agent to maximize the error of a representation learner to gain an incentive to explore the environment.

by Synced 2021-04-21 3

AI Machine Learning & Data Science Popular Research

Pieter Abbeel Team Proposes Task-Agnostic RL Method to Auto-Tune Simulations to the Real World

A research team from UC Berkeley and Carnegie Mellon University proposes a task-agnostic reinforcement learning method that reduces the task-specific engineering required for domain randomization of both visual and dynamics parameters.