Reinforcement Learning (RL) plays a central role in developing Artificial Intelligence (AI) agents that can make smart decisions based on experience. Current treatments of RL, however, are largely restricted to agents that learn to solve a given problem, rather than agents that keep learning forever.
In the new paper A Definition of Continual Reinforcement Learning, a DeepMind research team rethinks RL problems as problems of endless adaptation and provides a clean, general, and precise mathematical definition of continual reinforcement learning (CRL), aiming to promote CRL research from a solid conceptual foundation.

The team starts by defining environments, agents, and related artifacts. They treat an agent-environment interface as a pair of countable sets of actions and observations; histories are sequences of action-observation pairs that represent the possible interactions between an agent and an environment. Both the environment and the agent can then be defined as functions with respect to this agent-environment interface.
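In symbols, the setup can be sketched roughly as follows. This is our paraphrase of the definitions described above, not the paper's exact notation:

```latex
% Agent-environment interface: a pair of countable action and
% observation sets (paraphrase, not the paper's exact notation).
\mathcal{A},\ \mathcal{O}

% Histories: all finite sequences of action-observation pairs.
\mathcal{H} \;=\; \bigcup_{t=0}^{\infty} (\mathcal{A} \times \mathcal{O})^{t}

% An agent maps each history to a distribution over next actions;
% an environment maps each history and action to a distribution
% over next observations.
\lambda : \mathcal{H} \to \Delta(\mathcal{A}), \qquad
e : \mathcal{H} \times \mathcal{A} \to \Delta(\mathcal{O})
```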
They give an informal definition of the CRL problem, "An RL problem is an instance of CRL if the best agents never stop learning," and distill two new insights that underpin the core formal definitions:
- We can understand every agent as implicitly searching over a set of behaviors.
- Every agent will either continue this search forever, or eventually stop.
To formalize these two insights, the researchers introduce a pair of operators on agents: 1) an operator by which any set of agents generates another set of agents, and 2) an operator expressing that a given agent reaches an agent set. With these, they define learning as the implicit search process over a set of behaviors, and continual learning as the continuation of this search indefinitely.
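One loose way to write the gist of the "reaches" operator and the resulting notion of continual learning, in our own notation rather than the paper's:

```latex
% "Reaches" (sketch): an agent \lambda reaches a base set \Lambda_B in
% environment e if, almost surely, its behavior eventually coincides
% with that of a single base agent forever.
\lambda \rightsquigarrow \Lambda_B \ \text{in } e
\;\iff\;
\Pr\Big(\exists\, T,\ \exists\, \lambda_B \in \Lambda_B :\
\forall\, t \ge T,\ \lambda(h_t) = \lambda_B(h_t)\Big) = 1

% Continual learning (sketch): \lambda is a continual learner with
% respect to \Lambda_B in e if its implicit search over \Lambda_B
% never terminates, i.e. it does not reach \Lambda_B.
\lambda \ \text{is a continual learner}
\;\iff\; \neg\big(\lambda \rightsquigarrow \Lambda_B\big)
```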

With these premises in place, the team formalizes the intuition of CRL as capturing settings in which the best agents do not converge; intuitively, such agents continue their implicit search over the base behaviors forever. This definition encourages researchers and developers to approach agent design from a new perspective: instead of building agents that aim to solve a problem and then stop, agents that continue to update their behavior indefinitely based on experience are preferred.
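To make the contrast concrete, here is a small, hypothetical Python sketch (not from the paper; the bandit setup, `run`, `drifting_bandit`, and the two schedules are our own illustration) of two agents in a slowly drifting two-armed bandit: one whose exploration and step size decay so that its behavior effectively freezes, and one that keeps updating indefinitely, in the spirit of a continual learner:

```python
# Toy illustration (not from the paper): in a drifting two-armed bandit,
# an agent that stops exploring/updating eventually locks in one behavior,
# while a constant-step-size agent keeps adapting forever.
import random

def drifting_bandit(t):
    """Reward means slowly swap between the two arms over time."""
    phase = (t // 500) % 2                    # which arm is currently better
    means = (0.8, 0.2) if phase == 0 else (0.2, 0.8)
    return [random.gauss(m, 0.1) for m in means]

def run(epsilon_fn, step_fn, steps=2000, seed=0):
    random.seed(seed)
    q = [0.0, 0.0]                            # action-value estimates
    total = 0.0
    for t in range(steps):
        eps = epsilon_fn(t)
        if random.random() < eps:
            a = random.randrange(2)           # explore
        else:
            a = max((0, 1), key=lambda i: q[i])  # exploit current estimates
        r = drifting_bandit(t)[a]
        q[a] += step_fn(t) * (r - q[a])       # incremental value update
        total += r
    return total / steps

# "Convergent" agent: exploration and step size decay toward zero, so its
# behavior eventually freezes -- it implicitly stops its search.
converging = run(lambda t: 1.0 / (1 + t), lambda t: 1.0 / (1 + t))

# "Continual" agent: constant exploration and step size, so it never stops
# updating its behavior in response to experience.
continual = run(lambda t: 0.1, lambda t: 0.1)

print(f"average reward, converging agent: {converging:.3f}")
print(f"average reward, continual agent:  {continual:.3f}")
```

In a drifting environment like this one, the agent that never stops its search can keep tracking the better arm, which is the kind of setting the paper's definition is meant to capture.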
Overall, this work lays a solid foundation for continual reinforcement learning, and the team also offers guidance on designing principled continual learning agents. In future work, they plan to further explore connections between this formalism of continual learning and phenomena observed in recent empirical studies.
The paper A Definition of Continual Reinforcement Learning is on arXiv.
Author: Hecate He | Editor: Chain Zhang
