Stanford University researchers have proposed DERL (Deep Evolutionary Reinforcement Learning), a novel computational framework that enables AI agents to evolve morphologies and learn challenging locomotion and manipulation tasks in complex environments using only low-level egocentric sensory information. The team says DERL is the first demonstration of a Darwinian Baldwin effect realized via morphological learning.



In 1953, US paleontologist George Gaylord Simpson coined the term “Baldwin effect” in reference to the American philosopher and psychologist J.M. Baldwin’s 1896 paper A New Factor in Evolution. In evolutionary biology, the Baldwin effect proposes that behaviours initially learned over a lifetime in early generations of an evolutionary process will gradually become instinctual and potentially even genetically transmitted to later generations.



Previous studies of learning and evolution in complex environments with diverse morphological forms have shown that many aspects of animal intelligence are embodied in evolved morphologies. According to the researchers, however, no prior study had demonstrated the Baldwin effect in morphological evolution, either in vivo (in living organisms) or in silico (in computer models or simulations).



Fei-Fei Li, Stanford computer science professor and co-director of Stanford’s Human-Centered AI Institute (HAI), co-authored the paper Embodied Intelligence via Learning and Evolution. “Really excited by this joint work with [Agrim Gupta, Silvio Savarese and Surya Ganguli] – meet DERL (Deep Evolutionary RL) & 1st demonstration of a Darwinian Baldwin Effect via morphological learning, an essential trick of Nature for animal evolution, now shown in our AI agents,” Li tweeted on the paper’s release.

The researchers identify two major challenges in creating their embodied AI agents: the combinatorially large number of possible morphologies, and the computational time required to evaluate each morphology's fitness through lifetime learning.



Unlike previous work that focused on identifying evolved agents in limited morphological search spaces or finding optimal parameters based on a fixed hand-designed morphology, DERL is a computational framework that enables researchers to simultaneously scale the creation of embodied agents across three types of complexity: environmental, morphological, and control. The team built UNIMAL (UNIversal aniMAL), a design space that is highly expressive yet favours useful, controllable agent morphologies, and analyzed the resulting embodied agents in three environments: flat terrain, variable terrain (with hills, steps and rubble), and manipulation in variable terrain.
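The nesting of an outer evolutionary search around an inner lifetime-learning loop can be illustrated with a toy sketch. This is not the authors' implementation: the morphology is reduced to a list of three floats, `lifetime_learning` is a hypothetical hill-climbing stand-in for reinforcement learning, and the tournament selection with removal of the oldest individual is an assumption chosen for illustration.

```python
import random


def lifetime_learning(morphology, steps=20, rng=None):
    """Inner loop: toy stand-in for RL (hypothetical, not DERL's learner).

    Hill-climbs a scalar controller gain and returns the best reward found.
    """
    rng = rng or random.Random(0)
    target = sum(morphology)  # assumption: the ideal gain depends on the body
    gain = 0.0
    best = -abs(gain - target)
    for _ in range(steps):
        candidate = gain + rng.uniform(-0.5, 0.5)
        reward = -abs(candidate - target)
        if reward > best:  # keep only improvements
            gain, best = candidate, reward
    return best


def mutate(morphology, rng):
    """Perturb one randomly chosen morphology parameter."""
    child = list(morphology)
    index = rng.randrange(len(child))
    child[index] += rng.gauss(0, 0.2)
    return child


def evolve(pop_size=16, generations=30, tournament=4, seed=0):
    """Outer loop: tournament selection where fitness is the learned reward."""
    rng = random.Random(seed)
    population = []
    for _ in range(pop_size):
        morphology = [rng.uniform(-1, 1) for _ in range(3)]
        population.append((morphology, lifetime_learning(morphology, rng=rng)))
    for _ in range(generations * pop_size):
        contenders = rng.sample(population, tournament)
        parent = max(contenders, key=lambda pair: pair[1])
        child = mutate(parent[0], rng)
        # Fitness is only known after the child has "lived" and learned.
        population.append((child, lifetime_learning(child, rng=rng)))
        population.pop(0)  # drop the oldest individual to keep the size fixed
    return population


if __name__ == "__main__":
    final = evolve()
    print(max(fitness for _, fitness in final))
```

Even in this toy form, every fitness evaluation requires a full run of the inner learning loop, which is why evaluation cost dominates when the learner is a real reinforcement learning algorithm.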



The team says DERL demonstrates several relations between environmental complexity, morphological intelligence and the learnability of control:

Environmental complexity fosters the evolution of morphological intelligence, as quantified by the ability of a morphology to facilitate the learning of novel tasks.

Evolution rapidly selects morphologies that learn faster, thereby enabling behaviours learned late in the lifetime of early ancestors to be expressed early in the lifetime of their descendants.

Experiments suggest a mechanistic basis for both the Baldwin effect and the emergence of morphological intelligence through the evolution of morphologies that are more physically stable and energy efficient, and can facilitate learning and control.
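The second finding, that selection favours faster learners so that behaviours appear earlier in the lifetime of descendants, can be illustrated with a toy model. This is only a sketch under strong assumptions: a single hypothetical trait, `learning_rate`, stands in for how learnable a morphology is, and fitness is the reward reached after a fixed learning budget. None of these names come from the paper.

```python
import random


def steps_to_threshold(learning_rate, threshold=0.05, max_steps=1000):
    """Count learning steps until the error falls below the threshold."""
    error = 1.0
    for step in range(max_steps):
        if abs(error) < threshold:
            return step
        error *= 1.0 - learning_rate  # each step removes a fixed fraction
    return max_steps


def reward_after(learning_rate, budget=10):
    """Fitness: negative error remaining after a fixed lifetime budget."""
    error = 1.0
    for _ in range(budget):
        error *= 1.0 - learning_rate
    return -error


def evolve_learnability(pop_size=20, generations=15, seed=0):
    """Truncation selection on learned reward; track the fastest learner."""
    rng = random.Random(seed)
    population = [rng.uniform(0.01, 0.2) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        history.append(min(steps_to_threshold(lr) for lr in population))
        ranked = sorted(population, key=reward_after, reverse=True)
        survivors = ranked[: pop_size // 2]
        # Children inherit the trait with noise, clamped to a valid range.
        children = [
            min(0.9, max(0.01, lr + rng.gauss(0, 0.05))) for lr in survivors
        ]
        population = survivors + children
    return history


if __name__ == "__main__":
    print(evolve_learnability())
```

Selection here acts only on the reward reached within the lifetime budget, never on learning speed directly, yet the number of steps the fastest learner needs to reach the threshold cannot increase across generations because the best individuals are carried over unchanged.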

[Figure: Evolutionary dynamics in multiple environments]

The team hopes DERL’s large-scale simulations will encourage further scientific exploration of learning and evolution, which could lead to rapidly learnable intelligent behaviours in RL agents.



The paper Embodied Intelligence via Learning and Evolution is available on arXiv.

Reporter: Fangyu Cai | Editor: Michael Sarazen

