Stanford University researchers have proposed DERL (Deep Evolutionary Reinforcement Learning), a novel computational framework that enables AI agents to evolve morphologies and learn challenging locomotion and manipulation tasks in complex environments using only low-level egocentric sensory information. The team says DERL is the first demonstration of the Darwinian Baldwin Effect realized via morphological learning.
In 1953, US paleontologist George Gaylord Simpson coined the term “Baldwin Effect” in reference to the American philosopher and psychologist J.M. Baldwin’s 1896 paper A New Factor in Evolution. In evolutionary biology, the Baldwin effect proposes that behaviours individuals learn over their lifetimes in early generations of an evolutionary process will, through selection, gradually become instinctive and eventually genetically encoded in later generations.
Previous studies on learning and evolutionary processes in complex environments with a diversity of morphological forms have identified many aspects of animal intelligence that are embodied in these evolved morphologies. Until now, however, no studies have demonstrated the Baldwin effect in morphological evolution either in vivo (living organisms) or in silico (computer modelling or simulations).
Fei-Fei Li, Stanford computer science professor and co-director of Stanford’s Human-Centered AI Institute (HAI), co-authored the paper Embodied Intelligence via Learning and Evolution. “Really excited by this joint work with [Agrim Gupta, Silvio Savarese and Surya Ganguli] – meet DERL (Deep Evolutionary RL) & 1st demonstration of a Darwinian Baldwin Effect via morphological learning, an essential trick of Nature for animal evolution, now shown in our AI agents,” Li tweeted on the paper’s release.

The researchers identify the combinatorially large number of possible morphologies and the computational time required to evaluate fitness through lifetime learning as the major challenges they faced in creating their embodied AI agents.
Unlike previous work that focused on identifying evolved agents in limited morphological search spaces or finding optimal parameters for a fixed hand-designed morphology, DERL is a computational framework that enables researchers to simultaneously scale the creation of embodied agents across three types of complexity: environmental, morphological, and control. The team built UNIMAL (UNIversal aniMAL), a design space that yields highly expressive, useful, and controllable agent morphologies, and analyzed the resulting embodied agents in three environments: hills, steps and rubble. A minimal sketch of the resulting outer loop, evolution over morphologies with fitness scored by lifetime learning, appears below.
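The idea at the heart of this setup is an outer evolutionary loop whose fitness signal is whatever a body can learn within one lifetime. The following Python sketch is only an illustration of that structure, not the authors' implementation: the morphology encoding, the mutation operator, and the placeholder lifetime_learning function are all assumptions standing in for the UNIMAL design space and the actual RL training runs.

```python
# Minimal sketch of a DERL-style evolution loop (all names and details are
# illustrative assumptions, not the authors' actual implementation).
import random

POPULATION_SIZE = 24
GENERATIONS = 10

def random_unimal_morphology():
    """Stand-in for sampling a body plan from a UNIMAL-like design space."""
    return {"limbs": random.randint(2, 8), "limb_length": random.uniform(0.2, 0.6)}

def mutate(morphology):
    """Apply a small random change to a parent body plan (illustrative only)."""
    child = dict(morphology)
    child["limbs"] = max(2, child["limbs"] + random.choice([-1, 0, 1]))
    return child

def lifetime_learning(morphology, env_name):
    """Train a controller from low-level egocentric observations and return
    the final task reward; here just a random placeholder instead of a real
    RL training run."""
    return random.random()

population = [random_unimal_morphology() for _ in range(POPULATION_SIZE)]
for generation in range(GENERATIONS):
    # Fitness of a morphology is what its controller manages to learn
    # within a single lifetime of training.
    fitness = [lifetime_learning(m, "hills") for m in population]
    ranked = [m for _, m in sorted(zip(fitness, population), key=lambda p: -p[0])]
    survivors = ranked[: POPULATION_SIZE // 2]
    # Refill the population with mutated children of the survivors.
    population = survivors + [
        mutate(random.choice(survivors))
        for _ in range(POPULATION_SIZE - len(survivors))
    ]
```

In DERL itself the lifetime-learning step is a full reinforcement learning run from low-level egocentric observations, which is precisely why evaluating fitness at scale is so computationally expensive.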
The team says DERL demonstrates several relations between environmental complexity, morphological intelligence and the learnability of control:
- Environmental complexity fosters the evolution of morphological intelligence as quantified by the ability of a morphology to facilitate the learning of novel tasks.
- Evolution rapidly selects morphologies that learn faster, thereby enabling behaviours learned late in the lifetime of early ancestors to be expressed early in the lifetime of their descendants (a toy illustration of this learning-speed comparison appears after this list).
- Experiments suggest a mechanistic basis for both the Baldwin effect and the emergence of morphological intelligence through the evolution of morphologies that are more physically stable and energy efficient, and can facilitate learning and control.
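One simple way to picture the Baldwin-effect signature described above is to ask how many training steps a morphology needs before its controller crosses a fixed reward threshold, and to compare that number between early and late generations. The toy sketch below uses synthetic learning curves and a hypothetical steps_to_threshold helper; it only illustrates the measurement and does not reproduce the paper's analysis.

```python
# Hedged sketch of a learning-speed comparison across generations.
# The learning curves here are synthetic; in practice they would come
# from the RL training logs of evolved agents.
import numpy as np

def steps_to_threshold(reward_curve, threshold):
    """Return the first training step at which the lifetime reward curve
    crosses `threshold`, or the curve length if it never does."""
    above = np.nonzero(reward_curve >= threshold)[0]
    return int(above[0]) if above.size else len(reward_curve)

steps = np.arange(1000)
early_gen_curve = 1.0 - np.exp(-steps / 600.0)  # an early ancestor learns slowly
late_gen_curve = 1.0 - np.exp(-steps / 150.0)   # a descendant learns the same skill faster

threshold = 0.8
print("early generation:", steps_to_threshold(early_gen_curve, threshold), "steps")
print("late generation: ", steps_to_threshold(late_gen_curve, threshold), "steps")
```

If later generations consistently cross the threshold earlier in their lifetimes, behaviours that ancestors had to learn late are effectively being expressed early by their descendants, which is the Baldwin-effect pattern the team reports.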

The team hopes DERL’s large-scale simulations can encourage further scientific exploration of learning and evolution that could lead to rapidly learnable intelligent behaviours in RL agents.
The paper Embodied Intelligence via Learning and Evolution is available on arXiv.
Reporter: Fangyu Cai | Editor: Michael Sarazen
