Simulated robots can now spin-kick like a karate expert or backflip like an acrobat. The Berkeley Artificial Intelligence Research (BAIR) Lab yesterday proposed DeepMimic, a Reinforcement Learning (RL) technique that enables simulated characters to reproduce highly dynamic physical movements learned from motion data captured from human subjects. BAIR is a top-tier research lab focused on computer vision, machine learning, natural language processing, and robotics.
RL methods have been shown to be applicable to a diverse suite of robotic tasks, particularly motion control problems. A typical RL setup includes a policy function that maps what a machine observes to the actions it can take, and a reward function that returns a low or high reward each time the machine takes an action. Machines can teach themselves skills by adjusting the policy to maximise this reward. The landmark Go program AlphaGo, produced by DeepMind, is grounded in the same technique.
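The policy-and-reward loop described above can be sketched with a toy tabular example. This is a minimal illustration of the general RL recipe, not BAIR's implementation; the environment (a five-state line with a goal at one end), the reward values, and all function names are hypothetical.

```python
import random

random.seed(0)

STATES = range(5)      # toy 1-D world; state 4 is the goal
ACTIONS = (-1, +1)     # step left or step right

def reward(state, action):
    """Score each action: high reward for reaching the goal, small penalty otherwise."""
    return 10.0 if state + action == 4 else -1.0

# Tabular action values, learned from experience (Q-learning style).
q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    for _ in range(episodes):
        s = 0
        while s != 4:
            # Policy: mostly pick the highest-valued action, sometimes explore.
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            r = reward(s, a)
            s2 = min(max(s + a, 0), 4)
            # Value update: nudge Q toward reward plus discounted future value.
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
            s = s2

train()
# The learned greedy policy: the best action from each non-goal state.
policy = {s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in STATES if s != 4}
```

After training, the greedy policy steps right from every state, since only rightward moves lead toward the high-reward goal.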
However, virtual characters trained with deep RL can exhibit abnormal behaviours such as jittering, asymmetric gaits, or excessive movement of limbs.
BAIR’s new paper DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills introduces a framework that trains control policies to perform challenging skills such as locomotion, acrobatics, martial arts, and dancing by imitating reference motion clips.
BAIR next initialises the character to a state sampled randomly from the reference motion, a method known as Reference State Initialization (RSI). The character can thus learn a skill from any point in the motion, such as the inflection point of a flip, and RSI lets the character discover which states yield high rewards even before it has acquired the proficiency to reach those states.
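A minimal sketch of the RSI idea, under hypothetical names: instead of always starting an episode at the beginning of the reference clip, the reset samples the starting frame uniformly from the whole clip, so the character also practises from hard intermediate states such as mid-flip. The `reference_motion` list and its placeholder poses stand in for full character states.

```python
import random

random.seed(0)

# A reference motion clip as (phase, pose) pairs; the pose strings are
# placeholders standing in for full joint configurations.
reference_motion = [(t / 10.0, f"pose_{t}") for t in range(10)]

def reset_with_rsi(clip):
    """Reference State Initialization: start from a random frame of the clip."""
    phase, pose = random.choice(clip)
    return phase, pose

phase, pose = reset_with_rsi(reference_motion)
```

Without RSI, every episode would begin at phase 0.0 and the character would rarely reach, let alone practise, the later states of the motion.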
By combining RSI with Early Termination (ET), a standard practice among RL researchers of stopping simulations that have led to failure, BAIR researchers ensured that a substantial proportion of the dataset consists of samples close to the reference trajectory. Without ET, the character may flail or fall, but will not learn to flip.
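Early Termination can be sketched as a check inside the rollout loop: as soon as the character enters a failure state, the episode ends, so failed trajectories do not flood the training data with useless samples. Everything here is a hypothetical toy, including the integer `torso_height` failure test and the sinking dynamics of `toy_step`.

```python
def fallen(state):
    """Toy failure check: the character has fallen if the torso is too low.

    Heights are integers (e.g. decimetres) to keep the toy deterministic.
    """
    return state["torso_height"] < 3

def rollout(env_step, init_state, max_steps=200):
    """Collect one episode of (state, next_state) samples, stopping early on failure."""
    samples, state = [], init_state
    for _ in range(max_steps):
        if fallen(state):   # early termination: discard the rest of the episode
            break
        next_state = env_step(state)
        samples.append((state, next_state))
        state = next_state
    return samples

# Toy dynamics: the character sinks one unit each step until it has fallen.
def toy_step(state):
    return {"torso_height": state["torso_height"] - 1}

episode = rollout(toy_step, {"torso_height": 10})
```

Here the rollout stops as soon as the torso drops below the threshold, well before `max_steps`, mirroring how ET keeps doomed simulations from running to completion.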
The research shows that characters can learn over 24 skills, with movements nearly indistinguishable from those of the human reference subjects. BAIR also says its technique is simpler and produces better results than the current leading motion imitation method, Generative Adversarial Imitation Learning (GAIL).
BAIR hopes the new research will facilitate the development of more dynamic motor skills for both simulated characters and robots in the real world.
Author: Paul Fan | Editor: Tony Peng, Michael Sarazen