Deep reinforcement learning (DRL) has become a hot research area in recent years, achieving superhuman performance on sophisticated decision-making problems that range from strategy and computer games to robot training.
Despite its success, DRL’s scalability remains impeded by two critical bottlenecks: enormous computational requirements and limited simulation speed. One way to mitigate this is to employ hardware accelerators such as GPUs, but this approach has only been applied to simplified physics-based scenarios, not complex robotic environments.
To address these issues, an NVIDIA research team has introduced Isaac Gym, a high-performance robotics simulation platform that runs an end-to-end GPU-accelerated training pipeline. Compared to conventional RL training approaches that use a CPU-based simulator and a GPU for neural networks, Isaac Gym achieves training speedups of 2-3 orders of magnitude on continuous control tasks.
The team summarizes their contributions as:
- Development of a high-fidelity GPU-accelerated robotics simulator for robot learning tasks.
- A Tensor API in Python that provides direct access to physics buffers by wrapping them into PyTorch tensors without going through any CPU bottlenecks.
- Implementation of multiple highly complex robotic manipulation environments which can be simulated at hundreds of thousands of steps per second on a single GPU.
- High-performance training results using Isaac Gym with deep reinforcement learning on challenging robotic environments.
Previous research has shown that running physics simulations on GPUs can result in significant speedups via parallelized computation. Traditional physics engines, however, only use GPUs as a co-processor to accelerate the physics simulation, while the API for obtaining physics states and applying controls remains CPU-based. When observations and rewards are computed on a CPU, the latest physics state must first be transferred from the GPU. This data transfer can incur nontrivial overhead and may not be suitable for large and complex simulations.
The proposed Isaac Gym leverages NVIDIA PhysX system software to provide a GPU-accelerated simulation back-end. This means that stepping physics, computing observations and rewards, and applying actions are all performed on the GPU, eliminating the need to copy large quantities of data between devices.
Isaac Gym also provides a data abstraction layer over the physics engine to support multiple physics engines with a shared front-end API. Users can also access all of the physics data in flat buffers, eliminating the significant overhead produced by looping over tens of thousands of individual simulation actors.
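The benefit of flat buffers over per-actor loops can be illustrated with a small sketch. This is not Isaac Gym's API: NumPy arrays stand in for the GPU tensors, and the actor count, the 13-value state layout, and the reward formula are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for this sketch.
NUM_ACTORS = 4096
STATE_DIM = 13  # assumed layout: position (3), quaternion (4), linear vel (3), angular vel (3)

rng = np.random.default_rng(0)

# One flat buffer holding every actor's state back to back.
flat_state = rng.standard_normal(NUM_ACTORS * STATE_DIM).astype(np.float32)

# Reshaping gives a view onto the same memory: row i is actor i's state.
states = flat_state.reshape(NUM_ACTORS, STATE_DIM)


def rewards_looped(states):
    """Per-actor loop: one Python-level iteration for each actor."""
    out = np.empty(len(states), dtype=np.float32)
    for i, s in enumerate(states):
        height = s[2]                           # z position
        speed = np.sqrt(np.sum(s[7:10] ** 2))   # linear speed
        out[i] = height - 0.1 * speed           # made-up reward
    return out


def rewards_batched(states):
    """Batched: one array expression covers all actors at once."""
    height = states[:, 2]
    speed = np.linalg.norm(states[:, 7:10], axis=1)
    return height - 0.1 * speed


looped = rewards_looped(states)
batched = rewards_batched(states)
```

Both functions produce the same rewards, but the batched version issues a handful of array operations regardless of how many actors there are, which is the access pattern that maps well onto a GPU.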
To demonstrate Isaac Gym’s policy training performance on a single GPU, the team benchmarked on eight different environments with a wide range of complexity: Ant, Humanoid, Franka-cube-stack, Ingenuity, Shadow Hand, ANYmal, Allegro, and TriFinger.
The team summarizes the results of their evaluations as:
- We achieve significant speedups in training various simulated environments: Ant and Humanoid environments can achieve performant locomotion in 20 seconds and 4 minutes respectively, ANYmal in under 2 minutes, Humanoid character animation using AMP in 6 minutes and cube rotation with Shadow Hand in 35 minutes all on a single NVIDIA A100 GPU.
- Additionally, we reproduce OpenAI Shadow Hand cube training setup with asymmetric actor-critic and domain randomization. We show that we can achieve similar performance to OpenAI results of 20 consecutive successes with feed forward and 37 consecutive successes with LSTM networks with a success tolerance of 0.4 rad in about 1 hour and 6 hours on an average respectively on A100. In contrast, the OpenAI effort required 30 hours and 17 hours respectively on a combination of a CPU cluster (384 CPUs with 16 cores each) and 8 NVIDIA V100 GPUs with MuJoCo using a conventional RL training setup. It is worth mentioning that since OpenAI shows results with only 1 seed, comparing our best seed we find that we achieve 37 consecutive successes with LSTMs in just 2.5 hours.
- We also demonstrate sim-to-real transfer results on ANYmal and TriFinger, which further showcases the ability of our simulator to perform high-fidelity contact rich manipulation.
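As a rough reading of the Shadow Hand numbers above, the wall-clock ratios can be worked out directly from the reported hours. This is a minimal sketch: the dictionary keys and helper name are illustrative, the Isaac Gym times are the approximate figures quoted above, and the ratios compare wall-clock time only, since the two setups ran on very different hardware (a single A100 versus a CPU cluster plus 8 V100 GPUs).

```python
# Wall-clock training hours as reported in the summary above.
reported_hours = {
    "feed-forward": {"openai": 30.0, "isaac_gym": 1.0},
    "LSTM": {"openai": 17.0, "isaac_gym": 6.0},
    "LSTM (best seed)": {"openai": 17.0, "isaac_gym": 2.5},
}


def wall_clock_ratio(baseline_hours, new_hours):
    """How many times shorter the new run is than the baseline."""
    return baseline_hours / new_hours


ratios = {
    name: wall_clock_ratio(hours["openai"], hours["isaac_gym"])
    for name, hours in reported_hours.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
# feed-forward: 30.0x
# LSTM: 2.8x
# LSTM (best seed): 6.8x
```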
The paper Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang