In a new paper General Part Assembly Planning, a research team from Columbia University and Google DeepMind introduces General Part Assembly Transformer (GPAT), a transformer-based model for assembly planning that has strong generalization capability to automatically estimate novel and diverse target and part shapes.
In a new paper Language to Rewards for Robotic Skill Synthesis, a Google DeepMind research team proposes a new paradigm to leverage reward functions to interface language and low-level robot actions, which enables non-technical users to steer novel and intricate robot actions without large amount of data or expert knowledge to engineer low-level primitives.
In the new paper DayDreamer: World Models for Physical Robot Learning, researchers from the University of California, Berkeley leverage recent advances in the Dreamer world model to enable online reinforcement learning for robot training without simulators or demonstrations, establishing a strong baseline for efficient real-world robotic learning.
In the new paper Large-Scale Retrieval for Reinforcement Learning, a DeepMind research team dramatically expands the information accessible to reinforcement learning (RL) agents, enabling them to attend to tens of millions of information pieces, incorporate new information without retraining, and learn decision making in an end-to-end manner.
In the new paper Factory: Fast Contact for Robotic Assembly, a research team from NVIDIA and the University of Washington introduces Factory, a set of physics simulation methods and robot learning tools for simulating contact-rich interactions in assembly with high accuracy, efficiency, and robustness.
A Nvidia research team presents Isaac Gym — a high-performance robotics simulation platform that runs an end-to-end GPU accelerated training pipeline. Compared to conventional RL training methods that use a CPU-based simulator and GPU for neural networks, Isaac Gym achieves training speedups of 2-3 orders of magnitude on continuous control tasks.
Researchers from The Chinese University of Hong Kong, Facebook Reality Labs, and Facebook AI Research have unveiled a state-of-the-art monocular 3D hand motion capture method, FrankMocap, which can estimate both 3D hand and body motions from in-the-wild monocular inputs with faster speed and better accuracy than previous approaches.