Transformers Meet Online RL: New Study Unifies Offline Pretraining and Online Finetuning, Achieves SOTA Results
A team from Facebook AI Research, UC Berkeley and UCLA proposes Online Decision Transformers (ODT), an RL algorithm based on sequence modelling that incorporates offline pretraining and online finetuning in a unified framework and achieves performance competitive with the state-of-the-art models on the D4RL benchmark.