Text games have emerged as a fundamental research paradigm for training and evaluating embodied agents on natural language processing (NLP) tasks. The tooling in this research area, however, is built on legacy codebases that limit games to running at 1 to 300 steps per second, which can mean weeks of experiment time for sample-hungry reinforcement learning or evolutionary agents.
A research team from the University of Arizona and Microsoft Research Montréal addresses this issue in the new paper TextWorldExpress: Simulating Text Games at One Million Steps Per Second, proposing a high-performance text-game simulator that boosts throughput by approximately three orders of magnitude, reaching one million steps per second (SPS).
The team summarizes their main contributions as follows:
- TextWorldExpress is a highly optimized reimplementation of three text game benchmarks focusing on instruction following, commonsense reasoning and object identification.
- We empirically demonstrate that this simulator runs up to three orders of magnitude faster than current tooling, reaching 300k steps per second (SPS) on a single thread and exceeding 1M SPS on modest multi-core desktop hardware.
TextWorldExpress is built on heavily optimized and profiled code that lets it render environments quickly while simultaneously generating an exhaustive set of possible actions for agents, which significantly speeds up simulation. Moreover, TextWorldExpress simulations run entirely on CPU cores, so large-scale experiments do not require expensive multi-GPU nodes.
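To make the idea of returning the full set of possible actions with every step concrete, here is a minimal Python sketch. It does not use the TextWorldExpress API: `ToyCoinGame`, its methods and `measure_sps` are hypothetical names invented for illustration, and the toy game is far simpler than any of the benchmarks. It only shows the interaction pattern (the environment hands back the observation together with every valid action, and a random agent picks one) and one way to measure steps per second on a CPU.

```python
import random
import time


class ToyCoinGame:
    """Hypothetical toy text game; NOT the TextWorldExpress API."""

    def __init__(self, num_rooms=5, step_limit=50):
        self.num_rooms = num_rooms      # rooms laid out in a west-east line
        self.step_limit = step_limit    # episode ends after this many steps

    def reset(self, seed=0):
        rng = random.Random(seed)
        self.position = 0
        self.coin_room = rng.randrange(1, self.num_rooms)
        self.coin_taken = False
        self.steps = 0
        return self._observe(), self.valid_actions()

    def valid_actions(self):
        """Enumerate every action that is legal in the current state."""
        actions = []
        if self.position > 0:
            actions.append("go west")
        if self.position < self.num_rooms - 1:
            actions.append("go east")
        if self.position == self.coin_room and not self.coin_taken:
            actions.append("take coin")
        return actions

    def _observe(self):
        text = f"You are in room {self.position}."
        if self.position == self.coin_room and not self.coin_taken:
            text += " You see a coin."
        return text

    def step(self, action):
        if action not in self.valid_actions():
            raise ValueError(f"invalid action: {action}")
        self.steps += 1
        reward, done = 0.0, False
        if action == "go west":
            self.position -= 1
        elif action == "go east":
            self.position += 1
        else:  # "take coin"
            self.coin_taken = True
            reward, done = 1.0, True
        if self.steps >= self.step_limit:
            done = True
        # The observation and the full valid-action list come back together.
        return self._observe(), reward, done, self.valid_actions()


def measure_sps(total_steps=20000, seed=0):
    """Run a random agent for total_steps and return steps per second."""
    env = ToyCoinGame()
    rng = random.Random(seed)
    _, actions = env.reset(seed)
    episode = 0
    start = time.perf_counter()
    for _ in range(total_steps):
        _, _, done, actions = env.step(rng.choice(actions))
        if done:
            episode += 1
            _, actions = env.reset(seed + episode)
    elapsed = time.perf_counter() - start
    return total_steps / elapsed


if __name__ == "__main__":
    print(f"{measure_sps():,.0f} steps per second (toy game, one thread)")
```

The number this prints depends on the machine and says nothing about TextWorldExpress itself. The point is only that throughput is measured as environment steps divided by wall-clock time, with valid-action generation included in every step.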
In their empirical study, the team compared TextWorldExpress with three popular text-game simulators, TextWorld (Côté et al., 2018), Jericho (Hausknecht et al., 2020), and ScienceWorld (Wang et al., 2022), on tasks from three benchmark games: CookingWorld, TextWorld Commonsense and Coin Collector. The results show that TextWorldExpress can simulate in online generation mode at an average speed of 212k steps per second per thread, almost three orders of magnitude faster than current simulators. On multi-core workstations, TextWorldExpress can reach over one million steps per second.
Overall, the study shows that the proposed TextWorldExpress can significantly reduce experiment runtimes, enabling researchers to conduct billion-step-scale experiments in about one day.
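A quick back-of-the-envelope calculation puts these throughput figures in perspective. The Python sketch below uses only rates quoted in this article (300, 212k and 1M steps per second) and a one-billion-step budget. The function names are illustrative, and the result covers simulator stepping alone: any time the agent spends on inference or learning is not modelled and would come on top.

```python
def simulation_time_seconds(total_steps, steps_per_second):
    """Wall-clock seconds the simulator alone needs for total_steps."""
    return total_steps / steps_per_second


def describe(seconds):
    """Format a duration using the largest sensible unit."""
    if seconds >= 86400:
        return f"{seconds / 86400:.1f} days"
    if seconds >= 3600:
        return f"{seconds / 3600:.1f} hours"
    return f"{seconds / 60:.1f} minutes"


BUDGET = 1_000_000_000  # one billion environment steps

# Throughput figures quoted in the article, in steps per second.
RATES = {
    "legacy tooling (upper end)": 300,
    "TextWorldExpress, one thread (average)": 212_000,
    "TextWorldExpress, multi-core": 1_000_000,
}

if __name__ == "__main__":
    for name, sps in RATES.items():
        print(f"{name}: {describe(simulation_time_seconds(BUDGET, sps))}")
```

At 300 steps per second the simulator alone would need more than five weeks for a billion steps, while at one million steps per second the same budget takes under 17 minutes of simulation time.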
Author: Hecate He | Editor: Michael Sarazen