“Dota 2 With Large Scale Deep Reinforcement Learning”, 2019-12-13:
On April 13, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game.
The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems.
OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of ~2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months.
By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
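
To put the batch figures quoted above in perspective, here is a rough back-of-the-envelope sketch of the implied data throughput. Only the two constants come from the abstract; the function names and the assumption of a constant, uninterrupted rate are illustrative and not taken from the paper.

```python
# Rough throughput implied by the abstract's batch figures.
# Only these two constants come from the abstract.
FRAMES_PER_BATCH = 2_000_000  # "~2 million frames" per batch
SECONDS_PER_BATCH = 2         # one batch "every 2 seconds"

SECONDS_PER_DAY = 24 * 60 * 60


def frames_per_second(frames_per_batch: int, seconds_per_batch: int) -> float:
    """Average number of frames consumed per second of training."""
    return frames_per_batch / seconds_per_batch


def frames_per_day(fps: float) -> float:
    """Frames consumed in one day.

    Assumption (not from the source): training runs at a constant rate
    with no pauses or restarts.
    """
    return fps * SECONDS_PER_DAY


fps = frames_per_second(FRAMES_PER_BATCH, SECONDS_PER_BATCH)
print(f"~{fps:,.0f} frames per second")
print(f"~{frames_per_day(fps):,.0f} frames per day at a constant rate")
```

Since the quoted figures are approximate and the real training rate likely varied over the 10 months, these numbers are best read as an order-of-magnitude estimate.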