“Dota 2 With Large Scale Deep Reinforcement Learning”, Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique P. d. O. Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang, Filip Wolski, Susan Zhang (2019-12-13):

On April 13, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game.

The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems.

OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of ~2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months.

By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.