“Rainbow: Combining Improvements in Deep Reinforcement Learning”, Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver, 2017-10-06:

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. However, it is unclear which of these extensions are complementary and can be fruitfully combined.

This paper examines 6 extensions to the DQN algorithm [double Q-learning (DDQN), prioritized replay, dueling networks, multi-step learning/n-step returns, distributional RL, Noisy Nets] and empirically studies their combination.
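To make one of the listed extensions concrete, here is a minimal sketch of the truncated n-step return used by the multi-step learning component; the function name and interface are illustrative, not the authors' implementation:

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Compute G_t = sum_{k=0}^{n-1} gamma^k * r_{t+k} + gamma^n * V(s_{t+n}).

    rewards: list of the n rewards observed from time t onward.
    bootstrap_value: the value estimate V(s_{t+n}) at the truncation state.
    """
    g = bootstrap_value
    # Fold backwards: each step discounts everything accumulated so far.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Example: 3-step return with rewards [1, 0, 1], bootstrap value 10, gamma 0.9
# G = 1 + 0.9*0 + 0.81*1 + 0.729*10 = 9.1
print(n_step_return([1.0, 0.0, 1.0], 10.0, gamma=0.9))  # → 9.1
```

In Rainbow this n-step target replaces the one-step TD target inside the distributional loss, which the paper's ablations identify as one of the most impactful components.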

Our experiments show that the combination provides state-of-the-art performance on the ALE benchmark, both in terms of data efficiency and final performance.

We also provide results from a detailed ablation study that shows the contribution of each component to overall performance.