Bibliography (5):

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
DeepMind Lab

Wikipedia Bibliography:

1. Reinforcement learning
2. Q-learning