Bibliography (8):

  1. Value Iteration Networks

  2. The Predictron: End-To-End Learning and Planning

  3. Value Prediction Network

  4. TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

  5. MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

  6. Proper Value Equivalence