Bibliography (5):
The Value Equivalence Principle for Model-Based Reinforcement Learning
MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Muesli: Combining Improvements in Policy Optimization
Wikipedia Bibliography:
Reinforcement learning
Loss function