Bibliography (8):
Proximal Policy Optimization Algorithms
https://github.com/ml-jku/rudder
https://www.youtube.com/playlist?list=PLDfrC-Vpg-CzVTqSjxVeLQZy3f7iv9vyY
Wikipedia Bibliography:
Reinforcement learning
Markov decision process
Variance
Monte Carlo method
Monte Carlo tree search