Bibliography (3):

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
AIXI (LessWrong tag page): https://www.lesswrong.com/tag/aixi