See Also
Links
- “Job Hunt as a PhD in RL: How it Actually Happens § Reinforcement learning reflections”, 2022-07-05
- “Large-Scale Retrieval for Reinforcement Learning”, 2022-06-10
- “Boosting Search Engines with Interactive Agents”, 2022-06-04
- “Stochastic MuZero: Planning in Stochastic Environments with a Learned Model”, 2022-03-15
- “Policy improvement by planning with Gumbel”, 2022-03-04
- “MuZero with Self-competition for Rate Control in VP9 Video Compression”, 2022-02-14
- “Procedural Generalization by Planning with Self-Supervised World Models”, 2021-11-02
- “Mastering Atari Games with Limited Data”, 2021-10-30
- “Proper Value Equivalence”, 2021-06-18
- “Vector Quantized Models for Planning”, 2021-06-08
- “Learning and Planning in Complex Action Spaces”, 2021-04-13
- “MuZero Unplugged: Online and Offline Reinforcement Learning by Planning with a Learned Model”, 2021-04-13
- “Podracer architectures for scalable Reinforcement Learning”, 2021-04-13
- “Muesli: Combining Improvements in Policy Optimization”, 2021-04-13
- “Scaling Scaling Laws with Board Games”, 2021-04-07
- “Playing Nondeterministic Games through Planning with a Learned Model”, 2021-03-05
- “Visualizing MuZero Models”, 2021-02-25
- “Combining Off and On-Policy Training in Model-Based Reinforcement Learning”, 2021-02-24
- “Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision”, 2021-02-10
- “On the role of planning in model-based deep reinforcement learning”, 2020-11-08
- “The Value Equivalence Principle for Model-Based Reinforcement Learning”, 2020-11-06
- “Measuring Progress in Deep Reinforcement Learning Sample Efficiency”, 2020-09-28
- “Monte-Carlo Tree Search as Regularized Policy Optimization”, 2020-07-24
- “Continuous Control for Searching and Planning with a Learned Model”, 2020-06-12
- “Agent57: Outperforming the human Atari benchmark”, 2020-03-31
- “MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model”, 2019-11-19
- “Surprising Negative Results for Generative Adversarial Tree Search”, 2018-06-15
- “TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning”, 2017-10-31
Wikipedia
Miscellaneous
Link Bibliography
- https://arxiv.org/abs/2206.05314#deepmind: “Large-Scale Retrieval for Reinforcement Learning”, Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent Sifre, Théophane Weber, Timothy Lillicrap
- https://openreview.net/forum?id=0ZbPmmB61g#google: “Boosting Search Engines With Interactive Agents”
- https://openreview.net/forum?id=bERaNdoegnO#deepmind: “Policy Improvement by Planning With Gumbel”, Ivo Danihelka, Arthur Guez, Julian Schrittwieser, David Silver
- https://arxiv.org/abs/2111.01587#deepmind: “Procedural Generalization by Planning With Self-Supervised World Models”
- https://arxiv.org/abs/2106.10316#deepmind: “Proper Value Equivalence”, Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh
- https://arxiv.org/abs/2104.06294#deepmind: “MuZero Unplugged: Online and Offline Reinforcement Learning by Planning With a Learned Model”, Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver
- https://arxiv.org/abs/2104.06272#deepmind: “Podracer Architectures for Scalable Reinforcement Learning”, Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt
- https://arxiv.org/abs/2102.12924: “Visualizing MuZero Models”, Joery A. de Vries, Ken S. Voskuil, Thomas M. Moerland, Aske Plaat
- https://arxiv.org/abs/2011.03506#deepmind: “The Value Equivalence Principle for Model-Based Reinforcement Learning”, Christopher Grimm, André Barreto, Satinder Singh, David Silver
- https://arxiv.org/abs/2006.07430: “Continuous Control for Searching and Planning With a Learned Model”, Xuxi Yang, Werner Duvaud, Peng Wei
- https://www.deepmind.com/blog/agent57-outperforming-the-human-atari-benchmark: “Agent57: Outperforming the Human Atari Benchmark”, Adrià Puigdomènech, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Charles Blundell
- https://arxiv.org/abs/1710.11417: “TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning”, Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson