- See Also
-
Links
- “AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning”, Mathieu et al 2023
- “Job Hunt As a PhD in RL: How It Actually Happens § Reinforcement Learning Reflections”, Lambert 2022
- “Large-Scale Retrieval for Reinforcement Learning”, Humphreys et al 2022
- “Boosting Search Engines With Interactive Agents”, Ciaramita et al 2022
- “Stochastic MuZero: Planning in Stochastic Environments With a Learned Model”, Antonoglou et al 2022
- “Policy Improvement by Planning With Gumbel”, Danihelka et al 2022
- “MuZero With Self-competition for Rate Control in VP9 Video Compression”, Mandhane et al 2022
- “Procedural Generalization by Planning With Self-Supervised World Models”, Anand et al 2021
- “Mastering Atari Games With Limited Data”, Ye et al 2021
- “Proper Value Equivalence”, Grimm et al 2021
- “Vector Quantized Models for Planning”, Ozair et al 2021
- “Learning and Planning in Complex Action Spaces”, Hubert et al 2021
- “MuZero Unplugged: Online and Offline Reinforcement Learning by Planning With a Learned Model”, Schrittwieser et al 2021
- “Podracer Architectures for Scalable Reinforcement Learning”, Hessel et al 2021
- “Muesli: Combining Improvements in Policy Optimization”, Hessel et al 2021
- “Scaling Scaling Laws With Board Games”, Jones 2021
- “Playing Nondeterministic Games through Planning With a Learned Model”, Willkens & Pollack 2021
- “Visualizing MuZero Models”, Vries et al 2021
- “Combining Off and On-Policy Training in Model-Based Reinforcement Learning”, Borges & Oliveira 2021
- “Improving Model-Based Reinforcement Learning With Internal State Representations through Self-Supervision”, Scholz et al 2021
- “On the Role of Planning in Model-based Deep Reinforcement Learning”, Hamrick et al 2020
- “The Value Equivalence Principle for Model-Based Reinforcement Learning”, Grimm et al 2020
- “Measuring Progress in Deep Reinforcement Learning Sample Efficiency”, Anonymous 2020
- “Monte-Carlo Tree Search As Regularized Policy Optimization”, Grill et al 2020
- “Continuous Control for Searching and Planning With a Learned Model”, Yang et al 2020
- “Agent57: Outperforming the Human Atari Benchmark”, Puigdomènech et al 2020
- “MuZero: Mastering Atari, Go, Chess and Shogi by Planning With a Learned Model”, Schrittwieser et al 2019
- “Surprising Negative Results for Generative Adversarial Tree Search”, Azizzadenesheli et al 2018
- “TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning”, Farquhar et al 2017
- Sort By Magic
- Wikipedia
- Miscellaneous
- Link Bibliography
See Also
Links
“AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning”, Mathieu et al 2023
“AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning”
“Job Hunt As a PhD in RL: How It Actually Happens § Reinforcement Learning Reflections”, Lambert 2022
“Job Hunt as a PhD in RL: How it Actually Happens § Reinforcement learning reflections”
“Large-Scale Retrieval for Reinforcement Learning”, Humphreys et al 2022
“Boosting Search Engines With Interactive Agents”, Ciaramita et al 2022
“Stochastic MuZero: Planning in Stochastic Environments With a Learned Model”, Antonoglou et al 2022
“Stochastic MuZero: Planning in Stochastic Environments with a Learned Model”
“Policy Improvement by Planning With Gumbel”, Danihelka et al 2022
“MuZero With Self-competition for Rate Control in VP9 Video Compression”, Mandhane et al 2022
“MuZero with Self-competition for Rate Control in VP9 Video Compression”
“Procedural Generalization by Planning With Self-Supervised World Models”, Anand et al 2021
“Procedural Generalization by Planning with Self-Supervised World Models”
“Mastering Atari Games With Limited Data”, Ye et al 2021
“Proper Value Equivalence”, Grimm et al 2021
“Vector Quantized Models for Planning”, Ozair et al 2021
“Learning and Planning in Complex Action Spaces”, Hubert et al 2021
“MuZero Unplugged: Online and Offline Reinforcement Learning by Planning With a Learned Model”, Schrittwieser et al 2021
“MuZero Unplugged: Online and Offline Reinforcement Learning by Planning with a Learned Model”
“Podracer Architectures for Scalable Reinforcement Learning”, Hessel et al 2021
“Podracer architectures for scalable Reinforcement Learning”
“Muesli: Combining Improvements in Policy Optimization”, Hessel et al 2021
“Scaling Scaling Laws With Board Games”, Jones 2021
“Playing Nondeterministic Games through Planning With a Learned Model”, Willkens & Pollack 2021
“Playing Nondeterministic Games through Planning with a Learned Model”
“Visualizing MuZero Models”, Vries et al 2021
“Combining Off and On-Policy Training in Model-Based Reinforcement Learning”, Borges & Oliveira 2021
“Combining Off and On-Policy Training in Model-Based Reinforcement Learning”
“Improving Model-Based Reinforcement Learning With Internal State Representations through Self-Supervision”, Scholz et al 2021
“On the Role of Planning in Model-based Deep Reinforcement Learning”, Hamrick et al 2020
“On the role of planning in model-based deep reinforcement learning”
“The Value Equivalence Principle for Model-Based Reinforcement Learning”, Grimm et al 2020
“The Value Equivalence Principle for Model-Based Reinforcement Learning”
“Measuring Progress in Deep Reinforcement Learning Sample Efficiency”, Anonymous 2020
“Measuring Progress in Deep Reinforcement Learning Sample Efficiency”
“Monte-Carlo Tree Search As Regularized Policy Optimization”, Grill et al 2020
“Monte-Carlo Tree Search as Regularized Policy Optimization”
“Continuous Control for Searching and Planning With a Learned Model”, Yang et al 2020
“Continuous Control for Searching and Planning with a Learned Model”
“Agent57: Outperforming the Human Atari Benchmark”, Puigdomènech et al 2020
“MuZero: Mastering Atari, Go, Chess and Shogi by Planning With a Learned Model”, Schrittwieser et al 2019
“MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model”
“Surprising Negative Results for Generative Adversarial Tree Search”, Azizzadenesheli et al 2018
“Surprising Negative Results for Generative Adversarial Tree Search”
“TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning”, Farquhar et al 2017
“TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning”
Sort By Magic
Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. For more details, see the link.
offline-learning
model-planning
reinforcement-learning-model-based
Wikipedia
Miscellaneous
Link Bibliography
-
https://arxiv.org/abs/2206.05314#deepmind
: “Large-Scale Retrieval for Reinforcement Learning”, Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent Sifre, Théophane Weber, Timothy Lillicrap -
https://openreview.net/forum?id=0ZbPmmB61g#google
: “Boosting Search Engines With Interactive Agents”, -
https://openreview.net/forum?id=bERaNdoegnO#deepmind
: “Policy Improvement by Planning With Gumbel”, Ivo Danihelka, Arthur Guez, Julian Schrittwieser, David Silver -
https://arxiv.org/abs/2111.01587#deepmind
: “Procedural Generalization by Planning With Self-Supervised World Models”, -
https://arxiv.org/abs/2106.10316#deepmind
: “Proper Value Equivalence”, Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh -
https://arxiv.org/abs/2106.04615#deepmind
: “Vector Quantized Models for Planning”, Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals -
https://arxiv.org/abs/2104.06294#deepmind
: “MuZero Unplugged: Online and Offline Reinforcement Learning by Planning With a Learned Model”, Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver -
https://arxiv.org/abs/2104.06272#deepmind
: “Podracer Architectures for Scalable Reinforcement Learning”, Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt -
https://arxiv.org/abs/2102.12924
: “Visualizing MuZero Models”, Joery A. de Vries, Ken S. Voskuil, Thomas M. Moerland, Aske Plaat -
https://arxiv.org/abs/2011.03506#deepmind
: “The Value Equivalence Principle for Model-Based Reinforcement Learning”, Christopher Grimm, André Barreto, Satinder Singh, David Silver -
https://arxiv.org/abs/2006.07430
: “Continuous Control for Searching and Planning With a Learned Model”, Xuxi Yang, Werner Duvaud, Peng Wei -
https://www.deepmind.com/blog/agent57-outperforming-the-human-atari-benchmark
: “Agent57: Outperforming the Human Atari Benchmark”, Adrià Puigdomènech, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Charles Blundell -
https://arxiv.org/abs/1710.11417
: “TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning”, Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson