Artificial Intelligence for Retrosynthetic Planning Needs Both Data and Expert Knowledge
Gold-Medalist Coders Build an AI That Can Do Their Job for Them: A new startup called Cognition AI can turn a user’s prompt into a website or video game
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Searchformer)
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero
Diversifying AI: Towards Creative Chess with AlphaZero (AZdb)
Self-play reinforcement learning guides protein engineering
BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations
Who Will You Be After ChatGPT Takes Your Job? Generative AI is coming for white-collar roles. If your sense of worth comes from work—what’s left to hold on to?
AlphaZe∗∗: AlphaZero-like baselines for imperfect information games are surprisingly strong
Solving math word problems with process & outcome-based feedback
Are AlphaZero-like Agents Robust to Adversarial Perturbations?
Newton’s method for reinforcement learning and model predictive control
CrossBeam: Learning to Search in Bottom-Up Program Synthesis
Evaluating model-based planning and planner amortization for continuous control
Scalable Online Planning via Reinforcement Learning Fine-Tuning
Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control
How Does AI Improve Human Decision-Making? Evidence from the AI-Powered Go Program
Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN
Neural Tree Expansion for Multi-Robot Planning in Non-Cooperative Environments
OLIVAW: Mastering Othello without Human Knowledge, nor a Fortune
Transfer of Fully Convolutional Policy-Value Networks Between Games and Game Variants
Learning to Stop: Dynamic Simulation Monte-Carlo Tree Search
Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess
Learning Compositional Neural Programs for Continuous Control
ReBeL: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
Monte-Carlo Tree Search as Regularized Policy Optimization
Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning
Aligning Superhuman AI with Human Behavior: Chess as a Model System
Approximate exploitability: Learning a best response in large games
Accelerating and Improving AlphaZero Using Population Based Training
(Yonhap Interview) Go master Lee says he quits—unable to win over AI Go players
MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Global optimization of quantum dynamics with AlphaZero deep exploration
Learning Compositional Neural Programs with Recursive Tree Search and Planning
π-IW: Deep Policies for Width-Based Planning in Pixel Domains
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees
AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search
Minigo: A Case Study in Reproducing Reinforcement Learning Research
ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
AlphaSeq: Sequence Discovery with Deep Reinforcement Learning
ExIt-OOS: Towards Learning from Planning in Imperfect Information Games
Surprising Negative Results for Generative Adversarial Tree Search
Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations
Sim-to-Real Optimization of Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
AlphaGo Zero: Mastering the game of Go without human knowledge
DeepMind’s latest AI breakthrough is its most important yet: Google-owned DeepMind’s Go-playing artificial intelligence can now learn without human help… or data
Learning Generalized Reactive Policies using Deep Neural Networks
DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker
Mastering the game of Go with deep neural networks and tree search
Reinforcement Learning As Classification: Leveraging Modern Classifiers
Learning From Scratch by Thinking Fast and Slow With Deep Learning and Tree Search
Trading Off Compute in Training and Inference § MCTS Scaling
An Open-Source Implementation of the AlphaGoZero Algorithm
Reading the Tea Leaves: Expert End-Users Explaining the Unexplainable
2023-zahavy-figure7-scalingofchesspuzzlesolutionswithmultiplealphazeroagentsandsimulations.png
2022-humphreys-figure2-retrievalaugmentedmuzerogoagentarchitecture.jpg
2022-mcgrath-figure4-alphazerolearningofhumanchessconceptsovertraininghistory.png
2022-mcgrath-figure5-a-alphazerovshumanprofessionalopeningmoveoverhistory.png
2022-mcgrath-figure5-b-alphazeroopeningmoveovertraininghistory.png
2022-mcgrath-figure7-alphazerorapidlydiscoveriesbasicchessopenings.png
2021-choi-figure2-globalgoplayerimprovementduetoleelarelease.jpg
2017-silver-figure3b-alphagozeropredictionofhumanexpertgomovesvssuperhumanlyaccuratepredictions.png
2017-silver-figure6-performanceofalphagozerolearningcurvesandbyelocomparison.jpg
http://cl-informatik.uibk.ac.at/cek/holstep/ckfccs-holstep-submitted.pdf
https://ai.facebook.com/blog/open-sourcing-polygames-a-new-framework-for-training-ai-bots-through-self-play/
https://cacm.acm.org/magazines/2021/9/255049-playing-with-and-against-computers/abstract
https://lczero.org/blog/2024/02/how-well-do-lc0-networks-compare-to-the-greatest-transformer-network-from-deepmind/
https://proceedings.neurips.cc/paper/2014/file/8bb88f80d334b1869781beb89f7b73be-Paper.pdf
https://research.google/blog/leveraging-machine-learning-for-game-development/
https://www.deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/
https://www.deepmind.com/blog/exploring-mysteries-alphago/
https://www.lesswrong.com/posts/FF8i6SLfKb4g7C4EL/inside-the-mind-of-a-superhuman-go-model-how-does-leela-zero-2
https://www.nature.com/articles/s41598-019-45619-9#deepmind
https://www.reddit.com/r/MachineLearning/comments/76xjb5/ama_we_are_david_silver_and_julian_schrittwieser/
https://www.reddit.com/r/MachineLearning/comments/rdb1uw/p_utttai_alphazerolike_solution_for_playing/
https://www.reddit.com/r/baduk/comments/qqjw64/shin_jinseo_ai_difference_shrinking/
Wikipedia Bibliography: