- See Also
- Links
- “BetaZero: Belief-State Planning for Long-Horizon POMDPs Using Learned Approximations”, Moss et al 2023
- “Posterior Sampling for Multi-agent Reinforcement Learning: Solving Extensive Games With Imperfect Information”, Zhou et al 2023
- “AlphaZe∗∗: AlphaZero-like Baselines for Imperfect Information Games Are Surprisingly Strong”, Blüml et al 2023
- “DeepNash: Mastering the Game of Stratego With Model-Free Multiagent Reinforcement Learning”, Perolat et al 2022
- “DouZero: Mastering DouDizhu With Self-Play Deep Reinforcement Learning”, Zha et al 2021
- “Vector Quantized Models for Planning”, Ozair et al 2021
- “Suphx: Mastering Mahjong With Deep Reinforcement Learning”, Li et al 2020
- “From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization”, Perolat et al 2020
- “Finding Friend and Foe in Multi-Agent Games”, Serrino et al 2019
- “Monte Carlo Neural Fictitious Self-Play: Approach to Approximate Nash Equilibrium of Imperfect-Information Games”, Zhang et al 2019
- “A Survey and Critique of Multiagent Deep Reinforcement Learning”, Hernandez-Leal et al 2018
- “Solving Imperfect-Information Games via Discounted Regret Minimization”, Brown & Sandholm 2018
- “ExIt-OOS: Towards Learning from Planning in Imperfect Information Games”, Kitchen & Benedetti 2018
- “Regret Minimization for Partially Observable Deep Reinforcement Learning”, Jin et al 2017
- “LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions”, Wang et al 2017
- “Deep Recurrent Q-Learning for Partially Observable MDPs”, Hausknecht & Stone 2015
- “Monte-Carlo Planning in Large POMDPs”, Silver & Veness 2010
- “One Writer Enters International Competition to Play the World-conquering Game That Redefines What It Means to Be a Geek (and a Person)”
- Wikipedia
- Miscellaneous
- Link Bibliography
- https://arxiv.org/abs/2206.15378#deepmind: “DeepNash: Mastering the Game of Stratego With Model-Free Multiagent Reinforcement Learning”, Perolat et al 2022
- https://arxiv.org/abs/2106.04615#deepmind: “Vector Quantized Models for Planning”, Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals
- 2010-silver.pdf: “Monte-Carlo Planning in Large POMDPs”, David Silver, Joel Veness