- See Also
- Gwern
- Links
- “On Scalable Oversight With Weak LLMs Judging Strong LLMs”, Kenton et al 2024
- “Foundational Challenges in Assuring Alignment and Safety of Large Language Models”, Anwar et al 2024
- “From Reinforcement Learning to Agency: Frameworks for Understanding Basal Cognition”, Seifert et al 2024
- “Classical Sorting Algorithms As a Model of Morphogenesis: Self-Sorting Arrays Reveal Unexpected Competencies in a Minimal Model of Basal Intelligence”, Zhang et al 2023
- “PRER: Modeling Complex Mathematical Reasoning via Large Language Model Based MathAgent”, Liao et al 2023
- “Generative Agent-Based Modeling With Actions Grounded in Physical, Social, or Digital Space Using Concordia”, Vezhnevets et al 2023
- “Learning Few-Shot Imitation As Cultural Transmission”, Bhoopchand et al 2023
- “JaxMARL: Multi-Agent RL Environments in JAX”, Rutherford et al 2023
- “Large Language Models Can Strategically Deceive Their Users When Put Under Pressure”, Scheurer et al 2023
- “Neural MMO 2.0: A Massively Multi-Task Addition to Massively Multi-Agent Learning”, Suárez et al 2023
- “Let Models Speak Ciphers: Multiagent Debate through Embeddings”, Pham et al 2023
- “AI Deception: A Survey of Examples, Risks, and Potential Solutions”, Park et al 2023
- “Diversifying AI: Towards Creative Chess With AlphaZero (AZdb)”, Zahavy et al 2023
- “Hoodwinked: Deception and Cooperation in a Text-Based Game for Language Models”, O’Gara 2023
- “Combining Human Expertise With Artificial Intelligence: Experimental Evidence from Radiology”, Agarwal et al 2023
- “Posterior Sampling for Multi-Agent Reinforcement Learning: Solving Extensive Games With Imperfect Information”, Zhou et al 2023
- “Reinforcement Learning in Newcomb-Like Environments”, Bell et al 2023
- “Learning Agile Soccer Skills for a Bipedal Robot With Deep Reinforcement Learning”, Haarnoja et al 2023
- “Multi-Party Chat (MultiLIGHT): Conversational Agents in Group Settings With Humans and Models”, Wei et al 2023
- “Off-The-Grid MARL (OG-MARL): Datasets With Baselines for Offline Multi-Agent Reinforcement Learning”, Formanek et al 2023
- “Learning to Control and Coordinate Mixed Traffic Through Robot Vehicles at Complex and Unsignalized Intersections”, Wang et al 2023
- “Melting Pot 2.0”, Agapiou et al 2022
- “CICERO: Human-Level Play in the Game of Diplomacy by Combining Language Models With Strategic Reasoning”, Bakhtin et al 2022
- “Over-Communicate No More: Situated RL Agents Learn Concise Communication Protocols”, Kalinowska et al 2022
- “Human-AI Coordination via Human-Regularized Search and Learning”, Hu et al 2022
- “Game Theoretic Rating in N-Player General-Sum Games With Equilibria”, Marris et al 2022
- “Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning”, Anonymous 2022
- “Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members”, Cornelisse et al 2022
- “Social Simulacra: Creating Populated Prototypes for Social Computing Systems”, Park et al 2022
- “DeepNash: Mastering the Game of Stratego With Model-Free Multiagent Reinforcement Learning”, Perolat et al 2022
- “Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
- “Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning”, Fu et al 2022
- “MAT: Multi-Agent Reinforcement Learning Is a Sequence Modeling Problem”, Wen et al 2022
- “First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization”, Reddy et al 2022
- “Emergent Bartering Behavior in Multi-Agent Reinforcement Learning”, Johanson et al 2022
- “NeuPL: Neural Population Learning”, Liu et al 2022
- “Uncalibrated Models Can Improve Human-AI Collaboration”, Vodrahalli et al 2022
- “Human-Centered Mechanism Design With Democratic AI”, Koster et al 2022
- “Hidden Agenda: a Social Deduction Game With Diverse Learned Equilibria”, Kopparapu et al 2022
- “Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning”, Curry et al 2022
- “Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination”, Zhao et al 2021
- “Modeling Strong and Human-Like Gameplay With KL-Regularized Search”, Jacob et al 2021
- “Offline Pre-Trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks”, Meng et al 2021
- “Player of Games”, Schmid et al 2021
- “Collective Intelligence for Deep Learning: A Survey of Recent Developments”, Ha & Tang 2021
- “Learning to Ground Multi-Agent Communication With Autoencoders”, Lin et al 2021
- “Meta-Learning, Social Cognition and Consciousness in Brains and Machines”, Langdon et al 2021
- “Collaborating With Humans without Human Data”, Strouse et al 2021
- “The Neural MMO Platform for Massively Multiagent Research”, Suárez et al 2021
- “Replay-Guided Adversarial Environment Design”, Jiang et al 2021
- “DORA: No-Press Diplomacy from Scratch”, Bakhtin et al 2021
- “Embodied Intelligence via Learning and Evolution”, Gupta et al 2021
- “Trust Region Policy Optimization in Multi-Agent Reinforcement Learning”, Kuba et al 2021
- “WarpDrive: Extremely Fast End-To-End Deep Multi-Agent Reinforcement Learning on a GPU”, Lan et al 2021
- “The AI Economist: Optimal Economic Policy Design via Two-Level Deep Reinforcement Learning”, Zheng et al 2021
- “Open-Ended Learning Leads to Generally Capable Agents”, Team et al 2021
- “Megaverse: Simulating Embodied Agents at One Million Experiences per Second”, Petrenko et al 2021
- “Scalable Evaluation of Multi-Agent Reinforcement Learning With Melting Pot”, Leibo et al 2021
- “From Motor Control to Team Play in Simulated Humanoid Football”, Liu et al 2021
- “Cooperative AI Foundation (CAIF)”, CAIF 2021
- “Baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents”, Alcorn & Nguyen 2021
- “Neural Tree Expansion for Multi-Robot Planning in Non-Cooperative Environments”, Riviere et al 2021
- “Multitasking Inhibits Semantic Drift”, Jacob et al 2021
- “Asymmetric Self-Play for Automatic Goal Discovery in Robotic Manipulation”, OpenAI et al 2021
- “Reinforcement Learning for Datacenter Congestion Control”, Tessler et al 2021
- “Baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotemporal Modeling”, Alcorn & Nguyen 2021
- “UPDeT: Universal Multi-Agent Reinforcement Learning via Policy Decoupling With Transformers”, Hu et al 2021
- “Imitating Interactive Intelligence”, Abramson et al 2020
- “Towards Playing Full MOBA Games With Deep Reinforcement Learning”, Ye et al 2020
- “TLeague: A Framework for Competitive Self-Play Based Distributed Multi-Agent Reinforcement Learning”, Sun et al 2020
- “Emergent Road Rules In Multi-Agent Driving Environments”, Pal et al 2020
- “Reinforcement Learning for Optimization of COVID-19 Mitigation Policies”, Kompella et al 2020
- “Human-Level Performance in No-Press Diplomacy via Equilibrium Search”, Gray et al 2020
- “Emergent Social Learning via Multi-Agent Reinforcement Learning”, Ndousse et al 2020
- “Grounded Language Learning Fast and Slow”, Hill et al 2020
- “ReBeL: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games”, Brown et al 2020
- “Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [Blog]”, Chang & Kaushik 2020
- “One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control”, Huang et al 2020
- “Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions”, Chang et al 2020
- “Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks”, Papoudakis et al 2020
- “Learning to Play No-Press Diplomacy With Best Response Policy Iteration”, Anthony et al 2020
- “Real World Games Look Like Spinning Tops”, Czarnecki et al 2020
- “Approximate Exploitability: Learning a Best Response in Large Games”, Timbers et al 2020
- “Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and Their Solutions”, Wang et al 2020
- “Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning”, McKee et al 2020
- “Effective Diversity in Population Based Reinforcement Learning”, Parker-Holder et al 2020
- “Towards Learning Multi-Agent Negotiations via Self-Play”, Tang 2020
- “Smooth Markets: A Basic Mechanism for Organizing Gradient-Based Learners”, Balduzzi et al 2020
- “MicrobatchGAN: Stimulating Diversity With Multi-Adversarial Discrimination”, Mordido et al 2020
- “Learning by Cheating”, Chen et al 2019
- “Increasing Generality in Machine Learning through Procedural Content Generation”, Risi & Togelius 2019
- “Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms”, Zhang et al 2019
- “Grandmaster Level in StarCraft II Using Multi-Agent Reinforcement Learning”, Vinyals et al 2019
- “Multiplayer AlphaZero”, Petosa & Balch 2019
- “Stabilizing Generative Adversarial Networks: A Survey”, Wiatrak et al 2019
- “Emergent Tool Use From Multi-Agent Autocurricula”, Baker et al 2019
- “Emergent Tool Use from Multi-Agent Interaction § Surprising Behavior”, Baker et al 2019
- “No Press Diplomacy: Modeling Multi-Agent Gameplay”, Paquette et al 2019
- “A Review of Cooperative Multi-Agent Deep Reinforcement Learning”, OroojlooyJadid & Hajinezhad 2019
- “Pluribus: Superhuman AI for Multiplayer Poker”, Brown & Sandholm 2019
- “Evolving the Hearthstone Meta”, Silva et al 2019
- “Evolutionary Implementation of Bayesian Computations”, Czégel et al 2019
- “Finding Friend and Foe in Multi-Agent Games”, Serrino et al 2019
- “Hierarchical Decision Making by Generating and Following Natural Language Instructions”, Hu et al 2019
- “ICML 2019 Notes”, Abel 2019
- “Human-Level Performance in 3D Multiplayer Games With Population-Based Reinforcement Learning”, Jaderberg et al 2019
- “AI-GAs: AI-Generating Algorithms, an Alternate Paradigm for Producing General Artificial Intelligence”, Clune 2019
- “Adversarial Policies: Attacking Deep Reinforcement Learning”, Gleave et al 2019
- “LIGHT: Learning to Speak and Act in a Fantasy Text Adventure Game”, Urbanek et al 2019
- “α-Rank: Multi-Agent Evaluation by Evolution”, Omidshafiei et al 2019
- “Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research”, Leibo et al 2019
- “Distilling Policy Distillation”, Czarnecki et al 2019
- “Hierarchical Reinforcement Learning for Multi-Agent MOBA Game”, Zhang et al 2019
- “Open-Ended Learning in Symmetric Zero-Sum Games”, Balduzzi et al 2019
- “Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions”, Wang et al 2019
- “Hierarchical Macro Strategy Model for MOBA Game AI”, Wu et al 2018
- “Continual Match Based Training in Pommerman: Technical Report”, Peng et al 2018
- “Malthusian Reinforcement Learning”, Leibo et al 2018
- “Stable Opponent Shaping in Differentiable Games”, Letcher et al 2018
- “Deep Counterfactual Regret Minimization”, Brown et al 2018
- “TarMAC: Targeted Multi-Agent Communication”, Das et al 2018
- “Graph Convolutional Reinforcement Learning”, Jiang et al 2018
- “Social Influence As Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning”, Jaques et al 2018
- “Deep Reinforcement Learning”, Li 2018
- “A Survey and Critique of Multiagent Deep Reinforcement Learning”, Hernandez-Leal et al 2018
- “Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation”, Nogueira et al 2018
- “Pommerman: A Multi-Agent Playground”, Resnick et al 2018
- “Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios”, Fan et al 2018
- “Human-Level Performance in First-Person Multiplayer Games With Population-Based Deep Reinforcement Learning”, Jaderberg et al 2018
- “Construction of Arbitrarily Strong Amplifiers of Natural Selection Using Evolutionary Graph Theory”, Pavlogiannis et al 2018
- “Adaptive Mechanism Design: Learning to Promote Cooperation”, Baumann et al 2018
- “Mix&Match—Agent Curricula for Reinforcement Learning”, Czarnecki et al 2018
- “Kickstarting Deep Reinforcement Learning”, Schmitt et al 2018
- “Machine Theory of Mind”, Rabinowitz et al 2018
- “Sim-To-Real Optimization of Complex Real World Mobile Network With Imperfect Information via Deep Reinforcement Learning from Self-Play”, Tan et al 2018
- “Trust-Aware Decision Making for Human-Robot Collaboration: Model Learning and Planning”, Chen et al 2018
- “Emergent Complexity via Multi-Agent Competition”, Bansal et al 2017
- “Learning With Opponent-Learning Awareness”, Foerster et al 2017
- “LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions”, Wang et al 2017
- “CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms”, Elgammal et al 2017
- “On Convergence and Stability of GANs”, Kodali et al 2017
- “Supervision via Competition: Robot Adversaries for Learning Tasks”, Pinto et al 2016
- “Policy Distillation”, Rusu et al 2015
- “Reflective Oracles: A Foundation for Classical Game Theory”, Fallenstein et al 2015
- “Homo Moralis-Preference Evolution Under Incomplete Information and Assortative Matching”, Alger & Weibull 2013
- “A Self-Coordinating Bus Route to Resist Bus Bunching”, Bartholdi & Eisenstein 2012
- “Language Evolution in the Laboratory”, Scott-Phillips & Kirby 2010
- “If Multi-Agent Learning Is the Answer, What Is the Question?”, Shoham et al 2007
- “Market-Based Reinforcement Learning in Partially Observable Worlds”, Kwee et al 2001
- “Properties of the Bucket Brigade Algorithm”, Holland 1985
- “Computer-Aided Gas Pipeline Operation Using Genetic Algorithms And Rule Learning”, Goldberg 1983
- “Collaborating With Humans Requires Understanding Them”
- “The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games”
- “Efficient Large-Scale Fleet Management via Multi-Agent Deep Reinforcement Learning”
- “Generally Capable Agents Emerge from Open-Ended Play”
- “One Writer Enters International Competition to Play the World-Conquering Game That Redefines What It Means to Be a Geek (and a Person)”
- “Mimicking Evolution With Reinforcement Learning”
- “LLM Powered Autonomous Agents”
- “The Pommerman Team Competition Or: How We Learned to Stop Worrying and Love the Battle”
- “How DeepMind's Generally Capable Agents Were Trained”
- “How Much Compute Was Used to Train DeepMind's Generally Capable Agents?”
- “DeepMind: Generally Capable Agents Emerge from Open-Ended Play”
- “So Has AI Conquered Bridge?”
- “The Steely, Headless King of Texas Hold ’Em”
- “Artificial Intelligence Beats Eight World Champions at Bridge”
- “Open-Ended Learning Leads to Generally Capable Agents”
- Sort By Magic
- Wikipedia
- Miscellaneous
- Bibliography
See Also
Gwern
“Evolution As Backstop for Reinforcement Learning”, Gwern 2018
“Fashion Cycles”, Gwern 2021
Links
“On Scalable Oversight With Weak LLMs Judging Strong LLMs”, Kenton et al 2024
“Foundational Challenges in Assuring Alignment and Safety of Large Language Models”, Anwar et al 2024
“From Reinforcement Learning to Agency: Frameworks for Understanding Basal Cognition”, Seifert et al 2024
“Classical Sorting Algorithms As a Model of Morphogenesis: Self-Sorting Arrays Reveal Unexpected Competencies in a Minimal Model of Basal Intelligence”, Zhang et al 2023
“PRER: Modeling Complex Mathematical Reasoning via Large Language Model Based MathAgent”, Liao et al 2023
“Generative Agent-Based Modeling With Actions Grounded in Physical, Social, or Digital Space Using Concordia”, Vezhnevets et al 2023
“Learning Few-Shot Imitation As Cultural Transmission”, Bhoopchand et al 2023
“JaxMARL: Multi-Agent RL Environments in JAX”, Rutherford et al 2023
“Large Language Models Can Strategically Deceive Their Users When Put Under Pressure”, Scheurer et al 2023
“Neural MMO 2.0: A Massively Multi-Task Addition to Massively Multi-Agent Learning”, Suárez et al 2023
“Let Models Speak Ciphers: Multiagent Debate through Embeddings”, Pham et al 2023
“AI Deception: A Survey of Examples, Risks, and Potential Solutions”, Park et al 2023
“Diversifying AI: Towards Creative Chess With AlphaZero (AZdb)”, Zahavy et al 2023
“Hoodwinked: Deception and Cooperation in a Text-Based Game for Language Models”, O’Gara 2023
“Combining Human Expertise With Artificial Intelligence: Experimental Evidence from Radiology”, Agarwal et al 2023
“Posterior Sampling for Multi-Agent Reinforcement Learning: Solving Extensive Games With Imperfect Information”, Zhou et al 2023
“Reinforcement Learning in Newcomb-Like Environments”, Bell et al 2023
“Learning Agile Soccer Skills for a Bipedal Robot With Deep Reinforcement Learning”, Haarnoja et al 2023
“Multi-Party Chat (MultiLIGHT): Conversational Agents in Group Settings With Humans and Models”, Wei et al 2023
“Off-The-Grid MARL (OG-MARL): Datasets With Baselines for Offline Multi-Agent Reinforcement Learning”, Formanek et al 2023
“Learning to Control and Coordinate Mixed Traffic Through Robot Vehicles at Complex and Unsignalized Intersections”, Wang et al 2023
“Melting Pot 2.0”, Agapiou et al 2022
“CICERO: Human-Level Play in the Game of Diplomacy by Combining Language Models With Strategic Reasoning”, Bakhtin et al 2022
“Over-Communicate No More: Situated RL Agents Learn Concise Communication Protocols”, Kalinowska et al 2022
“Human-AI Coordination via Human-Regularized Search and Learning”, Hu et al 2022
“Game Theoretic Rating in N-Player General-Sum Games With Equilibria”, Marris et al 2022
“Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning”, Anonymous 2022
“Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members”, Cornelisse et al 2022
“Social Simulacra: Creating Populated Prototypes for Social Computing Systems”, Park et al 2022
“DeepNash: Mastering the Game of Stratego With Model-Free Multiagent Reinforcement Learning”, Perolat et al 2022
“Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
“Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning”, Fu et al 2022
“MAT: Multi-Agent Reinforcement Learning Is a Sequence Modeling Problem”, Wen et al 2022
“First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization”, Reddy et al 2022
“Emergent Bartering Behavior in Multi-Agent Reinforcement Learning”, Johanson et al 2022
“NeuPL: Neural Population Learning”, Liu et al 2022
“Uncalibrated Models Can Improve Human-AI Collaboration”, Vodrahalli et al 2022
“Human-Centered Mechanism Design With Democratic AI”, Koster et al 2022
“Hidden Agenda: a Social Deduction Game With Diverse Learned Equilibria”, Kopparapu et al 2022
“Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning”, Curry et al 2022
“Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination”, Zhao et al 2021
“Modeling Strong and Human-Like Gameplay With KL-Regularized Search”, Jacob et al 2021
“Offline Pre-Trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks”, Meng et al 2021
“Player of Games”, Schmid et al 2021
“Collective Intelligence for Deep Learning: A Survey of Recent Developments”, Ha & Tang 2021
“Learning to Ground Multi-Agent Communication With Autoencoders”, Lin et al 2021
“Meta-Learning, Social Cognition and Consciousness in Brains and Machines”, Langdon et al 2021
“Collaborating With Humans without Human Data”, Strouse et al 2021
“The Neural MMO Platform for Massively Multiagent Research”, Suárez et al 2021
“Replay-Guided Adversarial Environment Design”, Jiang et al 2021
“DORA: No-Press Diplomacy from Scratch”, Bakhtin et al 2021
“Embodied Intelligence via Learning and Evolution”, Gupta et al 2021
“Trust Region Policy Optimization in Multi-Agent Reinforcement Learning”, Kuba et al 2021
“WarpDrive: Extremely Fast End-To-End Deep Multi-Agent Reinforcement Learning on a GPU”, Lan et al 2021
“The AI Economist: Optimal Economic Policy Design via Two-Level Deep Reinforcement Learning”, Zheng et al 2021
“Open-Ended Learning Leads to Generally Capable Agents”, Team et al 2021
“Megaverse: Simulating Embodied Agents at One Million Experiences per Second”, Petrenko et al 2021
“Scalable Evaluation of Multi-Agent Reinforcement Learning With Melting Pot”, Leibo et al 2021
“From Motor Control to Team Play in Simulated Humanoid Football”, Liu et al 2021
“Cooperative AI Foundation (CAIF)”, CAIF 2021
“Baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents”, Alcorn & Nguyen 2021
“Neural Tree Expansion for Multi-Robot Planning in Non-Cooperative Environments”, Riviere et al 2021
“Multitasking Inhibits Semantic Drift”, Jacob et al 2021
“Asymmetric Self-Play for Automatic Goal Discovery in Robotic Manipulation”, OpenAI et al 2021
“Reinforcement Learning for Datacenter Congestion Control”, Tessler et al 2021
“Baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotemporal Modeling”, Alcorn & Nguyen 2021
“UPDeT: Universal Multi-Agent Reinforcement Learning via Policy Decoupling With Transformers”, Hu et al 2021
“Imitating Interactive Intelligence”, Abramson et al 2020
“Towards Playing Full MOBA Games With Deep Reinforcement Learning”, Ye et al 2020
“TLeague: A Framework for Competitive Self-Play Based Distributed Multi-Agent Reinforcement Learning”, Sun et al 2020
“Emergent Road Rules In Multi-Agent Driving Environments”, Pal et al 2020
“Reinforcement Learning for Optimization of COVID-19 Mitigation Policies”, Kompella et al 2020
“Human-Level Performance in No-Press Diplomacy via Equilibrium Search”, Gray et al 2020
“Emergent Social Learning via Multi-Agent Reinforcement Learning”, Ndousse et al 2020
“Grounded Language Learning Fast and Slow”, Hill et al 2020
“ReBeL: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games”, Brown et al 2020
“Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [Blog]”, Chang & Kaushik 2020
“One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control”, Huang et al 2020
“Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions”, Chang et al 2020
“Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks”, Papoudakis et al 2020
“Learning to Play No-Press Diplomacy With Best Response Policy Iteration”, Anthony et al 2020
“Real World Games Look Like Spinning Tops”, Czarnecki et al 2020
“Approximate Exploitability: Learning a Best Response in Large Games”, Timbers et al 2020
“Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and Their Solutions”, Wang et al 2020
“Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning”, McKee et al 2020
“Effective Diversity in Population Based Reinforcement Learning”, Parker-Holder et al 2020
“Towards Learning Multi-Agent Negotiations via Self-Play”, Tang 2020
“Smooth Markets: A Basic Mechanism for Organizing Gradient-Based Learners”, Balduzzi et al 2020
“MicrobatchGAN: Stimulating Diversity With Multi-Adversarial Discrimination”, Mordido et al 2020
“Learning by Cheating”, Chen et al 2019
“Increasing Generality in Machine Learning through Procedural Content Generation”, Risi & Togelius 2019
“Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms”, Zhang et al 2019
“Grandmaster Level in StarCraft II Using Multi-Agent Reinforcement Learning”, Vinyals et al 2019
“Multiplayer AlphaZero”, Petosa & Balch 2019
“Stabilizing Generative Adversarial Networks: A Survey”, Wiatrak et al 2019
“Emergent Tool Use From Multi-Agent Autocurricula”, Baker et al 2019
“Emergent Tool Use from Multi-Agent Interaction § Surprising Behavior”, Baker et al 2019
“No Press Diplomacy: Modeling Multi-Agent Gameplay”, Paquette et al 2019
“A Review of Cooperative Multi-Agent Deep Reinforcement Learning”, OroojlooyJadid & Hajinezhad 2019
“Pluribus: Superhuman AI for Multiplayer Poker”, Brown & Sandholm 2019
“Evolving the Hearthstone Meta”, Silva et al 2019
“Evolutionary Implementation of Bayesian Computations”, Czégel et al 2019
“Finding Friend and Foe in Multi-Agent Games”, Serrino et al 2019
“Hierarchical Decision Making by Generating and Following Natural Language Instructions”, Hu et al 2019
“ICML 2019 Notes”, Abel 2019
“Human-Level Performance in 3D Multiplayer Games With Population-Based Reinforcement Learning”, Jaderberg et al 2019
“AI-GAs: AI-Generating Algorithms, an Alternate Paradigm for Producing General Artificial Intelligence”, Clune 2019
“Adversarial Policies: Attacking Deep Reinforcement Learning”, Gleave et al 2019
“LIGHT: Learning to Speak and Act in a Fantasy Text Adventure Game”, Urbanek et al 2019
“α-Rank: Multi-Agent Evaluation by Evolution”, Omidshafiei et al 2019
“Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research”, Leibo et al 2019
“Distilling Policy Distillation”, Czarnecki et al 2019
“Hierarchical Reinforcement Learning for Multi-Agent MOBA Game”, Zhang et al 2019
“Open-Ended Learning in Symmetric Zero-Sum Games”, Balduzzi et al 2019
“Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions”, Wang et al 2019
“Hierarchical Macro Strategy Model for MOBA Game AI”, Wu et al 2018
“Continual Match Based Training in Pommerman: Technical Report”, Peng et al 2018
“Malthusian Reinforcement Learning”, Leibo et al 2018
“Stable Opponent Shaping in Differentiable Games”, Letcher et al 2018
“Deep Counterfactual Regret Minimization”, Brown et al 2018
“TarMAC: Targeted Multi-Agent Communication”, Das et al 2018
“Graph Convolutional Reinforcement Learning”, Jiang et al 2018
“Social Influence As Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning”, Jaques et al 2018
“Deep Reinforcement Learning”, Li 2018
“A Survey and Critique of Multiagent Deep Reinforcement Learning”, Hernandez-Leal et al 2018
“Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation”, Nogueira et al 2018
“Pommerman: A Multi-Agent Playground”, Resnick et al 2018
“Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios”, Fan et al 2018
“Human-Level Performance in First-Person Multiplayer Games With Population-Based Deep Reinforcement Learning”, Jaderberg et al 2018
“Construction of Arbitrarily Strong Amplifiers of Natural Selection Using Evolutionary Graph Theory”, Pavlogiannis et al 2018
“Adaptive Mechanism Design: Learning to Promote Cooperation”, Baumann et al 2018
“Mix&Match—Agent Curricula for Reinforcement Learning”, Czarnecki et al 2018
“Kickstarting Deep Reinforcement Learning”, Schmitt et al 2018
“Machine Theory of Mind”, Rabinowitz et al 2018
“Sim-To-Real Optimization of Complex Real World Mobile Network With Imperfect Information via Deep Reinforcement Learning from Self-Play”, Tan et al 2018
“Trust-Aware Decision Making for Human-Robot Collaboration: Model Learning and Planning”, Chen et al 2018
“Emergent Complexity via Multi-Agent Competition”, Bansal et al 2017
“Learning With Opponent-Learning Awareness”, Foerster et al 2017
“LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions”, Wang et al 2017
“CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms”, Elgammal et al 2017
“On Convergence and Stability of GANs”, Kodali et al 2017
“Supervision via Competition: Robot Adversaries for Learning Tasks”, Pinto et al 2016
“Policy Distillation”, Rusu et al 2015
“Reflective Oracles: A Foundation for Classical Game Theory”, Fallenstein et al 2015
“Homo Moralis-Preference Evolution Under Incomplete Information and Assortative Matching”, Alger & Weibull 2013
“A Self-Coordinating Bus Route to Resist Bus Bunching”, Bartholdi & Eisenstein 2012
“Language Evolution in the Laboratory”, Scott-Phillips & Kirby 2010
“If Multi-Agent Learning Is the Answer, What Is the Question?”, Shoham et al 2007
“Market-Based Reinforcement Learning in Partially Observable Worlds”, Kwee et al 2001
“Properties of the Bucket Brigade Algorithm”, Holland 1985
“Computer-Aided Gas Pipeline Operation Using Genetic Algorithms And Rule Learning”, Goldberg 1983
“Collaborating With Humans Requires Understanding Them”
“The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games”
“Efficient Large-Scale Fleet Management via Multi-Agent Deep Reinforcement Learning”
“Generally Capable Agents Emerge from Open-Ended Play”
“One Writer Enters International Competition to Play the World-Conquering Game That Redefines What It Means to Be a Geek (and a Person)”
“Mimicking Evolution With Reinforcement Learning”
“LLM Powered Autonomous Agents”
“The Pommerman Team Competition Or: How We Learned to Stop Worrying and Love the Battle”
“How DeepMind's Generally Capable Agents Were Trained”
“How Much Compute Was Used to Train DeepMind's Generally Capable Agents?”
“DeepMind: Generally Capable Agents Emerge from Open-Ended Play”
“So Has AI Conquered Bridge?”
https://www.lesswrong.com/posts/yHxmJch8dJoH6dwwz/so-has-ai-conquered-bridge
“The Steely, Headless King of Texas Hold ’Em”
“Artificial Intelligence Beats Eight World Champions at Bridge”
“Open-Ended Learning Leads to Generally Capable Agents”
Wikipedia
Miscellaneous
- /doc/reinforcement-learning/multi-agent/2019-jaderberg-supplement-movie-1-aau6249s1.mp4
- /doc/reinforcement-learning/multi-agent/2019-jaderberg-supplement-movie-2-aau6249s2.mp4
- /doc/reinforcement-learning/multi-agent/2019-jaderberg-supplement-movie-3-aau6249s3.mp4
- /doc/reinforcement-learning/multi-agent/2019-jaderberg-supplement-movie-4-aau6249s4.mp4
- https://blog.otoro.net/2022/10/01/collectiveintelligence/
- https://research.google/blog/leveraging-machine-learning-for-game-development/
- https://www.lesswrong.com/posts/65qmEJHDw3vw69tKm/proposal-scaling-laws-for-rl-generalization
- https://www.lesswrong.com/posts/FbSAuJfCxizZGpcHc/interpreting-the-learning-of-deceit
- https://www.quantamagazine.org/computers-evolve-a-new-path-toward-human-intelligence-20191106/
Bibliography
- https://arxiv.org/abs/2312.08926: “PRER: Modeling Complex Mathematical Reasoning via Large Language Model Based MathAgent”
- https://www.nature.com/articles/s41467-023-42875-2#deepmind: “Learning Few-Shot Imitation As Cultural Transmission”
- https://arxiv.org/abs/2311.10090: “JaxMARL: Multi-Agent RL Environments in JAX”
- https://arxiv.org/abs/2311.03736: “Neural MMO 2.0: A Massively Multi-Task Addition to Massively Multi-Agent Learning”
- https://arxiv.org/abs/2308.09175#deepmind: “Diversifying AI: Towards Creative Chess With AlphaZero (AZdb)”
- https://arxiv.org/abs/2308.01404: “Hoodwinked: Deception and Cooperation in a Text-Based Game for Language Models”
- https://www.nber.org/papers/w31422: “Combining Human Expertise With Artificial Intelligence: Experimental Evidence from Radiology”
- https://arxiv.org/abs/2304.13653#deepmind: “Learning Agile Soccer Skills for a Bipedal Robot With Deep Reinforcement Learning”
- 2022-bakhtin.pdf: “CICERO: Human-Level Play in the Game of Diplomacy by Combining Language Models With Strategic Reasoning”
- https://openreview.net/forum?id=DY1pMrmDkm: “Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning”
- https://arxiv.org/abs/2208.04024: “Social Simulacra: Creating Populated Prototypes for Social Computing Systems”
- https://arxiv.org/abs/2206.15378#deepmind: “DeepNash: Mastering the Game of Stratego With Model-Free Multiagent Reinforcement Learning”
- https://arxiv.org/abs/2206.14349: “Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”
- https://arxiv.org/abs/2206.07505: “Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning”
- https://arxiv.org/abs/2205.14953: “MAT: Multi-Agent Reinforcement Learning Is a Sequence Modeling Problem”
- https://arxiv.org/abs/2202.07415#deepmind: “NeuPL: Neural Population Learning”
- https://arxiv.org/abs/2112.11701#tencent: “Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination”
- https://arxiv.org/abs/2112.03178#deepmind: “Player of Games”
- https://arxiv.org/abs/2105.12196#deepmind: “From Motor Control to Team Play in Simulated Humanoid Football”
- https://arxiv.org/abs/2104.11980: “Baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents”
- https://arxiv.org/abs/2012.05672#deepmind: “Imitating Interactive Intelligence”
- https://arxiv.org/abs/2011.12692#tencent: “Towards Playing Full MOBA Games With Deep Reinforcement Learning”
- https://arxiv.org/abs/2011.12895#tencent: “TLeague: A Framework for Competitive Self-Play Based Distributed Multi-Agent Reinforcement Learning”
- https://bair.berkeley.edu/blog/2020/07/11/auction/: “Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [Blog]”
- 2019-vinyals.pdf#deepmind: “Grandmaster Level in StarCraft II Using Multi-Agent Reinforcement Learning”
- https://openai.com/research/emergent-tool-use#surprisingbehaviors: “Emergent Tool Use from Multi-Agent Interaction § Surprising Behavior”
- https://david-abel.github.io/notes/icml_2019.pdf: “ICML 2019 Notes”
- 2019-jaderberg.pdf#deepmind: “Human-Level Performance in 3D Multiplayer Games With Population-Based Reinforcement Learning”
- https://arxiv.org/abs/1902.02186#deepmind: “Distilling Policy Distillation”
- https://www.nature.com/articles/s42003-018-0078-7: “Construction of Arbitrarily Strong Amplifiers of Natural Selection Using Evolutionary Graph Theory”
- 2013-alger.pdf: “Homo Moralis-Preference Evolution Under Incomplete Information and Assortative Matching”
- 2007-shoham.pdf: “If Multi-Agent Learning Is the Answer, What Is the Question?”