“‘RL Exploration’ Tag”, 2019-09-06:
Bibliography for tag reinforcement-learning/exploration, most recent first: 8 related tags, 312 annotations, & 40 links (parent).
- See Also
- Gwern
- Links
- “SimpleStrat: Diversifying Language Model Generation With Stratification”, et al 2024
- “Intelligent Go-Explore (IGE): Standing on the Shoulders of Giant Foundation Models”, et al 2024
- “Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts”, et al 2024
- “Self-Supervised Behavior Cloned Transformers Are Path Crawlers for Text Games”, 2023
- “Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations”, et al 2023
- “QDAIF: Quality-Diversity through AI Feedback”, et al 2023
- “Beyond Memorization: Violating Privacy Via Inference With Large Language Models”, et al 2023
- “Let Models Speak Ciphers: Multiagent Debate through Embeddings”, et al 2023
- “Small Batch Deep Reinforcement Learning”, Obando-Ceron et al 2023
- “Language Reward Modulation for Pretraining Reinforcement Learning”, et al 2023
- “Diversifying AI: Towards Creative Chess With AlphaZero (AZdb)”, et al 2023
- “Supervised Pretraining Can Learn In-Context Reinforcement Learning”, et al 2023
- “Learning to Generate Novel Scientific Directions With Contextualized Literature-Based Discovery”, et al 2023
- “You And Your Research”, 2023
- “Long-Term Value of Exploration: Measurements, Findings and Algorithms”, et al 2023
- “Inducing Anxiety in GPT-3.5 Increases Exploration and Bias”, Coda-Forno et al 2023
- “Reflexion: Language Agents With Verbal Reinforcement Learning”, et al 2023
- “MimicPlay: Long-Horizon Imitation Learning by Watching Human Play”, et al 2023
- “MarioGPT: Open-Ended Text2Level Generation through Large Language Models”, et al 2023
- “DreamerV3: Mastering Diverse Domains through World Models”, et al 2023
- “AlphaZe∗∗: AlphaZero-Like Baselines for Imperfect Information Games Are Surprisingly Strong”, et al 2023
- “Effect of Lysergic Acid Diethylamide (LSD) on Reinforcement Learning in Humans”, et al 2022
- “Curiosity in Hindsight”, et al 2022
- “In-Context Reinforcement Learning With Algorithm Distillation”, et al 2022
- “E3B: Exploration via Elliptical Episodic Bonuses”, et al 2022
- “Vote-K: Selective Annotation Makes Language Models Better Few-Shot Learners”, et al 2022
- “LGE: Cell-Free Latent Go-Explore”, Gallouédec & Dellandréa 2022
- “A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning”, et al 2022
- “Trajectory Autoencoding Planner: Efficient Planning in a Compact Latent Action Space”, et al 2022
- “Value-Free Random Exploration Is Linked to Impulsivity”, 2022
- “Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling”, 2022
- “The Cost of Information Acquisition by Natural Selection”, et al 2022
- “Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos”, et al 2022
- “BYOL-Explore: Exploration by Bootstrapped Prediction”, et al 2022
- “Multi-Objective Hyperparameter Optimization—An Overview”, et al 2022
- “Director: Deep Hierarchical Planning from Pixels”, et al 2022
- “Boosting Search Engines With Interactive Agents”, et al 2022
- “Towards Learning Universal Hyperparameter Optimizers With Transformers”, et al 2022
- “Cliff Diving: Exploring Reward Surfaces in Reinforcement Learning Environments”, et al 2022
- “Effective Mutation Rate Adaptation through Group Elite Selection”, et al 2022
- “Semantic Exploration from Language Abstractions and Pretrained Representations”, et al 2022
- “Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale”, et al 2022
- “CLIP on Wheels (CoW): Zero-Shot Object Navigation As Object Localization and Exploration”, et al 2022
- “Policy Improvement by Planning With Gumbel”, et al 2022
- “Evolving Curricula With Regret-Based Environment Design”, Parker-Holder et al 2022
- “VAPO: Affordance Learning from Play for Sample-Efficient Policy Learning”, Borja-Diaz et al 2022
- “Learning Causal Overhypotheses through Exploration in Children and Computational Models”, et al 2022
- “Policy Learning and Evaluation With Randomized Quasi-Monte Carlo”, et al 2022
- “NeuPL: Neural Population Learning”, et al 2022
- “ODT: Online Decision Transformer”, et al 2022
- “EvoJAX: Hardware-Accelerated Neuroevolution”, et al 2022
- “LID: Pre-Trained Language Models for Interactive Decision-Making”, et al 2022
- “Accelerated Quality-Diversity for Robotics through Massive Parallelism”, et al 2022
- “Rotting Infinitely Many-Armed Bandits”, et al 2022
- “Don’t Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning (ExORL)”, et al 2022
- “Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination”, 2022
- “Evolution Gym: A Large-Scale Benchmark for Evolving Soft Robots”, et al 2022
- “Environment Generation for Zero-Shot Compositional Reinforcement Learning”, et al 2022
- “Safe Deep RL in 3D Environments Using Human Feedback”, et al 2022
- “Automated Reinforcement Learning (AutoRL): A Survey and Open Problems”, Parker-Holder et al 2022
- “Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination”, et al 2021
- “The Costs and Benefits of Dispersal in Small Populations”, 2021
- “The Geometry of Decision-Making in Individuals and Collectives”, et al 2021
- “An Experimental Design Perspective on Model-Based Reinforcement Learning”, et al 2021
- “JueWu-MC: Playing Minecraft With Sample-Efficient Hierarchical Reinforcement Learning”, et al 2021
- “Procedural Generalization by Planning With Self-Supervised World Models”, et al 2021
- “Correspondence between Neuroevolution and Gradient Descent”, et al 2021
- “URLB: Unsupervised Reinforcement Learning Benchmark”, et al 2021
- “Mastering Atari Games With Limited Data”, et al 2021
- “Discovering and Achieving Goals via World Models”, et al 2021
- “The Structure of Genotype-Phenotype Maps Makes Fitness Landscapes Navigable”, et al 2021
- “Replay-Guided Adversarial Environment Design”, et al 2021
- “A Review of the Gumbel-Max Trick and Its Extensions for Discrete Stochasticity in Machine Learning”, et al 2021
- “Monkey Plays Pac-Man With Compositional Strategies and Hierarchical Decision-Making”, et al 2021
- “Neural Autopilot and Context-Sensitivity of Habits”, 2021
- “Algorithmic Balancing of Familiarity, Similarity, & Discovery in Music Recommendations”, 2021
- “TrufLL: Learning Natural Language Generation from Scratch”, et al 2021
- “Is Curiosity All You Need? On the Utility of Emergent Behaviors from Curious Exploration”, et al 2021
- “Bootstrapped Meta-Learning”, et al 2021
- “Open-Ended Learning Leads to Generally Capable Agents”, et al 2021
- “Learning a Large Neighborhood Search Algorithm for Mixed Integer Programs”, et al 2021
- “Why Generalization in RL Is Difficult: Epistemic POMDPs and Implicit Partial Observability”, et al 2021
- “Imitation-Driven Cultural Collapse”, Duran-Nebreda 2021
- “Multi-Task Curriculum Learning in a Complex, Visual, Hard-Exploration Domain: Minecraft”, et al 2021
- “Learning to Hesitate”, et al 2021
- “Planning for Novelty: Width-Based Algorithms for Common Problems in Control, Planning and Reinforcement Learning”, 2021
- “Trajectory Transformer: Reinforcement Learning As One Big Sequence Modeling Problem”, et al 2021
- “From Motor Control to Team Play in Simulated Humanoid Football”, et al 2021
- “Reward Is Enough”, et al 2021
- “Principled Exploration via Optimistic Bootstrapping and Backward Induction”, et al 2021
- “Intelligence and Unambitiousness Using Algorithmic Information Theory”, et al 2021
- “Deep Bandits Show-Off: Simple and Efficient Exploration With Deep Networks”, 2021
- “On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning”, et al 2021
- “What Are Bayesian Neural Network Posteriors Really Like?”, et al 2021
- “Epistemic Autonomy: Self-Supervised Learning in the Mammalian Hippocampus”, Santos-Pata et al 2021
- “Bayesian Optimization Is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020”, et al 2021
- “Flexible Modulation of Sequence Generation in the Entorhinal-Hippocampal System”, et al 2021
- “Reinforcement Learning, Bit by Bit”, et al 2021
- “Asymmetric Self-Play for Automatic Goal Discovery in Robotic Manipulation”, OpenAI et al 2021
- “Informational Herding, Optimal Experimentation, and Contrarianism”, et al 2021
- “Go-Explore: First Return, Then Explore”, et al 2021
- “TacticZero: Learning to Prove Theorems from Scratch With Deep Reinforcement Learning”, et al 2021
- “Proof Artifact Co-Training for Theorem Proving With Language Models”, et al 2021
- “The MineRL 2020 Competition on Sample Efficient Reinforcement Learning Using Human Priors”, et al 2021
- “Curriculum Learning: A Survey”, et al 2021
- “MAP-Elites Enables Powerful Stepping Stones and Diversity for Modular Robotics”, et al 2021
- “Is Pessimism Provably Efficient for Offline RL?”, et al 2020
- “Monte-Carlo Graph Search for AlphaZero”, et al 2020
- “Imitating Interactive Intelligence”, et al 2020
- “Emergent Complexity and Zero-Shot Transfer via Unsupervised Environment Design”, et al 2020
- “Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian”, Parker-Holder et al 2020
- “Meta-Trained Agents Implement Bayes-Optimal Agents”, et al 2020
- “Learning Not to Learn: Nature versus Nurture in Silico”, 2020
- “The Child As Hacker”, et al 2020
- “Assessing Game Balance With AlphaZero: Exploring Alternative Rule Sets in Chess”, et al 2020
- “The Temporal Dynamics of Opportunity Costs: A Normative Account of Cognitive Fatigue and Boredom”, et al 2020
- “The Overfitted Brain: Dreams Evolved to Assist Generalization”, 2020
- “Exploration Strategies in Deep Reinforcement Learning”, 2020
- “Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search”, et al 2020
- “Automatic Discovery of Interpretable Planning Strategies”, et al 2020
- “IJON: Exploring Deep State Spaces via Fuzzing”, et al 2020
- “Planning to Explore via Self-Supervised World Models”, et al 2020
- “Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems”, et al 2020
- “Pitfalls of Learning a Reward Function Online”, et al 2020
- “First Return, Then Explore”, et al 2020
- “Real World Games Look Like Spinning Tops”, et al 2020
- “Approximate Exploitability: Learning a Best Response in Large Games”, et al 2020
- “Agent57: Outperforming the Atari Human Benchmark”, et al 2020
- “Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and Their Solutions”, et al 2020
- “Meta-Learning Curiosity Algorithms”, et al 2020
- “Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey”, et al 2020
- “AutoML-Zero: Evolving Machine Learning Algorithms From Scratch”, et al 2020
- “Never Give Up: Learning Directed Exploration Strategies”, et al 2020
- “Effective Diversity in Population Based Reinforcement Learning”, Parker-Holder et al 2020
- “Near-Perfect Point-Goal Navigation from 2.5 Billion Frames of Experience”, 2020
- “MicrobatchGAN: Stimulating Diversity With Multi-Adversarial Discrimination”, et al 2020
- “Learning Human Objectives by Evaluating Hypothetical Behavior”, et al 2019
- “Optimal Policies Tend to Seek Power”, et al 2019
- “DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames”, et al 2019
- “Emergent Tool Use From Multi-Agent Autocurricula”, et al 2019
- “Emergent Tool Use from Multi-Agent Interaction § Surprising Behavior”, et al 2019
- “R2D3: Making Efficient Use of Demonstrations to Solve Hard Exploration Problems”, et al 2019
- “Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment”, et al 2019
- “A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment”, et al 2019
- “An Optimistic Perspective on Offline Reinforcement Learning”, et al 2019
- “Meta Reinforcement Learning”, 2019
- “Search on the Replay Buffer: Bridging Planning and Reinforcement Learning”, et al 2019
- “ICML 2019 Notes”, 2019
- “Human-Level Performance in 3D Multiplayer Games With Population-Based Reinforcement Learning”, et al 2019
- “AI-GAs: AI-Generating Algorithms, an Alternate Paradigm for Producing General Artificial Intelligence”, 2019
- “Learning to Reason in Large Theories without Imitation”, et al 2019
- “Reinforcement Learning, Fast and Slow”, et al 2019
- “Meta Reinforcement Learning As Task Inference”, et al 2019
- “Meta-Learning of Sequential Strategies”, et al 2019
- “The MineRL 2019 Competition on Sample Efficient Reinforcement Learning Using Human Priors”, et al 2019
- “Π-IW: Deep Policies for Width-Based Planning in Pixel Domains”, et al 2019
- “Learning To Follow Directions in Street View”, et al 2019
- “A Generalized Framework for Population Based Training”, et al 2019
- “Go-Explore: a New Approach for Hard-Exploration Problems”, et al 2019
- “Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions”, et al 2019
- “Is the FDA Too Conservative or Too Aggressive?: A Bayesian Decision Analysis of Clinical Trial Design”, et al 2019
- “V-Fuzz: Vulnerability-Oriented Evolutionary Fuzzing”, et al 2019
- “Common Neural Code for Reward and Information Value”, 2019
- “Machine-Learning-Guided Directed Evolution for Protein Engineering”, et al 2019
- “Enjoy It Again: Repeat Experiences Are Less Repetitive Than People Think”, 2019
- “Evolutionary-Neural Hybrid Agents for Architecture Search”, et al 2018
- “The Bayesian Superorganism III: Externalized Memories Facilitate Distributed Sampling”, et al 2018
- “Exploration in the Wild”, et al 2018
- “Off-Policy Deep Reinforcement Learning without Exploration”, et al 2018
- “An Introduction to Deep Reinforcement Learning”, François-Lavet et al 2018
- “The Bayesian Superorganism I: Collective Probability Estimation”, et al 2018
- “Exploration by Random Network Distillation”, et al 2018
- “Computational Noise in Reward-Guided Learning Drives Behavioral Variability in Volatile Environments”, et al 2018
- “RND: Large-Scale Study of Curiosity-Driven Learning”, et al 2018
- “Visual Reinforcement Learning With Imagined Goals”, et al 2018
- “Is Q-Learning Provably Efficient?”, et al 2018
- “Improving Width-Based Planning With Compact Policies”, et al 2018
- “Construction of Arbitrarily Strong Amplifiers of Natural Selection Using Evolutionary Graph Theory”, et al 2018
- “Re-Evaluating Evaluation”, et al 2018
- “DVRL: Deep Variational Reinforcement Learning for POMDPs”, et al 2018
- “Mix&Match—Agent Curricula for Reinforcement Learning”, et al 2018
- “Playing Hard Exploration Games by Watching YouTube”, et al 2018
- “Observe and Look Further: Achieving Consistent Performance on Atari”, et al 2018
- “Generalization and Search in Risky Environments”, et al 2018
- “Toward Diverse Text Generation With Inverse Reinforcement Learning”, et al 2018
- “Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution”, et al 2018
- “Learning to Navigate in Cities Without a Map”, et al 2018
- “The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities”, et al 2018
- “Some Considerations on Learning to Explore via Meta-Reinforcement Learning”, et al 2018
- “Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling”, et al 2018
- “Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration”, et al 2018
- “Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari”, et al 2018
- “One Big Net For Everything”, 2018
- “Learning to Search With MCTSnets”, et al 2018
- “Learning and Querying Fast Generative Models for Reinforcement Learning”, et al 2018
- “Safe Exploration in Continuous Action Spaces”, et al 2018
- “Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning”, et al 2018
- “Deep Reinforcement Fuzzing”, et al 2018
- “Planning Chemical Syntheses With Deep Neural Networks and Symbolic AI”, et al 2018
- “Generalization Guides Human Exploration in Vast Decision Spaces”, et al 2018
- “Innovation and Cumulative Culture through Tweaks and Leaps in Online Programming Contests”, et al 2018
- “A Flexible Approach to Automated RNN Architecture Generation”, et al 2017
- “Finding Competitive Network Architectures Within a Day Using UCT”, 2017
- “Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents”, et al 2017
- “Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning”, et al 2017
- “The Paradoxical Sustainability of Periodic Migration and Habitat Destruction”, 2017
- “Posterior Sampling for Large Scale Reinforcement Learning”, et al 2017
- “Policy Optimization by Genetic Distillation”, 2017
- “Emergent Complexity via Multi-Agent Competition”, et al 2017
- “An Analysis of the Value of Information When Exploring Stochastic, Discrete Multi-Armed Bandits”, 2017
- “The Uncertainty Bellman Equation and Exploration”, et al 2017
- “Changing Their Tune: How Consumers’ Adoption of Online Streaming Affects Music Consumption and Discovery”, et al 2017
- “A Rational Choice Framework for Collective Behavior”, 2017
- “Imagination-Augmented Agents for Deep Reinforcement Learning”, et al 2017
- “Distral: Robust Multitask Reinforcement Learning”, et al 2017
- “The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously”, et al 2017
- “Emergence of Locomotion Behaviors in Rich Environments”, et al 2017
- “Noisy Networks for Exploration”, et al 2017
- “CAN: Creative Adversarial Networks, Generating ‘Art’ by Learning About Styles and Deviating from Style Norms”, et al 2017
- “Device Placement Optimization With Reinforcement Learning”, et al 2017
- “Towards Synthesizing Complex Programs from Input-Output Examples”, et al 2017
- “Scalable Generalized Linear Bandits: Online Computation and Hashing”, et al 2017
- “DeepXplore: Automated Whitebox Testing of Deep Learning Systems”, et al 2017
- “Recurrent Environment Simulators”, et al 2017
- “Learned Optimizers That Scale and Generalize”, et al 2017
- “Evolution Strategies As a Scalable Alternative to Reinforcement Learning”, et al 2017
- “Large-Scale Evolution of Image Classifiers”, et al 2017
- “CoDeepNEAT: Evolving Deep Neural Networks”, et al 2017
- “Rotting Bandits”, et al 2017
- “Neural Combinatorial Optimization With Reinforcement Learning”, et al 2017
- “Neural Data Filter for Bootstrapping Stochastic Gradient Descent”, et al 2017
- “Search in Patchy Media: Exploitation-Exploration Tradeoff”
- “Towards Information-Seeking Agents”, et al 2016
- “Exploration and Exploitation of Victorian Science in Darwin’s Reading Notebooks”, et al 2016
- “Learning to Learn without Gradient Descent by Gradient Descent”, et al 2016
- “Learning to Perform Physics Experiments via Deep Reinforcement Learning”, et al 2016
- “Neural Architecture Search With Reinforcement Learning”, 2016
- “Combating Reinforcement Learning’s Sisyphean Curse With Intrinsic Fear”, et al 2016
- “Bayesian Reinforcement Learning: A Survey”, et al 2016
- “Human Collective Intelligence As Distributed Bayesian Inference”, et al 2016
- “Universal Darwinism As a Process of Bayesian Inference”, 2016
- “Unifying Count-Based Exploration and Intrinsic Motivation”, et al 2016
- “D-TS: Double Thompson Sampling for Dueling Bandits”, 2016
- “Improving Information Extraction by Acquiring External Evidence With Reinforcement Learning”, et al 2016
- “Deep Exploration via Bootstrapped DQN”, et al 2016
- “The Netflix Recommender System”, Gomez-Uribe 2015
- “On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models”, 2015
- “Online Batch Selection for Faster Training of Neural Networks”, 2015
- “MAP-Elites: Illuminating Search Spaces by Mapping Elites”, 2015
- “What My Deep Model Doesn’t Know…”, 2015
- “The Psychology and Neuroscience of Curiosity”, 2015
- “Thompson Sampling With the Online Bootstrap”, 2014
- “On the Complexity of Best Arm Identification in Multi-Armed Bandit Models”, et al 2014
- “Robots That Can Adapt like Animals”, et al 2014
- “Freeze-Thaw Bayesian Optimization”, et al 2014
- “Search for the Wreckage of Air France Flight AF 447”, et al 2014
- “(More) Efficient Reinforcement Learning via Posterior Sampling”, et al 2013
- “Model-Based Bayesian Exploration”, et al 2013
- “PUCT: Continuous Upper Confidence Trees With Polynomial Exploration-Consistency”, et al 2013
- “(More) Efficient Reinforcement Learning via Posterior Sampling [PSRL]”, 2013
- “Experimental Design for Partially Observed Markov Decision Processes”, 2012
- “Learning Is Planning: near Bayes-Optimal Reinforcement Learning via Monte-Carlo Tree Search”, 2012
- “PILCO: A Model-Based and Data-Efficient Approach to Policy Search”, 2011
- “Abandoning Objectives: Evolution Through the Search for Novelty Alone”, 2011
- “Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments”, et al 2011
- “Age-Fitness Pareto Optimization”, 2010
- “Monte-Carlo Planning in Large POMDPs”, 2010
- “Formal Theory of Creativity & Fun & Intrinsic Motivation (1990–2010)”, 2010
- “The Epistemic Benefit of Transient Diversity”, 2009
- “Specialization Effect and Its Influence on Memory and Problem Solving in Expert Chess Players”, et al 2009
- “Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes”, 2008
- “Pure Exploration for Multi-Armed Bandit Problems”, et al 2008
- “Exploiting Open-Endedness to Solve Problems Through the Search for Novelty”, 2008
- “Towards Efficient Evolutionary Design of Autonomous Robots”, 2008
- “Resilient Machines Through Continuous Self-Modeling”, et al 2006
- “ALPS: the Age-Layered Population Structure for Reducing the Problem of Premature Convergence”, 2006
- “Bayesian Adaptive Exploration”, 2003
- “NEAT: Evolving Neural Networks through Augmenting Topologies”, 2002
- “A Bayesian Framework for Reinforcement Learning”, 2000
- “Case Studies in Evolutionary Experimentation and Computation”, 2000
- “Efficient Progressive Sampling”, et al 1999b
- “Evolving 3D Morphology and Behavior by Competition”, 1994
- “Interactions between Learning and Evolution”, 1992
- “Evolution Strategy: Nature’s Way of Optimization”, 1989
- “The Analysis of Sequential Experiments With Feedback to Subjects”, 1981
- “Evolutionsstrategien”, 1977
- Evolutionsstrategie: Optimierung Technischer Systeme Nach Prinzipien Der Biologischen Evolution, 1973
- “The Usefulness of Useless Knowledge”, 1939
- “Curiosity Killed the Mario”
- “Brian Christian on Computer Science Algorithms That Tackle Fundamental and Universal Problems”
- “Solving Zelda With the Antithesis SDK”
- “Goodhart’s Law, Diversity and a Series of Seemingly Unrelated Toy Problems”
- “Why Generalization in RL Is Difficult: Epistemic POMDPs and Implicit Partial Observability [Blog]”
- Bayesian Optimization Book
- “Temporal Difference Learning and TD-Gammon”
- “An Experimental Design Perspective on Model-Based Reinforcement Learning [Blog]”
- “Safety-First AI for Autonomous Data Center Cooling and Industrial Control”
- “Pulling JPEGs out of Thin Air”
- “Curriculum For Reinforcement Learning”
- “Why Testing Self-Driving Cars in SF Is Challenging but Necessary”
- “Reinforcement Learning With Prediction-Based Rewards”
- “Prompting Diverse Ideas: Increasing AI Idea Variance”
- “You Need a Novelty Budget”
- “ChatGPT As Muse, Not Oracle”, 2024
- “Conditions for Mathematical Equivalence of Stochastic Gradient Descent and Natural Selection”
- “Probable Points and Credible Intervals, Part 2: Decision Theory”
- “AI Is Learning How to Create Itself”
- “Montezuma’s Revenge Solved by Go-Explore, a New Algorithm for Hard-Exploration Problems (Sets Records on Pitfall, Too)”
- “Monkeys Play Pac-Man”
- “Playing Montezuma’s Revenge With Intrinsic Motivation”
- Sort By Magic
- Wikipedia
- Miscellaneous
- Bibliography