- See Also
- Links
- “MimicPlay: Long-Horizon Imitation Learning by Watching Human Play”, Et Al 2023
- “DreamerV3: Mastering Diverse Domains through World Models”, Et Al 2023
- “Merging Enzymatic and Synthetic Chemistry With Computational Synthesis Planning”, Et Al 2022
- “PALMER: Perception-Action Loop With Memory for Long-Horizon Planning”, Et Al 2022
- “CICERO: Human-level Play in the Game Of Diplomacy By Combining Language Models With Strategic Reasoning”, Et Al 2022
- “Online Learning and Bandits With Queried Hints”, Et Al 2022
- “Creating a Dynamic Quadrupedal Robotic Goalkeeper With Reinforcement Learning”, Et Al 2022
- “Top-down Design of Protein Nanomaterials With Reinforcement Learning”, Et Al 2022
- “Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies With One Objective (ALM)”, Et Al 2022
- “LaTTe: Language Trajectory TransformEr”, Et Al 2022
- “Learning With Combinatorial Optimization Layers: a Probabilistic Approach”, Et Al 2022
- “PI-ARS: Accelerating Evolution-Learned Visual-Locomotion With Predictive Information Representations”, Et Al 2022
- “Spatial Representation by Ramping Activity of Neurons in the Retrohippocampal Cortex”, Et Al 2022
- “Inner Monologue: Embodied Reasoning through Planning With Language Models”, Et Al 2022
- “LM-Nav: Robotic Navigation With Large Pre-Trained Models of Language, Vision, and Action”, Et Al 2022
- “DayDreamer: World Models for Physical Robot Learning”, Et Al 2022
- “Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos”, Et Al 2022
- “BYOL-Explore: Exploration by Bootstrapped Prediction”, Et Al 2022
- “Director: Deep Hierarchical Planning from Pixels”, Et Al 2022
- “Flexible Diffusion Modeling of Long Videos”, Et Al 2022
- “Housekeep: Tidying Virtual Households Using Commonsense Reasoning”, Et Al 2022
- “Semantic Exploration from Language Abstractions and Pretrained Representations”, Et Al 2022
- “Demonstrate Once, Imitate Immediately (DOME): Learning Visual Servoing for One-Shot Imitation Learning”, Et Al 2022
- “Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”, Et Al 2022
- “Reinforcement Learning With Action-Free Pre-Training from Videos”, Et Al 2022
- “On-the-fly Strategy Adaptation for Ad-hoc Agent Coordination”, Et Al 2022
- “VAPO: Affordance Learning from Play for Sample-Efficient Policy Learning”, Borja-Et Al 2022
- “Learning Synthetic Environments and Reward Networks for Reinforcement Learning”, Et Al 2022
- “LID: Pre-Trained Language Models for Interactive Decision-Making”, Et Al 2022
- “How to Build a Cognitive Map: Insights from Models of the Hippocampal Formation”, Et Al 2022
- “Language Models As Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents”, Et Al 2022
- “What Is the Point of Computers? A Question for Pure Mathematicians”, 2021
- “An Experimental Design Perspective on Model-Based Reinforcement Learning”, Et Al 2021
- “Reinforcement Learning on Human Decision Models for Uniquely Collaborative AI Teammates”, 2021
- “Learning Representations for Pixel-based Control: What Matters and Why?”, Et Al 2021
- “Learning Behaviors through Physics-driven Latent Imagination”, Et Al 2021
- “Is Bang-Bang Control All You Need? Solving Continuous Control With Bernoulli Policies”, Et Al 2021
- “Skill Induction and Planning With Latent Language”, Et Al 2021
- “Brax—A Differentiable Physics Engine for Large Scale Rigid Body Simulation”, Et Al 2021
- “FitVid: Overfitting in Pixel-Level Video Prediction”, Et Al 2021
- “A Graph Placement Methodology for Fast Chip Design”, Et Al 2021
- “The Whole Prefrontal Cortex Is Premotor Cortex”, 2021
- “PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World”, Et Al 2021
- “Constructions in Combinatorics via Neural Networks”, 2021
- “Machine Translation Decoding beyond Beam Search”, Et Al 2021
- “Replaying Real Life: How the Waymo Driver Avoids Fatal Human Crashes”, 2021
- “Latent Imagination Facilitates Zero-Shot Transfer in Autonomous Racing”, Et Al 2021
- “Waymo Simulated Driving Behavior in Reconstructed Fatal Crashes within an Autonomous Vehicle Operating Domain”, Et Al 2021
- “COMBO: Conservative Offline Model-Based Policy Optimization”, Et Al 2021
- “Solving Mixed Integer Programs Using Neural Networks”, Et Al 2020
- “ViNG: Learning Open-World Navigation With Visual Goals”, Et Al 2020
- “Targeting for Long-term Outcomes”, Et Al 2020
- “A Time Leap Challenge for SAT Solving”, Et Al 2020
- “RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning”, Et Al 2020
- “Learning to Simulate Dynamic Environments With GameGAN”, Et Al 2020
- “Learning to Simulate Dynamic Environments With GameGAN [homepage]”, Et Al 2020
- “Reinforcement Learning With Augmented Data”, Et Al 2020
- “Learning to Fly via Deep Model-Based Reinforcement Learning”, Becker-Et Al 2020
- “Introducing Dreamer: Scalable Reinforcement Learning Using World Models”, 2020
- “Learning to Prove Theorems by Learning to Generate Theorems”, 2020
- “The Gambler’s Problem and Beyond”, Et Al 2019
- “Dream to Control: Learning Behaviors by Latent Imagination”, Et Al 2019
- “Approximate Inference in Discrete Distributions With Monte Carlo Tree Search and Value Functions”, Et Al 2019
- “Designing Agent Incentives to Avoid Reward Tampering”, Et Al 2019
- “An Application of Reinforcement Learning to Aerobatic Helicopter Flight”, Et Al 2019
- “When to Trust Your Model: Model-Based Policy Optimization (MOPO)”, Et Al 2019
- “Fast Task Inference With Variational Intrinsic Successor Features”, Et Al 2019
- “Bayesian Layers: A Module for Neural Network Uncertainty”, Et Al 2018
- “PlaNet: Learning Latent Dynamics for Planning from Pixels”, Et Al 2018
- “Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning”, Et Al 2018
- “Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search”, Et Al 2018
- “The Alignment Problem for Bayesian History-Based Reinforcement Learners”, 2018
- “Neural Scene Representation and Rendering”, Et Al 2018
- “Mining Gold from Implicit Models to Improve Likelihood-free Inference”, Et Al 2018
- “Deep Reinforcement Learning in a Handful of Trials Using Probabilistic Dynamics Models”, Et Al 2018
- “Learning to Optimize Tensor Programs”, Et Al 2018
- “Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks With Existing Applications”, Et Al 2018
- “World Models”, 2018
- “Differentiable Dynamic Programming for Structured Prediction and Attention”, 2018
- “Generalization Guides Human Exploration in Vast Decision Spaces”, Et Al 2018
- “Planning Chemical Syntheses With Deep Neural Networks and Symbolic AI”, Et Al 2018
- “How to Explore Chemical Space Using Algorithms and Automation”, Et Al 2018
- “Safe Policy Search With Gaussian Process Models”, Et Al 2017
- “Analogical-based Bayesian Optimization”, Et Al 2017
- “A Game-Theoretic Analysis of the Off-Switch Game”, Et Al 2017
- “Neural Network Dynamics for Model-Based Deep Reinforcement Learning With Model-Free Fine-Tuning”, Et Al 2017
- “Learning Transferable Architectures for Scalable Image Recognition”, Et Al 2017
- “Learning Model-based Planning from Scratch”, Et Al 2017
- “Value Prediction Network”, Et Al 2017
- “Path Integral Networks: End-to-End Differentiable Optimal Control”, Et Al 2017
- “Visual Semantic Planning Using Deep Successor Representations”, Et Al 2017
- “AIXIjs: A Software Demo for General Reinforcement Learning”, 2017
- “DeepArchitect: Automatically Designing and Training Deep Architectures”, 2017
- “Stochastic Constraint Programming As Reinforcement Learning”, Et Al 2017
- “Recurrent Environment Simulators”, Et Al 2017
- “Prediction and Control With Temporal Segment Models”, Et Al 2017
- “The Kelly Coin-Flipping Game: Exact Solutions”, Et Al 2017
- “The Hippocampus As a Predictive Map”, Et Al 2017
- “The Predictron: End-To-End Learning and Planning”, Et Al 2016
- “Model-based Adversarial Imitation Learning”, Et Al 2016
- “DeepMath—Deep Sequence Models for Premise Selection”, Et Al 2016
- “Value Iteration Networks”, Et Al 2016
- “Resorting Media Ratings”, 2015
- “Compress and Control”, Et Al 2014
- “Learning to Win by Reading Manuals in a Monte-Carlo Framework”, Et Al 2014
- “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science”, 2013
- “PUCT: Continuous Upper Confidence Trees With Polynomial Exploration-Consistency”, Et Al 2013
- “Planning As Satisfiability: Heuristics”, 2012
- “A Monte Carlo AIXI Approximation”, Et Al 2009
- “Policy Mining: Learning Decision Policies from Fixed Sets of Data”, 2003
- “The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions”, 2002
- “A Critique of Pure Reason”, 1987
- “Human Window on the World”, 1985
- Wikipedia
- Miscellaneous
- Link Bibliography
See Also
Links
“MimicPlay: Long-Horizon Imitation Learning by Watching Human Play”, 2023-02-24
“DreamerV3: Mastering Diverse Domains through World Models”, 2023-01-10
“Merging enzymatic and synthetic chemistry with computational synthesis planning”, 2022-12-14
“PALMER: Perception-Action Loop with Memory for Long-Horizon Planning”, 2022-12-08
“CICERO: Human-level play in the game of Diplomacy by combining language models with strategic reasoning”, 2022-11-22
“Online Learning and Bandits with Queried Hints”, 2022-11-04
“Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning”, 2022-10-10
“Top-down design of protein nanomaterials with reinforcement learning”, 2022-09-25
“Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective (ALM)”, 2022-09-18
“LaTTe: Language Trajectory TransformEr”, 2022-08-04
“Learning with Combinatorial Optimization Layers: a Probabilistic Approach”, 2022-07-27
“PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations”, 2022-07-27
“Spatial representation by ramping activity of neurons in the retrohippocampal cortex”, 2022-07-26
“Inner Monologue: Embodied Reasoning through Planning with Language Models”, 2022-07-12
“LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action”, 2022-07-10
“DayDreamer: World Models for Physical Robot Learning”, 2022-06-28
“Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos”, 2022-06-23
“BYOL-Explore: Exploration by Bootstrapped Prediction”, 2022-06-16
“Director: Deep Hierarchical Planning from Pixels”, 2022-06-08
“Flexible Diffusion Modeling of Long Videos”, 2022-05-23
“Housekeep: Tidying Virtual Households using Commonsense Reasoning”, 2022-05-22
“Semantic Exploration from Language Abstractions and Pretrained Representations”, 2022-04-08
“Demonstrate Once, Imitate Immediately (DOME): Learning Visual Servoing for One-Shot Imitation Learning”, 2022-04-06
“Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”, 2022-04-04
“Reinforcement Learning with Action-Free Pre-Training from Videos”, 2022-03-25
“On-the-fly Strategy Adaptation for ad-hoc Agent Coordination”, 2022-03-08
“VAPO: Affordance Learning from Play for Sample-Efficient Policy Learning”, 2022-03-01
“Learning Synthetic Environments and Reward Networks for Reinforcement Learning”, 2022-02-06
“LID: Pre-Trained Language Models for Interactive Decision-Making”, 2022-02-03
“How to build a cognitive map: insights from models of the hippocampal formation”, 2022-02-03
“Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents”, 2022-01-18
“What is the point of computers? A question for pure mathematicians”, 2021-12-22
“An Experimental Design Perspective on Model-Based Reinforcement Learning”, 2021-12-09
“Reinforcement Learning on Human Decision Models for Uniquely Collaborative AI Teammates”, 2021-11-18
“Learning Representations for Pixel-based Control: What Matters and Why?”, 2021-11-15
“Learning Behaviors through Physics-driven Latent Imagination”, 2021-11-04
“Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies”, 2021-11-03
“Brax—A Differentiable Physics Engine for Large Scale Rigid Body Simulation”, 2021-06-24
“FitVid: Overfitting in Pixel-Level Video Prediction”, 2021-06-24
“A graph placement methodology for fast chip design”, 2021-06-09
“The whole prefrontal cortex is premotor cortex”, 2021-06-08
“PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World”, 2021-06-01
“Constructions in combinatorics via neural networks”, 2021-04-29
“Machine Translation Decoding beyond Beam Search”, 2021-04-12
“Replaying real life: how the Waymo Driver avoids fatal human crashes”, 2021-03-08
“Latent Imagination Facilitates Zero-Shot Transfer in Autonomous Racing”, 2021-03-08
“Waymo Simulated Driving Behavior in Reconstructed Fatal Crashes within an Autonomous Vehicle Operating Domain”, 2021-03-08
“COMBO: Conservative Offline Model-Based Policy Optimization”, 2021-02-16
“Solving Mixed Integer Programs Using Neural Networks”, 2020-12-23
“ViNG: Learning Open-World Navigation with Visual Goals”, 2020-12-17
“Targeting for long-term outcomes”, 2020-10-29
“A Time Leap Challenge for SAT Solving”, 2020-08-05
“RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning”, 2020-06-24
“Learning to Simulate Dynamic Environments with GameGAN”, 2020-05-25
“Learning to Simulate Dynamic Environments with GameGAN [homepage]”, 2020-05
“Reinforcement Learning with Augmented Data”, 2020-04-30
“Learning to Fly via Deep Model-Based Reinforcement Learning”, 2020-03-19
“Introducing Dreamer: Scalable Reinforcement Learning Using World Models”, 2020-03-18
“Learning to Prove Theorems by Learning to Generate Theorems”, 2020-02-17
“The Gambler’s Problem and Beyond”, 2019-12-31
“Dream to Control: Learning Behaviors by Latent Imagination”, 2019-12-03
“Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions”, 2019-10-15
“Designing agent incentives to avoid reward tampering”, 2019-08-14
“An Application of Reinforcement Learning to Aerobatic Helicopter Flight”, 2019-07-16
“When to Trust Your Model: Model-Based Policy Optimization (MOPO)”, 2019-06-19
“Fast Task Inference with Variational Intrinsic Successor Features”, 2019-06-12
“Bayesian Layers: A Module for Neural Network Uncertainty”, 2018-12-10
“PlaNet: Learning Latent Dynamics for Planning from Pixels”, 2018-11-12
“Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning”, 2018-11-04
“Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search”, 2018-07-18
“The Alignment Problem for Bayesian History-Based Reinforcement Learners”, 2018-06-22
“Neural scene representation and rendering”, 2018-06-15
“Mining gold from implicit models to improve likelihood-free inference”, 2018-05-30
“Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models”, 2018-05-30
“Learning to Optimize Tensor Programs”, 2018-05-21
“Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications”, 2018-04-24
“World Models”, 2018-03-27
“Differentiable Dynamic Programming for Structured Prediction and Attention”, 2018-02-11
“Generalization Guides Human Exploration in Vast Decision Spaces”, Et Al 2018
“Planning Chemical Syntheses With Deep Neural Networks and Symbolic AI”, Et Al 2018
“How to Explore Chemical Space Using Algorithms and Automation”, Et Al 2018
“Safe Policy Search with Gaussian Process Models”, 2017-12-15
“Analogical-based Bayesian Optimization”, 2017-09-19
“A Game-Theoretic Analysis of the Off-Switch Game”, 2017-08-13
“Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning”, 2017-08-08
“Learning Transferable Architectures for Scalable Image Recognition”, 2017-07-21
“Learning model-based planning from scratch”, 2017-07-19
“Value Prediction Network”, 2017-07-11
“Path Integral Networks: End-to-End Differentiable Optimal Control”, 2017-06-29
“Visual Semantic Planning using Deep Successor Representations”, 2017-05-23
“AIXIjs: A Software Demo for General Reinforcement Learning”, 2017-05-22
“DeepArchitect: Automatically Designing and Training Deep Architectures”, 2017-04-28
“Stochastic Constraint Programming as Reinforcement Learning”, 2017-04-24
“Recurrent Environment Simulators”, 2017-04-07
“Prediction and Control with Temporal Segment Models”, 2017-03-12
“The Kelly Coin-Flipping Game: Exact Solutions”, 2017-01-19
“The hippocampus as a predictive map”, 2017
“The Predictron: End-To-End Learning and Planning”, 2016-12-28
“Model-based Adversarial Imitation Learning”, 2016-12-07
“DeepMath—Deep Sequence Models for Premise Selection”, 2016-06-14
“Value Iteration Networks”, 2016-02-09
“Resorting Media Ratings”, 2015-09-07
“Compress and Control”, 2014-11-19
“Whatever next? Predictive brains, situated agents, and the future of cognitive science”, 2013-06-01
“PUCT: Continuous Upper Confidence Trees with Polynomial Exploration-Consistency”, 2013
“Planning as satisfiability: Heuristics”, 2012-12
“A Monte Carlo AIXI Approximation”, 2009-09-04
“Policy Mining: Learning Decision Policies from Fixed Sets of Data”, 2003
“The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions”, 2002
“A critique of pure reason”, 1987-02-01
“Human Window on the World”, 1985
Wikipedia
Miscellaneous
Link Bibliography
- https://arxiv.org/abs/2301.04104#deepmind: “DreamerV3: Mastering Diverse Domains through World Models”, Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
- https://www.nature.com/articles/s41467-022-35422-y: “Merging Enzymatic and Synthetic Chemistry With Computational Synthesis Planning”, Itai Levin, Mengjie Liu, Christopher A. Voigt, Connor W. Coley
- https://www.science.org/doi/10.1126/science.ade9097#facebook: “CICERO: Human-level Play in the Game of Diplomacy by Combining Language Models With Strategic Reasoning”
- https://arxiv.org/abs/2209.08466: “Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies With One Objective (ALM)”, Raj Ghugare, Homanga Bharadhwaj, Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov
- https://arxiv.org/abs/2207.05608#google: “Inner Monologue: Embodied Reasoning through Planning With Language Models”
- https://arxiv.org/abs/2207.04429: “LM-Nav: Robotic Navigation With Large Pre-Trained Models of Language, Vision, and Action”, Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- https://arxiv.org/abs/2206.14176: “DayDreamer: World Models for Physical Robot Learning”, Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel
- https://arxiv.org/abs/2206.11795#openai: “Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos”
- https://arxiv.org/abs/2206.04114#google: “Director: Deep Hierarchical Planning from Pixels”, Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel
- https://arxiv.org/abs/2204.05080#deepmind: “Semantic Exploration from Language Abstractions and Pretrained Representations”
- https://arxiv.org/abs/2204.01691#google: “Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”
- https://arxiv.org/abs/2106.13281#google: “Brax—A Differentiable Physics Engine for Large Scale Rigid Body Simulation”, C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem
- https://blog.waymo.com/2021/03/replaying-real-life.html: “Replaying Real Life: How the Waymo Driver Avoids Fatal Human Crashes”, Waymo
- https://ai.googleblog.com/2020/03/introducing-dreamer-scalable.html: “Introducing Dreamer: Scalable Reinforcement Learning Using World Models”, Danijar Hafner
- 2018-everitt.pdf: “The Alignment Problem for Bayesian History-Based Reinforcement Learners”, Tom Everitt, Marcus Hutter
- coin-flip: “The Kelly Coin-Flipping Game: Exact Solutions”, Gwern Branwen, Arthur Breitman, nshepperd, FeepingCreature, Gurkenglas
- resorter: “Resorting Media Ratings”, Gwern Branwen