“‘Offline RL’ Tag”, 2020-05-04:
Bibliography for tag reinforcement-learning/offline, most recent first: 3 related tags, 52 annotations, & 10 links (parent).
- See Also
- Links
- “Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data”, et al 2024
- “Dataset Reset Policy Optimization for RLHF”, et al 2024
- “Mastering Stacking of Diverse Shapes With Large-Scale Iterative Reinforcement Learning on Real Robots”, et al 2023
- “Vision-Language Models As a Source of Rewards”, et al 2023
- “Beyond Human Data: Scaling Self-Training for Problem-Solving With Language Models (ReST^EM)”, et al 2023
- “Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations”, et al 2023
- “Course Correcting Koopman Representations”, et al 2023
- “Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions”, et al 2023
- “Subwords As Skills: Tokenization for Sparse-Reward Reinforcement Learning”, et al 2023
- “What Are Dreams For? Converging Lines of Research Suggest That We Might Be Misunderstanding Something We Do Every Night of Our Lives”, 2023
- “ReST: Reinforced Self-Training (ReST) for Language Modeling”, et al 2023
- “AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning”, et al 2023
- “Learning to Model the World With Language”, et al 2023
- “Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior”, et al 2023
- “PASTA: Pretrained Action-State Transformer Agents”, et al 2023
- “Fighting Uncertainty With Gradients: Offline Reinforcement Learning via Diffusion Score Matching”, et al 2023
- “Twitching in Sensorimotor Development from Sleeping Rats to Robots”, et al 2023
- “Survival Instinct in Offline Reinforcement Learning”, et al 2023
- “BetaZero: Belief-State Planning for Long-Horizon POMDPs Using Learned Approximations”, et al 2023
- “Improving Language Models With Advantage-Based Offline Policy Gradients”, et al 2023
- “Revisiting the Minimalist Approach to Offline Reinforcement Learning”, et al 2023
- “Think Before You Act: Unified Policy for Interleaving Language Reasoning With Actions”, et al 2023
- “Off-The-Grid MARL (OG-MARL): Datasets With Baselines for Offline Multi-Agent Reinforcement Learning”, et al 2023
- “Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes”, et al 2022
- “Dungeons and Data: A Large-Scale NetHack Dataset”, et al 2022
- “In-Context Reinforcement Learning With Algorithm Distillation”, et al 2022
- “CORL: Research-Oriented Deep Offline Reinforcement Learning Library”, et al 2022
- “Diffusion-QL: Diffusion Policies As an Expressive Policy Class for Offline Reinforcement Learning”, et al 2022
- “Offline RL Policies Should Be Trained to Be Adaptive”, et al 2022
- “Prompting Decision Transformer for Few-Shot Policy Generalization”, et al 2022
- “Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos”, et al 2022
- “Large-Scale Retrieval for Reinforcement Learning”, et al 2022
- “Offline RL for Natural Language Generation With Implicit Language Q Learning”, et al 2022
- “When Does Return-Conditioned Supervised Learning Work for Offline Reinforcement Learning?”, et al 2022
- “Newton’s Method for Reinforcement Learning and Model Predictive Control”, 2022
- “You Can’t Count on Luck: Why Decision Transformers Fail in Stochastic Environments”, et al 2022
- “Multi-Game Decision Transformers”, et al 2022
- “When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?”, et al 2022
- “Don’t Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning (ExORL)”, et al 2022
- “Offline Pre-Trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks”, et al 2021
- “A Workflow for Offline Model-Free Robotic Reinforcement Learning”, et al 2021
- “Conservative Objective Models for Effective Offline Model-Based Optimization”, et al 2021
- “A Minimalist Approach to Offline Reinforcement Learning”, 2021
- “Is Pessimism Provably Efficient for Offline RL?”, et al 2020
- “What Are the Statistical Limits of Offline RL With Linear Function Approximation?”, et al 2020
- “MOPO: Model-Based Offline Policy Optimization”, et al 2020
- “Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems”, et al 2020
- “D4RL: Datasets for Deep Data-Driven Reinforcement Learning”, et al 2020
- “Q✱ Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison”, 2020
- “Scaling Data-Driven Robotics With Reward Sketching and Batch Reinforcement Learning”, et al 2019
- “QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation”, et al 2018
- “The Netflix Recommender System”, Gomez-2015
- Wikipedia
- Miscellaneous
- Bibliography