Bibliography (17):

  1. tank#alternative-examples

  2. Proximal Policy Optimization Algorithms

  3. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

  4. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

  5. https://github.com/aypan17/reward-misspecification

  6. https://arxiv.org/pdf/2201.03544.pdf#page=6

  7. Unsolved Problems in ML Safety

  8. Flow: A Modular Learning Framework for Mixed Autonomy Traffic

  9. Reinforcement Learning for Optimization of COVID-19 Mitigation Policies

  10. Deep Reinforcement Learning for Closed-Loop Blood Glucose Control

  11. OpenAI/gym: A Toolkit for Developing and Comparing Reinforcement Learning Algorithms

  12. https://arxiv.org/pdf/2201.03544.pdf#page=3

  13. https://arxiv.org/pdf/2201.03544.pdf#page=4

  14. https://arxiv.org/pdf/2201.03544.pdf#page=2

  15. https://ai100.stanford.edu/gathering-strength-gathering-storms-one-hundred-year-study-artificial-intelligence-ai100-2021-study