Bibliography (7):

  1. https://x.com/GhugareRaj/status/1572228478115934209

  2. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

  3. When to Trust Your Model: Model-Based Policy Optimization (MBPO)

  4. Randomized Ensembled Double Q-Learning: Learning Fast Without a Model (REDQ)

  5. Addressing Function Approximation Error in Actor-Critic Methods (TD3)