Bibliography (6):

https://github.com/allenai/rl4lms
https://rl4lms.apps.allenai.org/
Proximal Policy Optimization Algorithms
Wikipedia Bibliography:
1. Reinforcement learning
2. Open source
3. Hugging Face