Bibliography (4):
Proximal Policy Optimization Algorithms
https://github.com/hamishivi/EasyLM
https://github.com/allenai/open-instruct
https://huggingface.co/collections/allenai/tulu-v25-suite-66676520fd578080e126f618