“John Schulman’s Homepage”, John Schulman (Anthropic, model-free RL, OA, preference learning)