https://www.reddit.com/r/mlscaling/comments/1g9r65z/introducing_computer_use_a_new_claude_35_sonnet/lt89drx/
https://www.reddit.com/r/slatestarcodex/comments/1gsv897/gwern_on_the_diminishing_returns_to_scaling_and/
https://www.reddit.com/r/mlscaling/comments/1gswayg/gwern_on_the_diminishing_returns_to_scaling_and/
Chinchilla: Training Compute-Optimal Large Language Models
Wikipedia Bibliography: