Similar Links:
A Solvable Model of Neural Scaling Laws
Scaling Laws for Neural Language Models
Scaling Laws for Autoregressive Generative Modeling
Multi-Game Decision Transformers
WebGPT: Browser-assisted question-answering with human feedback
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
Ethan Caballero on Private Scaling Progress
Extrapolating GPT-N performance