Broken Neural Scaling Laws
https://ethancaballero.github.io/
https://scholar.google.com/citations?user=KvLJAf0AAAAJ
https://x.com/ethanCaballero
GPT-3: Language Models are Few-Shot Learners
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Chinchilla: Training Compute-Optimal Large Language Models
Introducing Adept
https://www.cnbc.com/2022/03/29/inflection-ai-reid-hoffmans-start-up-poaches-staff-from-google-meta.html
https://bmk.sh/
Attention Is All You Need
Scaling Laws for Autoregressive Generative Modeling