- Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
- Scaling Language Models: Methods, Analysis & Insights from Training Gopher
- GPT-3: Language Models are Few-Shot Learners
- Program Synthesis with Large Language Models
- MMLU: Measuring Massive Multitask Language Understanding
- Who Models the Models That Model Models? An Exploration of GPT-3’s In-Context Model Fitting Ability
- The MovieLens Datasets: History and Context
- Random_ai_poems.txt
- https://tedunderwood.com/2021/02/02/why-sf-hasnt-prepared-us-to-imagine-machine-learning/