Bibliography (8):

GPT-3: Language Models are Few-Shot Learners
MMLU: Measuring Massive Multitask Language Understanding
PaLM: Scaling Language Modeling with Pathways
Measuring Mathematical Problem Solving With the MATH Dataset
PubMedQA: A Dataset for Biomedical Research Question Answering
Bigscience/bloom
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Wikipedia Bibliography:
1. LaTeX