BLOOM (bigscience/bloom)
OPT: Open Pre-trained Transformer Language Models
T0: Multitask Prompted Training Enables Zero-Shot Task Generalization
GPT-J-6B: 6B JAX-Based Transformer
Ask Me Anything (AMA) Prompting: https://github.com/HazyResearch/ama_prompting
Scaling Laws for Neural Language Models
Chinchilla: Training Compute-Optimal Large Language Models
Emergent Abilities of Large Language Models
EleutherAI