Bibliography (4):
https://arxiv.org/abs/2303.08774
https://openai.com/index/gpt-4-research/
MMLU: Measuring Massive Multitask Language Understanding
Wikipedia Bibliography:
Calibration (statistics)