Bibliography (3):

  1. MMLU: Measuring Massive Multitask Language Understanding

  2. https://github.com/EQ-bench/EQ-Bench

  3. https://eqbench.com/