Bibliography (5):

  1. DeepSeek-V3 Technical Report

  2. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

  3. MMLU: Measuring Massive Multitask Language Understanding

  4. https://openai.com/index/gpt-4-research/

  5. Wikipedia Bibliography:

    1. Reinforcement learning