Bibliography (6):

  1. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

  2. Measuring Mathematical Problem Solving With the MATH Dataset

  3. https://openai.com/index/gpt-4-research/

  4. https://openai.com/blog/chatgpt/

  5. LLaMA: Open and Efficient Foundation Language Models

  6. https://github.com/GanjinZero/math401-llm