Bibliography (5):

https://openai.com/blog/chatgpt/
GPT-3: Language Models are Few-Shot Learners
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Wikipedia Bibliography:
1. ROUGE (metric)
2. https://en.wikipedia.org/wiki/F-score :
  
  https://en.wikipedia.org/wiki/F-score