Bibliography (5):

  1. Attention Is All You Need

  2. FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

  3. Language Models are Few-Shot Learners (GPT-3)