Bibliography (5):
Attention Is All You Need
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
GPT-3: Language Models are Few-Shot Learners
Wikipedia Bibliography:
Recommendation systems: https://en.wikipedia.org/wiki/Recommendation_systems
Power law: https://en.wikipedia.org/wiki/Power_law