Bibliography (7):

LLaMA: Open and Efficient Foundation Language Models
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
‘end-to-end’ directory
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
https://www.together.ai/blog/redpajama-models-v1
https://github.com/openlm-research/open_llama
Wikipedia Bibliography:
1. https://en.wikipedia.org/wiki/Pruning_(artificial_neural_network)