Bibliography (12):

https://x.com/siyan_zhao/status/1805277462890492321
https://arxiv.org/pdf/2406.11233#page=16
https://gwern.net/doc/ai/nn/fully-connected/2024-zhao-figure1-llmshavemuchrougherdecisionboundariesthanmlpsorsvmsordecisiontrees.png
LLaMA-2: Open Foundation and Fine-Tuned Chat Models
Mistral-7B
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
https://openai.com/index/hello-gpt-4o/
GPT-3: Language Models are Few-Shot Learners