Bibliography (8):

Linear Representations of Sentiment in Large Language Models
https://reasoning-tokens.ghost.io/reasoning-tokens/
Vision Transformers Need Registers
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
https://rajpurkar.github.io/SQuAD-explorer/
https://www.tau-nlp.org/commonsenseqa
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Wikipedia Bibliography:
1. https://en.wikipedia.org/wiki/C4_(dataset)