Bibliography (3):

  1. Interpreting GPT: the Logit Lens

  2. Attention Is All You Need

  3. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding