Bibliography (8):

  1. Attention Is All You Need

  2. https://arxiv.org/abs/2202.06281

  3. Layer Normalization

  4. https://arxiv.org/abs/2110.13784

  5. SuperGLUE Benchmark