Bibliography (5):

  1. Attention Is All You Need

  2. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

  3. Vision Transformer: An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

  4. MLP-Mixer: An all-MLP Architecture for Vision

  5. Wikipedia Bibliography:

    1. Rectifier (neural networks)