Bibliography (3):

  1. ImageNet Large Scale Visual Recognition Challenge

  2. Compressive Transformers for Long-Range Sequence Modeling

  3. Attention Is All You Need