Bibliography (3):
ImageNet Large Scale Visual Recognition Challenge
Compressive Transformers for Long-Range Sequence Modeling
Attention Is All You Need