Bibliography (7):

  1. ImageNet Large Scale Visual Recognition Challenge

  2. Vision Transformer: An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

  3. ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

  4. DeepViT: Towards Deeper Vision Transformer

  5. Towards Learning Convolutions from Scratch

  6. Finding the Needle in the Haystack with Convolutions: On the Benefits of Architectural Bias

  7. Homotopy Analysis for Tensor PCA