Bibliography (4):

  1. ImageNet Large Scale Visual Recognition Challenge

  2. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale (Vision Transformer)

  3. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

  4. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding