β βend-to-endβ directory
Vision Transformer: An Image is Worth 16Γ16 Words: Transformers for Image Recognition at Scale
Language Models are Unsupervised Multitask Learners
ImageNet Large Scale Visual Recognition Challenge
https://github.com/EleutherAI/openwebtext
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding