- https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
- An Open Source Implementation of CLIP
- ImageNet: A Large-Scale Hierarchical Image Database
- Microsoft COCO: Common Objects in Context
- Stable Diffusion Public Release
- Vision Transformer: An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale
- ImageNet Large Scale Visual Recognition Challenge
- https://arxiv.org/abs/2212.00794
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
- Reproducible scaling laws for contrastive language-image learning
- https://laion.ai/blog/large-openclip/
- Training Deep Nets with Sublinear Memory Cost
-