Bibliography (13):

  1. https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k

  2. An Open Source Implementation of CLIP

  3. ImageNet: A Large-Scale Hierarchical Image Database

  4. Microsoft COCO: Common Objects in Context

  5. Stable Diffusion Public Release

  6. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale

  7. ImageNet Large Scale Visual Recognition Challenge

  8. https://arxiv.org/abs/2212.00794

  9. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

  10. Reproducible scaling laws for contrastive language-image learning

  11. https://laion.ai/blog/large-openclip/

  12. Training Deep Nets with Sublinear Memory Cost