https://research.google/blog/cross-modal-contrastive-learning-for-text-to-image-generation/
Contrastive Representation Learning: A Framework and Review
XMC-GAN: Cross-Modal Contrastive Learning for Text-to-Image Generation
Microsoft COCO: Common Objects in Context