Bibliography (4):

  1. VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners

  2. Contrastive Representation Learning: A Framework and Review

  3. CoCa: Contrastive Captioners are Image-Text Foundation Models