https://colab.research.google.com/drive/1z1Sy7HXWPY8R295tNA-UrFYLfnBe0okl
CogView: Mastering Text-to-Image Generation via Transformers
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
RUDOLPH: One Hyper-Tasking Transformer Can Be Creative As DALL-E and GPT-3 and Smart As CLIP
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
VQ-GAN: Taming Transformers for High-Resolution Image Synthesis
Wikipedia Bibliography: