- Microsoft COCO: Common Objects in Context
- ImageInWords: Unlocking Hyper-Detailed Image Descriptions
- Evaluating Text-to-Visual Generation with Image-to-Text Generation
- https://github.com/google-deepmind/proactive_t2i_agents
- https://www.youtube.com/watch?v=HQgjLWp4Lo8