- CogView: Mastering Text-to-Image Generation via Transformers
- https://agc.platform.baai.ac.cn/CogView/index.html
- https://model.baai.ac.cn/model-detail/100041
- MAE: Masked Autoencoders Are Scalable Vision Learners
- Hierarchical Text-Conditional Image Generation with CLIP Latents
- https://github.com/THUDM/CogView2
-