"Self-Conditioned Image Generation via Generating Representations", 2023-12-06:
This paper presents Representation-Conditioned image Generation (RCG), a simple yet effective image generation framework which sets a new benchmark in class-unconditional image generation. RCG does not condition on any human annotations. Instead, it conditions on a self-supervised representation distribution [using MoCo v3] which is mapped from the image distribution using a pre-trained encoder.
During generation, RCG samples from this representation distribution using a representation diffusion model (RDM), then employs a pixel generator to produce image pixels conditioned on the sampled representation. This design provides substantial guidance throughout the generative process, resulting in high-quality image generation.
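The two-stage pipeline can be sketched minimally as follows. This is an illustrative toy, not the paper's implementation: a random linear map stands in for the frozen MoCo v3 encoder, a Gaussian fit to encoded training data stands in for the learned representation diffusion model (RDM), and a random linear decoder stands in for the pixel generator; all dimensions and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG_DIM, REP_DIM = 64, 16  # toy sizes, not the paper's

# Pretraining stage: a frozen self-supervised encoder maps images to
# representations. (Random linear map as a stand-in for MoCo v3.)
W_enc = rng.normal(size=(REP_DIM, IMG_DIM)) / np.sqrt(IMG_DIM)

def encode(image):
    return W_enc @ image

# Generation stage, step 1: sample from the representation distribution.
# (A Gaussian fit to encoded training data stands in for the RDM.)
train_images = rng.normal(size=(256, IMG_DIM))
reps = np.stack([encode(x) for x in train_images])
mu, sigma = reps.mean(axis=0), reps.std(axis=0)

def sample_representation():
    return mu + sigma * rng.normal(size=REP_DIM)

# Generation stage, step 2: a pixel generator produces an image
# conditioned on the sampled representation. (Random linear decoder here.)
W_dec = rng.normal(size=(IMG_DIM, REP_DIM)) / np.sqrt(REP_DIM)

def generate_pixels(rep):
    return W_dec @ rep

rep = sample_representation()
image = generate_pixels(rep)
```

The key structural point this sketch preserves is that no human annotation enters anywhere: the conditioning signal is sampled from a distribution learned over the encoder's own representation space.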
Tested on ImageNet 256×256, RCG achieves a Fréchet Inception Distance (FID) of 3.31 and an Inception Score (IS) of 253.4. These results not only improve the state-of-the-art of class-unconditional image generation but also rival the current leading methods in class-conditional image generation, bridging the long-standing performance gap between these two tasks.
Code is available on GitHub.