“Diffusion Models Beat GANs on Image Synthesis”, Prafulla Dhariwal & Alex Nichol (2021-05-11)⁠:

We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models.

We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for sample quality using gradients from a classifier.
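The abstract does not spell out the mechanics, but the idea of classifier guidance is to perturb each Gaussian reverse-diffusion step using the gradient of a noisy-image classifier's log-probability for the target class, with a scale factor controlling the diversity/fidelity trade-off. A minimal sketch of the guided-mean update, assuming elementwise means and a diagonal covariance (the function name and argument layout are illustrative, not from the paper):

```python
def classifier_guided_mean(mu, sigma2, grad_log_prob, scale=1.0):
    """Shift a reverse-step mean toward a class label.

    Sketch of classifier guidance: the unguided transition mean `mu`
    is perturbed by the classifier score, elementwise:
        mu_hat[i] = mu[i] + scale * sigma2[i] * grad_log_prob[i]
    where grad_log_prob is d/dx log p(y | x_t) from a classifier
    trained on noised images. Larger `scale` trades diversity
    for sample quality.
    """
    return [m + scale * s2 * g
            for m, s2, g in zip(mu, sigma2, grad_log_prob)]
```

With `scale=0` this reduces to ordinary (unguided) sampling; increasing the scale pushes samples toward images the classifier assigns high probability to the desired class, at the cost of mode coverage.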

We achieve an FID of 2.97 on ImageNet 128×128, 4.59 on ImageNet 256×256, and 7.72 on ImageNet 512×512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. [nearest-neighbors]

Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.85 on ImageNet 512×512.

We release our code [and checkpoints].