“Hierarchical Autoregressive Image Models With Auxiliary Decoders”, 2019-03-06 ():
Autoregressive generative models of images tend to be biased towards capturing local structure, and as a result they often produce samples which are lacking in terms of large-scale coherence.
To address this, we propose two methods [using PixelCNN] to learn discrete representations of images which abstract away local detail. We show that autoregressive models conditioned on these representations can produce high-fidelity reconstructions of images, and that we can train autoregressive priors on these representations that produce samples with large-scale coherence. We can recursively apply the learning procedure, yielding a hierarchy of progressively more abstract image representations.
We train hierarchical class-conditional autoregressive models on the ImageNet dataset and demonstrate that they are able to generate realistic images at resolutions of 128×128 and 256×256 pixels.
We also perform a human evaluation study comparing our models with both adversarial and likelihood-based state-of-the-art generative models.