“DualVAE: Controlling Colors of Generated and Real Images”, Keerth Rathakumar, David Liebowitz, Christian Walder, Kristen Moore, Salil S. Kanhere, 2023-05-30:

Color-controlled image generation and manipulation are of interest to artists and graphic designers. Vector Quantized Variational Autoencoders (VQ-VAEs) with an autoregressive (AR) prior can produce high-quality images, but they lack an explicit representation mechanism for controlling color attributes. We introduce DualVAE, a hybrid representation model that provides such control by learning disentangled representations for color and geometry. The geometry is represented by an image intensity mapping that identifies structural features. The disentangled representation is obtained by two novel mechanisms: (1) a dual-branch architecture that separates image color attributes from geometric attributes, and (2) a new ELBO that trains the combined color and geometry representations. DualVAE can control the color of generated images, and it can recolor existing images by transferring the color latent representation obtained from an exemplar image. We demonstrate that DualVAE generates images with FID nearly two times better than VQ-GAN on a diverse collection of datasets, including animated faces, logos, and artistic landscapes.
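
To make the recoloring idea concrete, the following is a minimal, non-learned sketch of exemplar-based color transfer. It is not the DualVAE model: the abstract does not specify the encoders, the latent spaces, or the ELBO, so this toy version assumes that "geometry" can be approximated by the per-pixel channel mean (an intensity map) and that "color" can be summarized by a single global chroma vector. The function names `split_geometry_color`, `combine`, and `recolor` are hypothetical and chosen only for illustration.

```python
import numpy as np

def split_geometry_color(image):
    """Split an RGB image of shape (H, W, 3) into two toy codes.

    Assumption for illustration only: geometry is the per-pixel intensity
    and color is one global chroma vector. DualVAE learns both instead.
    """
    geometry = image.mean(axis=-1)            # (H, W) intensity map
    chroma = image - geometry[..., None]      # per-pixel offset from intensity
    color_code = chroma.mean(axis=(0, 1))     # (3,) global color summary
    return geometry, color_code

def combine(geometry, color_code):
    """Rebuild an RGB image from an intensity map and a global color code."""
    # No clipping is applied, so values may leave [0, 1] for extreme inputs.
    return geometry[..., None] + color_code

def recolor(target, exemplar):
    """Keep the target's geometry and take the color code from the exemplar."""
    geometry, _ = split_geometry_color(target)
    _, color_code = split_geometry_color(exemplar)
    return combine(geometry, color_code)

# Toy data: a random target image and a uniformly reddish exemplar.
rng = np.random.default_rng(0)
target = rng.uniform(0.3, 0.7, size=(4, 4, 3))
exemplar = np.zeros((4, 4, 3))
exemplar[..., 0] = 0.6

recolored = recolor(target, exemplar)
```

In this toy setting the recolored image keeps the target's intensity map exactly, because the chroma vector sums to zero across channels, while its global color code matches the exemplar's. In the paper, the analogous swap is performed on the learned color latent representation rather than on a hand-crafted chroma vector.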