“Reference-Based Image Composition With Sketch via Structure-Aware Diffusion Model”, Kangyeol Kim, Sunghyun Park, Junsoo Lee, Jaegul Choo (2023-03-31):

Recent remarkable improvements in large-scale text-to-image generative models have shown promising results in generating high-fidelity images.

To further enhance editability and enable fine-grained generation, we introduce a multi-input-conditioned image composition model that incorporates a sketch as a novel modality, alongside a reference image. Thanks to the edge-level controllability afforded by sketches, our method enables a user to edit or complete an image sub-part with a desired structure (i.e., sketch) and content (i.e., reference image). Our framework fine-tunes a pre-trained diffusion model to complete missing regions using the reference image while maintaining sketch guidance. Though simple, this approach opens up broad opportunities for users to obtain the images they need.
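To illustrate the conditioning scheme described above, here is a minimal, hypothetical sketch (not the authors' code) of how a diffusion denoiser can take multiple inputs at once: the noisy image, the binary mask of the region to complete, the known (masked) pixels from the reference, and the sketch edge map, all concatenated along the channel dimension. All module names and shapes are illustrative assumptions.

```python
# Hypothetical sketch of multi-input conditioning for diffusion inpainting:
# the denoiser input stacks noisy image, mask, masked image, and sketch
# channel-wise. Shapes and architecture are illustrative, not the paper's.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Stand-in for the fine-tuned diffusion U-Net (illustrative only)."""
    def __init__(self, in_ch=3 + 1 + 3 + 1, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(32, out_ch, 3, padding=1),
        )

    def forward(self, noisy, mask, masked_img, sketch):
        # Concatenate all conditions along channels, inpainting-style.
        x = torch.cat([noisy, mask, masked_img, sketch], dim=1)
        return self.net(x)  # predicted noise

b, h, w = 2, 64, 64
noisy = torch.randn(b, 3, h, w)                  # noisy image x_t
mask = torch.zeros(b, 1, h, w)
mask[:, :, 16:48, 16:48] = 1.0                   # region the user wants filled
masked_img = torch.randn(b, 3, h, w) * (1 - mask)  # known pixels only
sketch = torch.rand(b, 1, h, w)                  # edge map guiding structure

eps_pred = TinyDenoiser()(noisy, mask, masked_img, sketch)
print(eps_pred.shape)  # torch.Size([2, 3, 64, 64])
```

During fine-tuning, such a denoiser would be trained with the usual noise-prediction loss while the mask, reference content, and sketch remain fixed conditions, which is what lets the completed region follow both the sketch's structure and the reference's appearance.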

Through extensive experiments, we demonstrate that our proposed method offers unique use cases for image manipulation, enabling user-driven modifications of arbitrary scenes.