“Towards Generated Image Provenance Analysis Via Conceptual-Similar-Guided-SLIP Retrieval”, 2024-04-16:
With the prevalence of state-of-the-art generative models, photorealistic synthetic images can now be generated with ease. However, generated images may replicate content from the original training images, which can lead to potential legal issues.
In this paper, we propose a novel method called Conceptual-Similar-guided Self-supervised Language-Image Pre-training (CS-SLIP) that leverages both image and text modalities for generated image provenance analysis. In addition to the self-supervised learning branch and the contrastive learning branch, a conceptual-similar branch is designed to guide the model toward learning a better feature representation of image-text pairs. We also adopt a re-ranking method that refines the initial matching candidates via cross-modal bi-directional retrieval.
Extensive qualitative and quantitative experiments demonstrate that such replication indeed occurs in generated images, and that our proposed method can effectively retrieve the most similar images from the training corpus, achieving the goal of generated image provenance analysis.
[Keywords: generated image provenance, image retrieval, cross-modal]
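The cross-modal bi-directional re-ranking idea in the abstract can be illustrated with a minimal sketch. This is not the paper's CS-SLIP implementation; it is a hypothetical toy in which `rerank` first selects candidates by image-to-image similarity, then re-scores them by fusing image-to-text and text-to-image similarities (all embeddings, the fusion weight `alpha`, and the function names are assumptions for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rerank(query_img, query_txt, corpus_img, corpus_txt, k=3, alpha=0.5):
    """Toy bi-directional retrieval re-ranking (illustrative only).

    Step 1: pick the top-k corpus entries by image-image similarity.
    Step 2: re-order those candidates by a fused cross-modal score
            combining image->text and text->image similarities.
    """
    initial = sorted(range(len(corpus_img)),
                     key=lambda i: cosine(query_img, corpus_img[i]),
                     reverse=True)[:k]

    def fused(i):
        # Weighted sum of the two cross-modal directions.
        return (alpha * cosine(query_img, corpus_txt[i])
                + (1 - alpha) * cosine(query_txt, corpus_img[i]))

    return sorted(initial, key=fused, reverse=True)
```

For example, with a query whose image and caption embeddings both point along `[1, 0]`, a candidate whose caption also aligns with the query can overtake one that matched only in the image space, which is the refinement the re-ranking step is meant to provide.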