“Interpreting the Weight Space of Customized Diffusion Models”, Amil Dravid, Yossi Gandelsman, Kuan-Chieh Wang, Rameen Abdal, Gordon Wetzstein, Alexei A. Efros, Kfir Aberman (2024-06-13):

[homepage] We investigate the space of weights spanned by a large collection of customized diffusion models. We populate this space by creating a dataset of over 60,000 models, each of which is a base model fine-tuned to insert a different person’s visual identity [using LoRA]. We model the underlying manifold of these weights as a subspace [using PCA], which we term weights2weights.

We demonstrate three immediate applications of this space—sampling, editing, and inversion. First, as each point in the space corresponds to an identity, sampling a set of weights from it results in a model encoding a novel identity. Next, we find linear directions in this space corresponding to semantic edits of the identity (e.g. adding a beard). These edits persist in appearance across generated samples.

Finally, we show that inverting a single image into this space reconstructs a realistic identity, even if the input image is out of distribution (e.g. a painting).

Our results indicate that the weight space of fine-tuned diffusion models behaves as an interpretable latent space of identities.
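[The three operations reduce to standard linear algebra on flattened fine-tuned weights. A minimal sketch of the idea with NumPy—all names, sizes, and data here are hypothetical placeholders, not the paper's actual models or code; the real work operates on ~60,000 LoRA weight deltas and finds edit directions from labeled attributes, whereas this uses random stand-ins purely to illustrate the mechanics:]

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: each row is one fine-tuned model's flattened
# LoRA weight delta. (The paper uses ~60,000 identity-customized models;
# the sizes here are arbitrary toy values.)
n_models, n_params = 500, 2048
W = rng.normal(size=(n_models, n_params))

# Model the weight manifold as a linear subspace via PCA (SVD of the
# mean-centered weight matrix).
mean = W.mean(axis=0)
U, S, Vt = np.linalg.svd(W - mean, full_matrices=False)
k = 100                 # subspace dimensionality (a free choice)
basis = Vt[:k]          # top-k principal directions in weight space

# (1) Sampling: draw PCA coefficients scaled by the per-component
# standard deviation to get weights for a novel "identity" model.
coeffs = rng.normal(size=k) * (S[:k] / np.sqrt(n_models - 1))
w_new = mean + coeffs @ basis

# (2) Editing: step along a linear direction in weight space. In the
# paper such directions come from labeled attributes (e.g. "beard");
# here it is just a random unit vector as a placeholder.
edit_dir = rng.normal(size=n_params)
edit_dir /= np.linalg.norm(edit_dir)
w_edited = w_new + 2.0 * edit_dir

# (3) Inversion: project arbitrary (possibly out-of-distribution)
# weights onto the subspace to recover the nearest in-manifold identity.
w_query = rng.normal(size=n_params)
w_inverted = mean + ((w_query - mean) @ basis.T) @ basis
```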

[One of the most absurd ways yet to try to get a latent space z out of a diffusion model which is half as good as a GAN’s latent space… This one seems especially bad because it requires a ton of complex training upfront, and is still limited to the explicitly labeled attributes, rather than learning disentangled variables automatically?]