Today I released the code used to train Anim·E #Anim_E, an anime-enhanced #dallemini. It's easy to train, and it dramatically improves dalle-mini's ability to generate anime images. With a few lines of code changed, I also applied it to stable-diffusion. Here is how it works: 🧵

Sep 12, 2022 · 4:09 PM UTC

Dalle-mini (and Dall·E 1) works like this: text tokens -> [transformer] -> image tokens -> [VQGAN decoder] -> image. But @Craiyon has been updating only the transformer model. The VQGAN was trained on a relatively small dataset and has not been updated.
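A minimal sketch of that two-stage pipeline (the function names, shapes, and codebook size below are illustrative stand-ins, not the actual dalle-mini API):

```python
import jax
import jax.numpy as jnp

# Hypothetical stand-ins for the two stages; everything here is illustrative.
def transformer_sample(text_tokens, key):
    # the seq2seq transformer autoregressively samples a 16x16 grid of
    # VQGAN codebook indices, conditioned on the text tokens
    return jax.random.randint(key, (256,), 0, 16384)

def vqgan_decode(image_tokens):
    # the VQGAN decoder looks up the codebook embeddings for those indices
    # and decodes them into a 256x256 RGB image
    return jnp.zeros((256, 256, 3))

key = jax.random.PRNGKey(0)
text_tokens = jnp.array([0, 42, 7, 1])               # placeholder token ids
image_tokens = transformer_sample(text_tokens, key)  # text tokens -> image tokens
image = vqgan_decode(image_tokens)                   # image tokens -> pixels
```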
So, by updating the VQGAN decoder, we can unleash the true capability of the transformer model. The code is written in JAX, using the model implementations from @borisdayma @psuraj28 @pcuenq. Thanks!
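One way to set up a decoder-only update with optax is sketched below. The top-level parameter names and shapes are assumptions about the VQGAN param tree, not the released code; the point is that the encoder and codebook stay frozen so the image-token vocabulary the transformer was trained against is unchanged.

```python
import jax.numpy as jnp
import optax

# Placeholder parameter tree standing in for the real VQGAN params;
# the key names and shapes are assumptions for illustration only.
params = {
    "encoder": {"w": jnp.zeros((3, 3))},
    "quantize": {"codebook": jnp.zeros((16384, 256))},
    "post_quant_conv": {"w": jnp.zeros((256, 256))},
    "decoder": {"w": jnp.zeros((3, 3))},
}

# Only the decoder side gets optimizer updates; everything else is zeroed out.
def label_fn(p):
    return {k: ("train" if k in ("decoder", "post_quant_conv") else "freeze")
            for k in p}

tx = optax.multi_transform(
    {"train": optax.adamw(1e-4), "freeze": optax.set_to_zero()},
    label_fn,
)
opt_state = tx.init(params)
```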
For #stablediffusion, the VAE decoder is a lot better at reconstructing arbitrary images, thanks to its much smaller compression ratio vs. the VQGAN. But it can still benefit from additional fine-tuning, especially when fine-tuning with textual-inversion or DreamBooth.
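For reference, a minimal decoder fine-tuning step could look like this sketch. `decode_fn` is a placeholder for the real Flax VAE decoder applied to latents from the frozen encoder; the L1 loss, learning rate, and shapes are illustrative assumptions, not the released training code.

```python
import jax
import jax.numpy as jnp
import optax

def decode_fn(decoder_params, latents):
    # placeholder decoder: a single linear map back to "pixel" space,
    # standing in for the real VAE decoder apply function
    return latents @ decoder_params["w"]

def loss_fn(decoder_params, latents, targets):
    recon = decode_fn(decoder_params, latents)
    return jnp.mean(jnp.abs(recon - targets))  # simple L1 reconstruction loss

decoder_params = {"w": jnp.zeros((4, 3))}  # placeholder decoder params
tx = optax.adamw(1e-4)
opt_state = tx.init(decoder_params)

@jax.jit
def train_step(decoder_params, opt_state, latents, targets):
    # latents come from the *frozen* encoder; only the decoder is updated
    loss, grads = jax.value_and_grad(loss_fn)(decoder_params, latents, targets)
    updates, opt_state = tx.update(grads, opt_state, decoder_params)
    decoder_params = optax.apply_updates(decoder_params, updates)
    return decoder_params, opt_state, loss
```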
Tip: when training the discriminator, try discretizing the fake images to int and back to float, so the discriminator doesn't learn to exploit the leaked information.
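A small sketch of that trick, assuming images are scaled to [-1, 1] and 8-bit quantization (both are assumptions, not the released code):

```python
import jax.numpy as jnp

def quantize_to_uint8_and_back(x):
    """Round images in [-1, 1] to 8-bit levels and map back to float.

    Real training images have already been through 8-bit quantization, while
    raw generator outputs have not; rounding the fakes removes that tell-tale
    difference so the discriminator can't exploit it. Apply this only in the
    discriminator update, where no gradient needs to flow back to the generator.
    """
    x_uint8 = jnp.clip(jnp.round((x + 1.0) * 127.5), 0, 255).astype(jnp.uint8)
    return x_uint8.astype(jnp.float32) / 127.5 - 1.0
```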