all 119 comments

[–]Nitrosocke[S] 94 points95 points  (14 children)

New model trained on images from classic animation studios. Go grab it on huggingface here:
https://huggingface.co/nitrosocke/classic-anim-diffusion

Hope you enjoy and feel free to share your results!

[–]totallydiffused 8 points9 points  (1 child)

Your models are amazing. Do you mind a couple of questions? What are your settings for --lr_scheduler and --learning_rate?

Finally, HUGE thanks for sharing this with everyone, your high quality models are an inspiration.

edit: saw you already answered my first question

[–]Nitrosocke[S] 11 points12 points  (0 children)

Thank you for your kind words!
And as you already saw in my other post, an LR of 1e-6 and a constant scheduler are my go-to settings now :)

[–]g5reddit 8 points9 points  (1 child)

u/Nitrosocke Thank you very much for all of your work, time, and these amazing models. If it's not too much to ask, can you share your workflow: which repo you use to train Dreambooth, instance prompts, class prompts, and settings like step count and batch size, and whether any of it changes between training a single person and a style? It may sound like asking for Dreambooth training from scratch, but a proper way of doing it like yours would help us a lot, if you don't mind. And if you already shared a workflow, can you link me? Edit: I have seen you shared your guide here: https://github.com/nitrosocke/dreambooth-training-guide So again, thanks for your generosity and for everything.

[–]Nitrosocke[S] 4 points5 points  (0 children)

You're very welcome! I hope it covers all your questions!

[–]NateBerukAnjing 6 points7 points  (0 children)

amazing!! thanks

[–]thatguitarist[🍰] 5 points6 points  (2 children)

[–]harrro 0 points1 point  (1 child)

Can you do one with the Blue Steel look?

[–]HuffleMcSnufflePuff 1 point2 points  (0 children)

not yet... you gotta train the beast before you let it out of its cage.

[–]h_saxon 2 points3 points  (1 child)

Dude, you're awesome, and I want to be your friend.

[–]Nitrosocke[S] 1 point2 points  (0 children)

Hahaha, thanks :)
Feel free to hang out in the SD discord server

[–]Pashahlis 2 points3 points  (2 children)

I am close to finishing my Korra model but have no idea how to upload it to Hugging Face. Any chance you could explain how?

[–]Nitrosocke[S] 1 point2 points  (1 child)

If it is a single ckpt file I can help you. Uploading a folder structure is something I couldn't get to work either.
You'd want to create a new model on Hugging Face, pick the name and license, and after that you can upload files in the "Files and versions" tab.
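
For anyone who prefers the terminal, a hedged sketch of the same flow with the huggingface_hub CLI (assumes a recent huggingface_hub; the repo and file names below are placeholders, not the actual model):

```bash
# Hypothetical example: create a model repo and push a single .ckpt from the
# command line. Log in first with a Hugging Face access token.
huggingface-cli login
huggingface-cli repo create korra-diffusion --type model
huggingface-cli upload your-username/korra-diffusion ./korra-v1.ckpt korra-v1.ckpt
```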

[–]Pashahlis 1 point2 points  (0 children)

Yes, it's a single ckpt file.

[–]WhiteZero 30 points31 points  (1 child)

Everyone won't be making a goofy confused face, right? 🤣

[–]Nitrosocke[S] 16 points17 points  (0 children)

Hope not! But there might be a bias as well, since these styles usually amplify these emotions by a lot. Negative prompts are your friend there.

[–]no_witty_username 18 points19 points  (22 children)

Hey bud, great models. How many images do you use in your datasets? Do you just throw in any images, or focus on character images over scenery, objects, etc.? If you could do a write-up on your process, that'd be great. I have been making my own models for the last 5 days and learning a lot about what works well and what doesn't, but a good write-up from someone would help save a lot of time in making good models I can actually feel proud of releasing to the community.

[–]Nitrosocke[S] 26 points27 points  (21 children)

Hi there and thanks!
I made a short guide on these topics here:
https://github.com/nitrosocke/dreambooth-training-guide
Should cover all these questions, but feel free to ask if anything is missing in there.

[–]no_witty_username 6 points7 points  (10 children)

Ok, I read it, it's a pretty short write-up. I want some clarification please. It says here "but I've never used more than 2000 reg images". So for example your Disney dataset: how many total images did you use to train that model? Is there a sweet spot, in total image count, that you prefer to stay in? How many steps per image? Are most of the images characters, or is it a mix of that and scenery and random objects?

[–]Nitrosocke[S] 7 points8 points  (9 children)

Reg images are not the images used for training; they are used for the prior preservation loss and are mostly called "class images" now.
For the images to train on ("instance images" in my repo) I like to use ~100, and they are roughly 70% people, 20% landscapes and 10% animals/objects.
I use around 100 training steps per image.
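
For orientation, a minimal sketch of what such a run could look like with Shivam Shrirao's diffusers-based Dreambooth script. The paths, prompts, and base model below are placeholders rather than Nitrosocke's exact values; the class-image count and prompts follow numbers given elsewhere in this thread, and the step count follows the ~100 images × ~100 steps rule above:

```bash
# Hedged sketch, not the author's actual command. Assumes Shivam Shrirao's
# fork of the diffusers Dreambooth example and ~100 instance images.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="./data/instance-images" \
  --class_data_dir="./data/class-images" \
  --with_prior_preservation \
  --prior_loss_weight=1.0 \
  --instance_prompt="classic disney style" \
  --class_prompt="illustration style" \
  --num_class_images=1000 \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=10000 \
  --output_dir="./classic-anim-model"
```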

[–]cianuro 3 points4 points  (2 children)

Do you find that the more instance images you use, the better the results, or is it marginal returns after about 100?

Thank you for all the effort by the way. My kid is having great fun with the modern Disney one.

[–]Nitrosocke[S] 1 point2 points  (1 child)

That makes me really happy to hear! Glad your kid enjoys the renders from it :)

Using more images for training makes the model more flexible. So the more people, scenes, objects or animals you add, the more reliably the model can give you images of those subjects. It also means it needs more training steps, and biases could be trained in with repeated images.

[–]neltherion 0 points1 point  (0 children)

Thanks for the info.

How many steps would you say is proper for more than a hundred images?

[–]no_witty_username 1 point2 points  (0 children)

Great, that's what I needed to know. I am using a different repo, as I am testing out its capability to train multiple subjects at once and call up separate subjects with unique tags. It doesn't have many of the settings you speak of, but the step count and the total image dataset count help a lot to know.

[–]alumiqu 1 point2 points  (1 child)

If you didn't use prior preservation loss (and so didn't have regularization images), would the trained model work as well on "classic disney style" prompts?

(My understanding is that this would reduce performance for other prompts, but I'm not sure how it affects "classic disney style" prompts.)

Thanks!

[–]Nitrosocke[S] 0 points1 point  (0 children)

It should work just as well on that prompt, but it will bleed through into the other styles of the model. For example, when prompting "elsa from disney" it might render her in this style.

[–]dreamer_2142 1 point2 points  (2 children)

prior preservation loss

How many reg images did you use here for the 100 samples?

[–]Nitrosocke[S] 1 point2 points  (1 child)

1000 reg images

[–]dreamer_2142 0 points1 point  (0 children)

Thanks!

[–]no_witty_username 1 point2 points  (0 children)

Thanks, I'll check it out.

[–]bokluhelikopter 1 point2 points  (2 children)

Hi, I admire your models and want to train my own. May I ask some more in-depth questions? What lr_scheduler do you use? What learning rate? What lr_warmup steps?

[–]Nitrosocke[S] 1 point2 points  (1 child)

Hi there!
I've come to use only "constant" as the scheduler, as a user here posted a comparison and for these Dreambooth methods it makes no difference in training.
LR 1e-6 and 0 warmup steps are my go-to now. I used 10% of total training steps as warmup before but couldn't see any difference.

[–]bokluhelikopter 1 point2 points  (0 children)

Thanks

[–]jonesaid 1 point2 points  (2 children)

This is a great guide! Thank you! What would you change in your setup if you were training a specific subject or object instead of a style? Or is Dreambooth not good for that?

[–]Nitrosocke[S] 1 point2 points  (1 child)

It is very good for that, though I'm no expert on it since I specialized in style training.
You will need another set of class/reg images and your sample images. For a person, 10 images are already enough. You can use the class prompt "photo person" and the instance prompt "[name] person", where the name is yours or that of the person you're training.
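
As a rough sketch, those person-training settings might translate into the same script like this. The name "jane", the paths, and the step count are hypothetical (applying the ~100-steps-per-image rule of thumb from above to 10 images gives 1000 steps):

```bash
# Hedged person-training variant of the style command shown earlier in the
# thread; "jane" and all paths are placeholders. Class/reg images still apply.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="./data/jane-photos" \
  --class_data_dir="./data/person-class-images" \
  --with_prior_preservation \
  --instance_prompt="jane person" \
  --class_prompt="photo person" \
  --resolution=512 \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --max_train_steps=1000 \
  --output_dir="./jane-dreambooth"
```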

[–]jonesaid 0 points1 point  (0 children)

How did you install Shivam's repo to work locally? Is there a tutorial?

[–]badadadok 1 point2 points  (0 children)

Thanks for the guide bro, been wanting to do some training.

[–]topdeck55 15 points16 points  (2 children)

Is that "Hellen Mirren as the Queen"?

[–]Nitrosocke[S] 5 points6 points  (1 child)

Good eye! Glad she's so recognizable here!

[–]wyldphyre 2 points3 points  (0 children)

She was also pretty awesome back in the day as Morgana le Fay in "Excalibur".

[–]Striking-Long-2960 8 points9 points  (1 child)

You are becoming a legend. Many thanks.

[–]Nitrosocke[S] 7 points8 points  (0 children)

Thank you! Hope this lives up to it then :)

[–]ciavolella 7 points8 points  (1 child)

This dude never slows down! Keep up the excellent (and I really mean excellent) work. Thanks again for another awesome dataset.

[–]Nitrosocke[S] 3 points4 points  (0 children)

Thank you so much for your appreciation! Comments like yours keep me going :)

[–]LifeLiterate 6 points7 points  (1 child)

Wow, these are SPOT ON.

[–]Applejinx 7 points8 points  (0 children)

I would imagine anything where you could put in EVERY FRAME of an animated film would give you a hell of a lot of reference for all the objects at all the angles and situations you could ask for. Just sayin'…

[–]LordGorzul 3 points4 points  (1 child)

You're doing god's work.

Tbh I didn't try them yet, but if it's anything like your example images then yeah.

[–]Nitrosocke[S] 2 points3 points  (0 children)

There was a little cherry-picking involved and it's not as solid as my previous model. But I'll keep improving it until it's just as good!

[–]NateBerukAnjing 4 points5 points  (1 child)

What's the prompt for the car and the house?

[–]Nitrosocke[S] 7 points8 points  (0 children)

classic disney style fiat abarth

Negative prompt: person
and
classic disney style windmill house on a green hill

[–]lumenwrites 3 points4 points  (3 children)

Wow, this is amazing.

I'm new to SD, I have a question - how hard would it be to retrain/finetune these models on modern animation like Gravity Falls, Star Trek: Lower Decks, Final Space? How long would it take?

[–]Nitrosocke[S] 3 points4 points  (2 children)

Thank you!
The hardest part would be the data set collection for all of these. The training times depend on your setup/method.
Great shows by the way and Gravity Falls is on my list as well :)

If you want, you can check out my short guide on my training process here:
https://github.com/nitrosocke/dreambooth-training-guide

[–]lumenwrites 3 points4 points  (1 child)

data set collection for all of these

All of them are available on torrents :) Or do you need more than just the frames from the shows?

If you want, you can check out my short guide on my training process here:

Thank you, I'll check it out!

[–]Nitrosocke[S] 4 points5 points  (0 children)

I can recommend https://fancaps.net/ for getting the images more easily, but making them yourself is an option as well, ofc.

[–]Due_Recognition_3890 4 points5 points  (4 children)

Are models like this possible to make on Astria? Given it's so easy, it's tempting to use it as my only Dreambooth tool.

[–]Nitrosocke[S] 3 points4 points  (3 children)

I never tried their service, but it looked like it's more focused on training people with a few images?
This model used ~100 images for the style training; I don't know if Astria can work with that.

[–]Due_Recognition_3890 2 points3 points  (2 children)

Hehe, I think I trained a model on it the other day with about 118 images, on Streets of Rage 4 sprites. It wasn't as good as I wanted it to be, though good enough to give me some amusement. Originally I wanted to use all 3000+, but it wouldn't let me upload that many.

[–]Nitrosocke[S] 1 point2 points  (1 child)

Good to know, and it sounds like a nice model to train!
Do they make details of their training process public?

[–]Due_Recognition_3890 2 points3 points  (0 children)

They let you change the steps, but the placeholder text says 500-6000; I don't know if that's their limit or just an example. There's also something that says "Branch", but no idea what that is. They say the training takes about an hour and a half; I've never measured times myself. They offer the trained model as a download, but to use it in AUTOMATIC1111 you have to disable some safety check, something to do with pickles.

Strmr.com is the URL
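
The "safety check, something to do with pickles" is most likely AUTOMATIC1111's scan for maliciously pickled checkpoints; a hedged sketch of turning it off, only advisable for files you trust:

```bash
# Launch the webui with pickle scanning disabled so an externally trained
# .ckpt will load. Do this only for checkpoints you trust.
python launch.py --disable-safe-unpickle
```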

[–]Barnowl1985 2 points3 points  (0 children)

Ok, this is awesome

[–]Geodude333 2 points3 points  (3 children)

Fantastic stuff, especially the top-right and 3rd photos. Reminds me of the softly painted backgrounds that older Disney and Warner Brothers films would have for wide shots, in contrast to the characters.

[–]Nitrosocke[S] 2 points3 points  (2 children)

Thank you! It's so much fun playing around with it and seeing things in the style that never existed, like modern cars or futuristic machines.

[–]Geodude333 2 points3 points  (1 child)

True. One of the things I like most about AI is how people are taking artists who are long dead and reviving their styles with this new tech.

Styles like Rococo, Cubism, Surrealism and Baroque that have otherwise shrunk away or been somewhat run over by Modern and Post-Modern art cannot be revived by a single artist, but they can now be revitalized and renewed with software.

We can also see things that would never have been seen before, like one-of-a-kind collector cars in a variety of colors that would otherwise never be approved by committees of snobby executives who take only cautious steps in favor of retaining the brand.

[–]Nitrosocke[S] 1 point2 points  (0 children)

Yeah, I'm sure we're all just scratching the surface of what's possible with this amazing technology. Looking forward to this AI art renaissance!

[–]NeoPossum 2 points3 points  (2 children)

I want an AWD Fiat 500 now

[–]Andrew_hl2 4 points5 points  (0 children)

That would be the Fiat 500X lol.

[–]cringy_flinchy 0 points1 point  (0 children)

the electric one is probably AWD

[–]dreamer_2142 2 points3 points  (3 children)

Hi, thanks again. I have some questions I would appreciate your help with:
1 - Can you give me some tips on how to train a style? I tried yesterday and failed miserably. What should I set the class_prompt to? Should I set it to "artwork style" or just "style"?
2 - Are you sure 512x512 is best? Yesterday I tried the original 512x768 source images instead of cropping them, and it gave me a better result, so I'm not sure if 512x512 is a myth or not. Maybe if we want to train a person, 512x768 would be a better choice? Can you run an experiment to check this point?
3 - You say in your guide that regularization images are a way to prevent bleeding, but yesterday, instead of "person" as a class, I used "beautiful person" and my result was way better, so I'm not sure what's going on. I think reg images have a big impact on the result, not just keeping the new data from bleeding.
4 - I see you shared "artwork style" reg images as part of your dataset; are they based on 1.4 or 1.5?

Thanks!

[–]Nitrosocke[S] 3 points4 points  (1 child)

Okay, here we go:
1 - The class prompt can be "artwork style", for example, and then the instance prompt should be the name of your style, e.g. "xyz style".
2 - I never tried anything larger than 512x512, so I can't say anything about that. I found that fine-tuning improves the overall quality and composition as well, so maybe that's where it comes from. But it could very well be the higher resolution.
3 - If the class prompt and instance prompt are not set up accordingly, it actually trains on both your class and instance images. That could be an explanation for that.
4 - The shared reg images are all from SD 1.4, and I would recommend making your own with 1.5.
5 - Class: "illustration style", instance: "modern disney style".

[–]dreamer_2142 1 point2 points  (0 children)

Thank you so much!

[–]dreamer_2142 2 points3 points  (0 children)

I forgot I have one more question:
5 - What were the instance_prompt and the class_prompt you used for the modern Disney style?

Sorry for the long list, but you'll save me and others a lot of time by answering these questions.

[–]whataweirdguy 1 point2 points  (2 children)

Stupid question and I’m new to SD… does this work in 1.5 local?

[–]extremesalmon 4 points5 points  (1 child)

I got it to work by putting the model file in the Stable-diffusion folder under /models.

Then you can choose the model from the dropdown at the top left.
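
In other words, for a standard AUTOMATIC1111 install, something along these lines (the .ckpt filename is a placeholder for whichever file you downloaded from the model page):

```bash
# Copy the downloaded checkpoint into the webui's model folder, then pick it
# from the checkpoint dropdown (top left) in the UI.
cp classic-anim-diffusion.ckpt stable-diffusion-webui/models/Stable-diffusion/
```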

[–]whataweirdguy 0 points1 point  (0 children)

Thank you!

[–]eat-more-bookses 1 point2 points  (1 child)

First, amazing, thank you!

Wrong place to ask, but: can Dreambooth support multiple classes? E.g., could we further train the model to produce "person [classNameX] as disneyChar in style of [classNameY]"?

[–]Nitrosocke[S] 1 point2 points  (0 children)

I think it's not yet possible to continue training an already trained model, but you can train two concepts into one from scratch.
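
For what it's worth, a hedged sketch of such a multi-concept run, assuming Shivam's fork's --concepts_list option; all prompts and paths below are placeholders:

```bash
# Write a two-concept list, then point the training script at it with
# --concepts_list instead of the single instance/class prompt flags, e.g.:
# accelerate launch train_dreambooth.py --concepts_list=concepts_list.json ...
cat > concepts_list.json <<'EOF'
[
  {
    "instance_prompt": "zwx person",
    "class_prompt": "photo person",
    "instance_data_dir": "./data/zwx-photos",
    "class_data_dir": "./data/person-class"
  },
  {
    "instance_prompt": "classic disney style",
    "class_prompt": "illustration style",
    "instance_data_dir": "./data/style-images",
    "class_data_dir": "./data/illustration-class"
  }
]
EOF
```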

[–]jingo6969 1 point2 points  (0 children)

Awesome, thanks for sharing!

[–]MindsAligned 1 point2 points  (1 child)

You, sir, are a legend. Thank you 😎👌

[–]Nitrosocke[S] 0 points1 point  (0 children)

Thank you! Hope you have fun with it 😁

[–]FS72 1 point2 points  (4 children)

This art style and the Studio Ghibli one brought back so many nostalgic memories. Thank you

[–]Nitrosocke[S] 1 point2 points  (2 children)

So glad you like it. Collecting the dataset for it was the best part; I heard the songs in my head, and that trip down memory lane was great!

[–]zkgkilla 1 point2 points  (1 child)

You should open up commissions for people to request super specific stuff! Would be great for you and give you some incentive to make even more of these amazing models :D

[–]Nitrosocke[S] 0 points1 point  (0 children)

I'm considering that, but I don't know what a good platform to host that offer would be. Patreon doesn't work for one-time payments, sadly.

[–]LockeBlocke 0 points1 point  (0 children)

Merge them together for some interesting results.

[–]Impressive_Key532 1 point2 points  (0 children)

Amazing!!!!

[–]Jolly-Theme-7570 1 point2 points  (0 children)

Another great win. Thanks 👍

[–]Striking-Long-2960 1 point2 points  (0 children)

I just returned to tell you that I have used your model, and it worked surprisingly well for The X-Files and Akira. Now I want to watch an X-Files animated show.

https://www.reddit.com/r/StableDiffusion/comments/yhx66c/obscure_classic_animation_projects/

I can say that this model is very good for backgrounds and environments, to the point that it sometimes forgets to add the characters.

I mean, this isn't classic animation, but it is freaking cool.

https://imgur.com/a/ZdW0DUf

Thanks again for your work.

[–]thatguitarist[🍰] 1 point2 points  (4 children)

Can we use the merge tool in auto1111 to put all these cool new models together, or do you lose quality when you do that?

[–]luchorz93 2 points3 points  (0 children)

came here to ask this

[–]Nitrosocke[S] 0 points1 point  (2 children)

I haven't used the merging feature myself, so I can't tell if it works well.

[–]thatguitarist[🍰] 1 point2 points  (1 child)

We need a scientist. It would be so cool if it worked alright, so we could combine styles like Ghibli and Disney.

[–]Nitrosocke[S] 0 points1 point  (0 children)

I think a programmer using the merge script would suffice :) Base SD is already very good with the Ghibli style; you can use the prompt to get that dialed in, maybe with some anime artists as well, and you could easily achieve this without any merging.

[–]DoctaRoboto 1 point2 points  (0 children)

You are so cool, can I hang out with you on Discord? Just kidding, but I feel envious; I wish I could do stuff like this. How many pics do you use for training? All your models capture the aesthetics in such an awesome way.

[–]Gfx4Lyf 1 point2 points  (0 children)

Words aren't enough to appreciate your contributions. These models are out of this world, mate. SD addiction reinforced 😁🙏❤️

[–]lobotomy42 1 point2 points  (0 children)

Oh yes

[–]sync_co 1 point2 points  (1 child)

I would love to see a 'Tron' movie model.

[–]kirmm3la 1 point2 points  (0 children)

I’m new to all this and I wondered why can’t this be all merged into the one main model?

[–]GenociderX 1 point2 points  (0 children)

This is very cool, thank you

[–]DoctaRoboto 1 point2 points  (0 children)

If someday you are able to put all your styles in the same ckpt, I would gladly pay for it. I know you won't profit from it, just saying.

[–]icefreez 1 point2 points  (1 child)

This is fantastic for taking kids' drawings and improving them slightly!

[–]Nitrosocke[S] 0 points1 point  (0 children)

That's such a nice idea. Would love to see some results!

[–]DeMischi 1 point2 points  (0 children)

Whoa

[–]lozeldatkm 1 point2 points  (1 child)

Is there a concise, idiot-proof guide for how to get Dreambooth models working? It took me long enough just to understand how to run any SD API from my PC. Are these Dreambooth models something that adds to my existing install, or is it a completely new API that runs separately? I seriously know nothing.

[–]Nitrosocke[S] 1 point2 points  (0 children)

It's an alternative model you can use with your local repository, like AUTOMATIC1111's.
There are a few guides on YT to get you started as well.

[–]upvoteshhmupvote 1 point2 points  (1 child)

Is there some kind of format or keyword to use to get this style? Or does using the checkpoint model in a local repo like auto1111 break it? I am having no luck getting anything other than really horrible messes of images. Could someone provide a prompt or some advice for this style?

[–]Nitrosocke[S] 0 points1 point  (0 children)

Hi there, I included prompt samples at the bottom of the Hugging Face page; that should get you started: https://huggingface.co/nitrosocke/classic-anim-diffusion#prompt-and-settings-for-helen-mirren

[–]Producing_It 1 point2 points  (5 children)

I know you personally use the Shivam Shrirao repo, but is there a difference in quality or results if you use Joe Penna's? Or any major differences that you think are worth mentioning?

Joe Penna's repo is well reviewed and covered online, especially when used with services like RunPod or vast.ai, while Shivam Shrirao's is not as much, to my knowledge, unfortunately. So this is why I ask.

[–]Nitrosocke[S] 1 point2 points  (4 children)

I haven't used Joe's repo in a while, so I can't compare. I just see that my results with Shivam's are great and that it is able to do what I want it to do. I assume Joe's repo is better documented and more accessible.

[–]Producing_It 2 points3 points  (3 children)

Hey, I just wanted to thank you, Nitrosocke. You've always managed to respond not only to my questions, but to the whole community on your posts, quickly and with thorough answers. Thanks man, we all really do appreciate it.

Thank you for being as open as possible with your workflow and projects: providing links to your datasets, creating tutorials, releasing your models, and I could go on.

Doing this while producing even more models is just fascinating. Thank you.

[–]Nitrosocke[S] 2 points3 points  (2 children)

This really means a lot to me! Thank you for taking the time to say this and try out my models. It makes me really happy to see them all over social media and the images created with them always bring joy to my heart! On to many more great models!

[–]Producing_It 0 points1 point  (1 child)

How are you able to respond so quickly with everything you do, man? That's just crazy! :D

[–]Nitrosocke[S] 0 points1 point  (0 children)

Well, training has a lot of passive run time, which leaves plenty of time to check Reddit :)

[–]martsuia 1 point2 points  (1 child)

God, I miss this type of animation.

[–]Nitrosocke[S] 1 point2 points  (0 children)

I hope it sees a comeback some day.

[–]JoeySalmons 0 points1 point  (0 children)

This looks great, but I've been wondering how this compares to textual inversion. All of these Dreambooth models are being made, but they seem like they might be completely unnecessary compared to a textual inversion embedding. Is there a significant benefit to having an entire model trained instead of just an embedding? Also, is one actually better than the other at styles or faces?