Animov-0.1 — High-resolution anime fine-tune of ModelScope text2video is now available in Auto1111! Trained on 384x384 anime fragments by strangeman3107, it makes 2-second-long videos with only 8.6 GB of VRAM (16 frames at 8 fps). Resource | Update (v.redd.it)
submitted 2 months ago by kabachuha
[–]Producing_It 40 points41 points42 points 2 months ago (8 children)
See, now it's already starting. Giving the community access to even mediocre text2vid software will set the foundation for the best synthetic video creation ever.
The precipice of online content has started to transform, for the foreseeable future.
AI DRAWN ANIMATION BABY!!!!
[–]kabachuha[S] 17 points18 points19 points 2 months ago (7 children)
Yeah, the prospect of having a limitless, uncontrolled animation factory at home is just too alluring to many. AI progress is exponential, but that exponent needs a seed. This seed has now been provided.
Things are now in motion that cannot be undone
[–]Plane_Savings402 6 points7 points8 points 2 months ago (6 children)
INFINITE ANIME, LET'S GO!
[–]kaptainkeel 5 points6 points7 points 2 months ago (5 children)
So, the question then is:
Which anime needs a sequel/new season the most?
[–]Plane_Savings402 10 points11 points12 points 2 months ago (0 children)
Well, we can start with remaking Berserk without the ugly 3D style.
[–]runslikewind 7 points8 points9 points 2 months ago (1 child)
I think I'll make an anime about a boy who gets hit by a truck and is transported to a fantasy world where he has OP abilities. Haven't seen one like that before.
[–]LightVelox 0 points1 point2 points 2 months ago (0 children)
To make it actually unique he needs to have a harem with a tsundere loli and some dumb girl with massive boobs, also some rapey villains
[–]GenociderX 2 points3 points4 points 2 months ago (0 children)
**proceeds to look at Hunter X Hunter**
[–]New_Priority3004 0 points1 point2 points 2 months ago (0 children)
Just give it to the No Game No Life fans. They have suffered enough.
[–]kabachuha[S] 15 points16 points17 points 2 months ago (10 children)
Made by strangeman3107 via https://github.com/ExponentialML/Text-To-Video-Finetuning. The original Diffusers weights https://huggingface.co/datasets/strangeman3107/animov-0.1
The converted weights to use in Auto1111 are here https://huggingface.co/kabachuha/animov-0.1-modelscope-original-format, and the conversion script is also available.
You can find the text2video plugin for sd-webui here https://github.com/deforum-art/sd-webui-text2video
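For anyone following along, here is a minimal sketch of fetching the converted weights and dropping them where the extension looks for them. It assumes the Hugging Face repo exposes the file under the name "text2video_pytorch_model.pth" (the filename mentioned later in the thread) and that the webui folder sits in the current directory; adjust both if your setup differs.

```python
# Minimal sketch (not from the thread): download the converted Animov-0.1
# weights and place them in the folder the sd-webui-text2video extension reads.
# Assumptions: the repo ships "text2video_pytorch_model.pth" under that name,
# and stable-diffusion-webui lives in the current working directory.
import shutil
from pathlib import Path

from huggingface_hub import hf_hub_download  # pip install huggingface_hub

downloaded = hf_hub_download(
    repo_id="kabachuha/animov-0.1-modelscope-original-format",
    filename="text2video_pytorch_model.pth",  # assumed filename
)

target_dir = Path("stable-diffusion-webui/models/ModelScope/t2v")
target_dir.mkdir(parents=True, exist_ok=True)
shutil.copy(downloaded, target_dir / "text2video_pytorch_model.pth")
print("Animov-0.1 weights installed; restart the webui to pick them up.")
```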
[–]CleomokaAIArt 5 points6 points7 points 2 months ago* (6 children)
I'm about to try it out (running a 3090).
So text2video is in (ran the Shutterstock video). Once I have everything downloaded and installed, do I just replace the original files with the new Animov ones in the t2v folder, with nothing else needed?
I will play around with it once I have it set up (I do video edits), thanks for your work! text2video actually becoming semi-usable would be such a huge step forward.
Edit: it's alive and I have 'anime'! Will see what I can do with it, I think I can push it fairly far.
[–]GenociderX 2 points3 points4 points 2 months ago (5 children)
Can I see the fruits of your work?
[–]CleomokaAIArt 3 points4 points5 points 2 months ago (4 children)
I was fooling around with a txt2video I did, with img2img batch enhancement and DaVinci resize and deflicker.
After I made the video I had to shrink it again for the Imgur GIF to show, but it's still good for a 1-second clip and a video that literally didn't exist an hour ago: Wine all day
[–]GenociderX 4 points5 points6 points 2 months ago (3 children)
That's still impressive. I can see this model in particular is definitely going to be better in a months time. Anime is just too good to pass up. Thanks for sharing
[–]CleomokaAIArt 1 point2 points3 points 2 months ago (2 children)
There seems to be a special sweet spot for fps, size, and steps. The resulting videos right now are much better; I just found the fps and it's now much more anime-ish. You can definitely get some workable videos. The biggest issue right now is that the resolution is just so small. I will try 512x512 after a few 384x384 runs (768x768 was a mess).
[–]HarmonicDiffusion 0 points1 point2 points 2 months ago (1 child)
So what is the sweet spot? No sense in mentioning it unless you want to share it.
Sharing is what allowed you to make the video in the first place, remember :)
[–]CleomokaAIArt 0 points1 point2 points 2 months ago* (0 children)
I haven't really found one where I could say "aha", but to get anything remotely close to usable, 384x384 (what it was trained on) is what I would suggest. 24 frames and 24 fps at 384x384 with around 50 steps is about where I start to see results, effectively a 1-second video. It's not that my PC can't handle more (I could go as high as 768x768 and 48 frames without CUDA errors), it's that any higher resolution turns into a hot mess.
This was my favourite and most impressive result (note quality is worse in gif form)
[–]P0ck3t 0 points1 point2 points 2 months ago (1 child)
Which of these is the file to add to A1111? I tried one and encountered issues, so I think I'm looking at the wrong link
[–]kabachuha[S] 0 points1 point2 points 2 months ago (0 children)
You can just replace the diffusion model and leave everything else the same, as it's the only thing trained here; you could even hurt the rest of the modules with the back-and-forth conversion.
Obviously, use the converted weights, since the whole point of that script is to bring Diffusers-tuned weights to Auto1111.
[–]CardAnarchist 9 points10 points11 points 2 months ago (1 child)
This is pretty mind blowing. Much more impressive to me than all the rotoscoping stuff.
Getting awfully tempting to pony up for a 4090 (and a whole new PC) so I can start messing with the fast growing video side of Stable Diffusion.
[–]kabachuha[S] 1 point2 points3 points 2 months ago (0 children)
Yeah, no more need to use dancing TikTokers as a source
[–]AsterJ 8 points9 points10 points 2 months ago (11 children)
No Shutterstock logo!
This could become amazing. There's a shit ton of anime to train on.
[–]kabachuha[S] 0 points1 point2 points 2 months ago (10 children)
The problem is labeling them all correctly. BLIP2 for autocaptioning has its own limitations and doesn't know about characters, locations, etc. that aren't absolutely famous.
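For context, here is a rough sketch of what BLIP-2 autocaptioning of a single extracted frame looks like with Hugging Face transformers. The checkpoint name and frame path are placeholders, and as the comment above notes, niche characters and locations will come back with generic captions.

```python
# Rough sketch of autocaptioning one extracted frame with BLIP-2 via
# Hugging Face transformers. Model id and file path are placeholders.
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
).to("cuda")

frame = Image.open("frame_0001.png").convert("RGB")  # placeholder frame
inputs = processor(images=frame, return_tensors="pt").to("cuda", torch.float16)
generated = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(generated[0], skip_special_tokens=True))
```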
[–]AsterJ 2 points3 points4 points 2 months ago (9 children)
I wonder if something like anidb can be used: https://anidb.net/episode/188651
It contains descriptions for each episode and says which characters appear in the episodes. Characters are also tagged with their physical and personality characteristics https://anidb.net/character/89242
[–]kabachuha[S] 2 points3 points4 points 2 months ago (8 children)
Wow! Just what I've been searching for! I'm making a ControlNet-like model on my GitHub (see the PR on the same fine-tuning repo) which should allow really long video generation, like whole episodes on a consumer PC, and the dataset was a major missing piece for its training.
[–]Yuli-Ban 0 points1 point2 points 2 months ago (3 children)
like whole episodes on a consumer PC
/u/saccharinemelody
[–]SaccharineMelody 0 points1 point2 points 2 months ago (2 children)
Mhm. Noted.
[–]kabachuha[S] 0 points1 point2 points 2 months ago (1 child)
I've seen your previous posts and yes, it's an exact replication of Microsoft's NUWA-XL model (the Flintstones demo one), but in ControlNet zero-convolutions style, so it won't change anything in the preexisting model and will require far fewer resources for training.
Here's the link https://github.com/kabachuha/InfiNet (WIP)
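As a side note, the "ControlNet zero-convolutions style" being referenced boils down to wiring a trainable side branch into a frozen base model through convolutions initialized to zero, so at the start of training the base model's output is untouched. A generic toy sketch of that trick (not the InfiNet code):

```python
# Toy illustration of ControlNet-style zero convolutions: the side branch is
# connected through a zero-initialized conv, so the frozen base model's output
# is unchanged at step 0 and only the new branch learns.
import torch
import torch.nn as nn

def zero_module(module: nn.Module) -> nn.Module:
    """Zero-initialize a layer so its initial contribution is exactly zero."""
    for p in module.parameters():
        nn.init.zeros_(p)
    return module

class ZeroConvBranch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.branch = nn.Conv3d(channels, channels, kernel_size=3, padding=1)  # trainable side branch
        self.zero_conv = zero_module(nn.Conv3d(channels, channels, kernel_size=1))

    def forward(self, base_features: torch.Tensor, conditioning: torch.Tensor) -> torch.Tensor:
        # base_features come from the frozen video UNet; at init the added term is zero.
        return base_features + self.zero_conv(self.branch(conditioning))

x = torch.randn(1, 8, 16, 48, 48)        # (batch, channels, frames, h, w) toy latent
cond = torch.randn_like(x)
block = ZeroConvBranch(channels=8)
assert torch.allclose(block(x, cond), x)  # zero-init branch leaves the base output untouched
```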
[–]SaccharineMelody 0 points1 point2 points 2 months ago (0 children)
(Giddy) Yeeheehee
[–]SaccharineMelody 0 points1 point2 points 2 months ago (3 children)
If I knew how to train the text2video model (the GitHub doesn't tell me where to even start once I have the model downloaded, so I assume it requires prior knowledge, like with LoRAs), you don't even know. I'd probably be making full episodes right now off of hypnagogic 10-second 8-fps clips.
I'm with Tingle below. The sooner you release this (in an idiot-proof GUI form) the sooner we can get started on our project. The whole future is waiting!
[–]kabachuha[S] 0 points1 point2 points 2 months ago (0 children)
FYI, you can already make 8-second-long clips with my plugin after the latest optimizations https://www.reddit.com/r/StableDiffusion/comments/12o5qmo/auto1111_text2video_major_update_animate_pictures/
[–]Yuli-Ban 0 points1 point2 points 2 months ago (1 child)
Still just need a 5 minute proof of concept right now. If we can reliably do that, then whole shows and movies are indeed feasible.
I suppose the last issue beyond that is getting it to be coherent and at a smooth framerate. If it looks like the average one-click Stable Diffusion prompt output and animates like it too, then oof. But that's what agentic AI is supposed to be for in due time.
[–]SaccharineMelody 1 point2 points3 points 2 months ago (0 children)
Dude at this point even a 30-second proof of concept would make me elated.
[–]Enough_Spirit6123 7 points8 points9 points 2 months ago (0 children)
Ayo... MAPPA just never disappoints
[–]Rectangularbox23 5 points6 points7 points 2 months ago (0 children)
and so it begins…
[–]ExponentialCookie 4 points5 points6 points 2 months ago (2 children)
Amazing!
[–]kabachuha[S] 2 points3 points4 points 2 months ago (1 child)
Thanks for your work on the Fine-tune repo too!
[–]Cubey42 2 points3 points4 points 2 months ago (0 children)
Thanks to both of you for pushing the possibilities
[–]ninjasaid13 3 points4 points5 points 2 months ago (3 children)
How much VRAM for training?
[–]kabachuha[S] 5 points6 points7 points 2 months ago (2 children)
This fine-tune, due to its relatively high resolution, used more memory, so it would only fit into >30 GB (see the details here). But if you tune it with Torch2 at 256x256 and all optimization options turned on, you can train it on as little as 16 GB of VRAM https://github.com/ExponentialML/Text-To-Video-Finetuning#hardware
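For readers curious what "all optimization options turned on" typically means, here is a minimal sketch of the usual memory-saving levers, written against diffusers/PyTorch rather than the fine-tuning repo itself; the base-model id, subfolder and learning rate are assumptions for the sketch, not the repo's settings.

```python
# Not the fine-tuning repo's actual config -- just the usual memory levers it
# refers to, expressed in diffusers/PyTorch terms as an illustration.
import bitsandbytes as bnb  # optional 8-bit optimizer
from diffusers import UNet3DConditionModel

unet = UNet3DConditionModel.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", subfolder="unet"  # assumed base checkpoint
).to("cuda")

unet.enable_gradient_checkpointing()               # trade compute for activation memory
unet.enable_xformers_memory_efficient_attention()  # cheaper attention (or Torch 2 SDPA)

optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=1e-5)  # 8-bit optimizer states

# Running the training step under fp16/bf16 autocast at 256x256 on top of these
# options is what brings the footprint down toward the ~16 GB figure above.
```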
[–]kaptainkeel 3 points4 points5 points 2 months ago (1 child)
Oof. The fact we're only at 256x256 and still using ~16GB... we're gonna need some more optimizations. Or new GPUs with actual VRAM.
[–]HeralaiasYak 0 points1 point2 points 1 month ago (0 children)
Well, if you want to pack x frames into memory, that means x times the memory requirements.
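A back-of-the-envelope illustration of that linear scaling, with assumed latent dimensions (4 latent channels, 8x spatial downscale, fp16), is below; the per-frame activations inside the UNet multiply the same way and are what actually dominate VRAM.

```python
# Illustration only: latent-video memory grows linearly with frames per sample.
# Channel count, downscale factor, and dtype size are assumptions.
def latent_megabytes(frames: int, height: int, width: int,
                     channels: int = 4, downscale: int = 8, bytes_per_el: int = 2) -> float:
    elements = frames * channels * (height // downscale) * (width // downscale)
    return elements * bytes_per_el / 1024**2

for frames in (1, 16, 24, 48):
    print(frames, "frames @ 384x384 ->", round(latent_megabytes(frames, 384, 384), 2), "MB of latents")
```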
[–]Plane_Savings402 2 points3 points4 points 2 months ago (0 children)
SADDLE UP BOYS! The future has arrived!
[–]Yuli-Ban 3 points4 points5 points 2 months ago* (0 children)
Neat proof of concept to know that this is possible. Alas, I can't even begin to figure out how to train/finetune a model— I'm still befuddled by how to even train a LoRA and at this point I'm almost too afraid to ask (hence why I'm waiting for the inevitable agentic AI to do it for me in the future). Looking less for anime and more for a specific cartoon style, but it's all beyond me at this point.
Edit: Wait, nevermind, I figured out how to do a LoRA. Song remains the same, though.
[–]HarmonicDiffusion 3 points4 points5 points 2 months ago (0 children)
And this, my friends, is why MJ will never be able to hold a candle to Stable Diffusion and its army of volunteers :)
[–]tomakorea 2 points3 points4 points 2 months ago (0 children)
If some drunk guy was abducted by aliens and they wanted to know what anime is, I'm pretty sure this is what they would produce and call "anime".
[–]WanderingPulsar 2 points3 points4 points 2 months ago (0 children)
Fuck, that's so cool. It's not there yet, but this shows the direction. One more year and people will be mass-producing OK-level anime left and right. Some manga artists would even release their own anime themselves instead of cutting a deal with a studio.
[–]SpecialistFruit1 2 points3 points4 points 2 months ago (0 children)
Unlimited Diffusion Works
[–]VocalBlur 2 points3 points4 points 2 months ago (0 children)
<image>
[–]ImpactFrames-YT 2 points3 points4 points 2 months ago (0 children)
Yes, it's not perfect, but it has timing, spacing and weight; it is just a couple of iterations away from full-blown animation production. With how difficult animation is to produce, I don't think handmade animation will ever be made again.
[–]Manson_79 1 point2 points3 points 2 months ago (0 children)
Amazing
[–]Zealousideal_Tip_915 1 point2 points3 points 2 months ago (1 child)
Physics too 😱
[–]kabachuha[S] 0 points1 point2 points 2 months ago (0 children)
True. Having the diffusion work in 3D (2+1D) instead of 2D definitely helps the AI understand causality and how things "work", unlike image-only models trained only on static references.
[–]Disastrous-Agency675 1 point2 points3 points 2 months ago (3 children)
Awesome! I’m still getting the memory error with txt2video but good for y’all I guess…
[–]kabachuha[S] 0 points1 point2 points 2 months ago (2 children)
Are you using Torch2 and have you updated the extension to the latest version? (And I don't know how much vram you have)
[–]Disastrous-Agency675 0 points1 point2 points 2 months ago (0 children)
I have 8 GB of VRAM, how do I update Torch?
[–]Disastrous-Agency675 0 points1 point2 points 2 months ago (0 children)
So I did figure out how to download Torch 2, but I kept getting a WinError 5 or something like that for permissions. Obviously I granted the user access to the folder, but I'm still getting the error.
[–]P0ck3t 0 points1 point2 points 2 months ago (3 children)
How do you add this to the current Modelscope text2video?
[–]Tadeo111 2 points3 points4 points 2 months ago (2 children)
You have to replace the "text2video_pytorch_model.pth" file in the "stable-diffusion-webui/models/ModelScope/t2v" folder.
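Written out as a small script, the same swap might look like the sketch below, which also backs up the stock ModelScope weights first; the paths follow the comment above and the Animov download location is a placeholder.

```python
# Sketch of the manual swap described above: back up the stock ModelScope
# weights, then copy the Animov .pth into their place. Adjust paths to your install.
import shutil
from pathlib import Path

t2v_dir = Path("stable-diffusion-webui/models/ModelScope/t2v")
stock = t2v_dir / "text2video_pytorch_model.pth"
animov = Path("animov-0.1/text2video_pytorch_model.pth")  # placeholder: wherever you downloaded it

if stock.exists():
    shutil.move(stock, stock.with_suffix(".pth.bak"))  # keep the original around to switch back
shutil.copy(animov, stock)
```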
[–]P0ck3t 1 point2 points3 points 2 months ago (1 child)
Can you easily switch between the models or is that just something for the future?
[–]kabachuha[S] 3 points4 points5 points 2 months ago (0 children)
Some sort of dropdown box will come with time.
[–]SwahReddit 0 points1 point2 points 2 months ago (1 child)
Awesome post u/kabachuha, didn't know we were there yet.
Trying to setup everything, and I'm getting this error when attempting to generate:
Exception occurred: memory_efficient_attention() got an unexpected keyword argument 'scale'
Posting in case someone else got this. Latest commit both for A1111 and the extension.
[–]kabachuha[S] 0 points1 point2 points 2 months ago (0 children)
Seems to be xformers-version related; there's an open issue on GitHub. I need to read the docs on when this argument appears/disappears. Or try updating xformers to the latest version yourself (or, even better, to Torch2).
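One quick way to confirm whether the installed xformers is new enough to accept the `scale` keyword (the mismatch behind the error above); this is a diagnostic sketch, not part of the extension.

```python
# Check whether the installed xformers build accepts the `scale` keyword
# that newer versions of memory_efficient_attention support.
import inspect
import xformers
import xformers.ops

sig = inspect.signature(xformers.ops.memory_efficient_attention)
print("xformers", xformers.__version__,
      "- accepts 'scale':", "scale" in sig.parameters)
# If this prints False, upgrading xformers (or moving to Torch 2's built-in
# scaled-dot-product attention, as suggested above) should resolve the mismatch.
```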
[–]buckjohnston 0 points1 point2 points 2 months ago* (4 children)
I want to train this so bad but it looks soo hard based on the instructions. Any chance we may ever get a plugin in auto1111 to train through GUI someday? Forcing myself to figure this out.
Also I am not sure how many training images or minutes of video to use
[–]kabachuha[S] 0 points1 point2 points 2 months ago (3 children)
If there is enough traction, the plugin will appear eventually (like Dreambooth's did).
The repo recommends using 16-frame-long clips for training, and then extending this value for inference.
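For illustration, here is a minimal OpenCV sketch of chopping a source video into 16-frame clips like the repo recommends; the stride and file name are placeholders, not the repo's own preprocessing.

```python
# Sketch: cut a source video into consecutive 16-frame clips with OpenCV.
import cv2

def extract_clips(video_path: str, clip_len: int = 16, stride: int = 16):
    """Yield lists of `clip_len` consecutive frames read from a video file."""
    cap = cv2.VideoCapture(video_path)
    clip = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        clip.append(frame)
        if len(clip) == clip_len:
            yield list(clip)
            clip = clip[stride:]  # non-overlapping when stride == clip_len
    cap.release()

for i, clip in enumerate(extract_clips("episode_01.mp4")):  # placeholder file name
    print(f"clip {i}: {len(clip)} frames")
```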
[–]buckjohnston 0 points1 point2 points 2 months ago (1 child)
Thanks, how many different 16-frame-long clips should I use in total?
[–]kabachuha[S] 1 point2 points3 points 2 months ago (0 children)
This fine-tune used around four hundred.
[–]SaccharineMelody 0 points1 point2 points 2 months ago (2 children)
/r/StableDiffusion won't agree, but this is the one "hypnagogic" era of AI I won't miss. I loved DALL-E Mini because of the memes (and static images are a bit different from video), but almost every video generation frustrates me (even the "funny" ones like Will Smith eating spaghetti), because I can see the pure utility just down the pipe and it bothers me that it's not good yet. I wish I could be having more fun with this, but I just need advanced HD video gen NOW (Veruca Salt pout)
but I just need advanced HD video gen NOW
[–]kabachuha[S] 0 points1 point2 points 2 months ago (1 child)
Wow, wow. I know, we are all horny for high-quality cinema/anime series made at home, but really calm down a bit. The problem with open-source AI is that
[–]SaccharineMelody 0 points1 point2 points 2 months ago* (0 children)
I know, we are all horny for high-quality cinema/anime series made at home, but really calm down a bit.
Just playing up Veruca Salt for that line mate ("I want it now!")
Also, I'm actually not that far removed from the fifth point. When I was first told about AI art, I didn't like the way it smelled at all, but Yuli-Ban convinced me that it wasn't going away and that anyone who got in early (~2023-2025) was basically going to be a kingmaker of the near future of media before the "deluge." You could stick to your guns and not use AI art and wind up getting swept away, or you could ride the crest of the wave.
I think a lot of the anger over AI art is justified but also overblown because everyone is expecting it to be a new iPhone or internet where 95% of people wind up making movies and games and Michelangelos when in reality I don't think that many more than already identify as "creators" will jump into the game (for the most part; I think everyone and their gran will play with AI art for a few minutes before consuming what others make). And I also don't think someone who uses AI for art is quite on the same level as artists. But that's all so hard to make out right now and it's only going to get worse before it gets better. It's not going to get better until AI can reliably make any media and people get used to all of it. Until then the anti-AI people will get louder, more vocal, and more widespread.
But at the same time, I think there's going to be loads of benefits and it'll just be a better quality of life for people who DO want to create things. If you can make an animated series on your desktop PC, that can cut out the entire studio system of capital building and influence you have to go through. At the very least, I would not mind a situation where AI art is not copyrightable and thus can't be commercialized (and occupies the same gray area as fanfiction and modding) because it's the quality improvements that matter.
I don't know, I just think that things are not going to be as radically different as we think they're going to be in 10-15 years (though the status quo won't be the same either) and that people are way too over-focused on the worst case and most extreme case scenarios.
[–]HeralaiasYak 0 points1 point2 points 1 month ago (0 children)
Could you share a ballpark figure of the dataset you need for finetuning? In terms of the number of 'clips'/seconds used for training?