all 169 comments

[–]joachim_s 40 points41 points  (5 children)

Questions:

  1. How long did this clip take to make?
  2. How many frames/sec?

[–]Sixhaunt[S] 46 points47 points  (3 children)

  1. I'm not entirely sure, but a longer clip I'm processing right now took 26 mins for a 16s clip. The one I posted here is only 4s, so it took a lot less time. This is just using the default Google Colab machine.
  2. I don't know what the original was. The idea was to get frames at different angles to train on DreamBooth, so when it came to reconstructing it as a video again at the end for fun, I just set it to 20fps for the final output video. It might be slightly faster or slower than the original, but for my purposes it didn't matter.

[–]joachim_s 1 point2 points  (2 children)

  1. I’m asking about both time preparing for it AND processing time.

[–]Sixhaunt[S] 4 points5 points  (1 child)

Depends. Do you count the Google Colab creation time? Because I can and do reuse it. Aside from that, it's just a matter of creating a face (I used one I made a while back) and a driving video, which someone else gave me. So in the end it's mostly just the time it takes to run the Colab whenever I use it now.

[–]LynnSpyre 0 points1 point  (0 children)

I did some fun experiments with this one. What I figured out is that it works really well if you keep your head straight. My computer got weird on longer clips, but at 90 seconds and 25-30 fps it was fine. Another issue is the size limitation, which caps you at 256 pixels wide unless you retrain the model, which is a chore. If the OP's doing it at 512, though, there's gotta be a way to do it. Either way, you can always upscale. I also found that DPM works better for rendering avatars for the Thin-Plate Spline Motion Model or First Order Model. First Order Model does the same thing, but it doesn't work as well. What it does have that Thin-Plate doesn't is a nice utility for isolating the head at the right size from your driver video source.

[–]eugene20 43 points44 points  (0 children)

Really impressive consistency.

[–]GamingHubz 10 points11 points  (3 children)

I use https://github.com/harlanhong/CVPR2022-DaGAN; it's supposedly faster than TPSMM.

[–]samcwl 1 point2 points  (1 child)

Did you manage to get this running on a colab?

[–]GamingHubz 0 points1 point  (0 children)

I did it locally

[–]MacabreGinger 8 points9 points  (1 child)

Thanks for sharing the process u/Sixhaunt .
Unfortunately, I didn't understand a single thing because I'm a noob SD user and a total schmuck.

[–]Sixhaunt[S] 5 points6 points  (0 children)

To be fair, no SD was used at all in the making of this video. I used Midjourney for the original image of the woman, but the SD community is more technical and would make more use of this, so I posted it here, especially since the original image could just as easily have been made in SD. The purpose is also to use the results in SD for a new custom character model, but technically no SD was used in this video.

With the Google Colab, though, you can just run the "setup" block, then change source.png to your own image and driving.mp4 to your own custom video, then hit run on all the rest of the blocks and it will just work and give you a video like the one above. It will also create a zip file of still frames for you to use for training.

Just be sure you replace the png and mp4 files with the same names and locations, or change the settings to point to your new files.
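The replace-with-the-same-names step can also be scripted from a Colab cell rather than the file browser. A minimal sketch; the source.png / driving.mp4 names come from the comment above, but the working-directory layout is an assumption and may differ in the actual notebook:

```python
import shutil
from pathlib import Path

def stage_inputs(face_image: str, driving_video: str, work_dir: str = ".") -> None:
    """Copy your own image/video over the notebook's defaults, keeping the
    expected file names so the remaining cells run unmodified."""
    work = Path(work_dir)
    shutil.copy(face_image, work / "source.png")      # square image of the face
    shutil.copy(driving_video, work / "driving.mp4")  # square driving clip

# e.g. stage_inputs("/content/my_face.png", "/content/my_clip.mp4")
```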

[–]samcwl 2 points3 points  (1 child)

What is considered a good "driving video"?

[–]Sixhaunt[S] 2 points3 points  (0 children)

The most important thing from what I've tested is that you don't want your head to move too far from center. There should always be space between your head and the edges of the frame.

For head tilting keep in mind it varies for the following:

  • Roll - It handles this really well.
  • Pitch - It's finicky here; try not to tilt your head up or down too much, though there is some leeway, probably around 30 degrees in each direction.
  • Yaw - A max of maybe 45 degrees of motion, but it morphs the face a little, so restricting the tilt in this direction helps keep consistency.

There are also 3 or 4 different models in Thin-Plate that are used for different framings of the person, so this applies only to the default (vox). The "ted" model, for example, is a full-body one with moving arms and such, like you might expect from someone giving a TED talk.
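The keep-space-around-the-head rule is easy to pre-check mechanically before committing to a driving video. A small sketch of the geometry only; the 15% margin is an illustrative default rather than a number from this thread, and the face boxes would come from whatever face detector you already use:

```python
def box_has_margin(box, frame_size, margin_frac=0.15):
    """Return True if a face bounding box (x, y, w, h) keeps at least
    margin_frac of the frame width/height clear on every side."""
    x, y, w, h = box
    fw, fh = frame_size
    mx, my = fw * margin_frac, fh * margin_frac
    return x >= mx and y >= my and x + w <= fw - mx and y + h <= fh - my

# A centered face in a 256x256 frame passes; one touching an edge fails.
```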

[–]cacoecacoe 5 points6 points  (4 children)

Why not use CodeFormer instead of GFPGan? I find the results consistently better, for anything photographic at least.

[–]Sixhaunt[S] 20 points21 points  (2 children)

At first I tried both, using A1111's batch processing rather than the Colab itself, but I found that GFPGan produced far better and more photo-realistic results. CodeFormer seems to change the facial structure less, but it also gives a less polished result, and for what I'm using it for I don't care so much if the face changes as long as it's consistent, which it is. That way I can get the angles and shots I need to train on. Ideally CodeFormer would be implemented as a different option, but I'm sure someone else will whip up an improved version of this within an hour or two of working on it. It didn't take me long to set this up as it is; I started on it less than a day ago.

[–]cacoecacoe 5 points6 points  (1 child)

Strange, because my experience of GFPGan and CodeFormer has been the precise inverse of what you've described. However, different strokes, I guess.

I guess the fact that GFPGan does change the face more (a common complaint is that it changes faces too much and everyone ends up looking the same) is probably an advantage for animation.

[–]Sixhaunt[S] 3 points4 points  (0 children)

I guess the fact that GFPGan does change the face more (a common complaint is that it changes faces too much and everyone ends up looking the same) is probably an advantage for animation.

It probably was, although it didn't actually change the face shape much. Unfortunately it put a lot of makeup on her, though. The original face had worse skin, but it looked more natural and I liked it. I might try a version with CodeFormer, or blend them together or something, but if you want to see the way it changed the face and what the input actually was, here you go:

https://imgur.com/a/HRIVuGE

Keep in mind they aren't all of the same video frame or anything; I just chose an image from each set where she had roughly the same expression as the original photo.

[–]TheMemo 8 points9 points  (0 children)

I find CodeFormer tends to 'invent' a face rather than fixing it.

[–]eugene20 1 point2 points  (1 child)

I'm new to Colab; I've been running everything locally anyway. I just wanted to have a look at fixed.zip and frames.zip, but I couldn't figure out how to download them.

[–]Sixhaunt[S] 0 points1 point  (0 children)

Those output files are produced after you run it on your custom image and video. They don't host the file results that I got on there, but elsewhere in this thread I've linked to hand-selected frames I intend to use and to some comparisons of images from those various zips. I logged on to find so many comments that I'm just trying to answer them all right now.

I think it shows the in-progress videos within the Colab page itself, just not the files for them. You should be able to see the driving video and input image I used on there, as well as how it looked before upsizing and fixing the faces.

[–]LynnSpyre 0 points1 point  (3 children)

Okay, I've used this model before. The only issue with it is my graphics card: it gets weird on clips longer than 90 seconds, and either crashes or freezes.

[–]Sixhaunt[S] 2 points3 points  (2 children)

I ran it on Google Colab so I didn't have to run or install any of it locally. I'm working on a new version of the Colab right now, though.

For my purposes I just need images of the face from different angles and with various expressions, so I'll be using a few 2-3 second clips and won't have the long-video issues. Although you could always crop a video and process it in segments.

[–]LynnSpyre 0 points1 point  (1 child)

Question: do you remember which pre-trained model you were using?

[–]Sixhaunt[S] 1 point2 points  (0 children)

I use the vox one

[–]pierrenay 152 points153 points  (15 children)

getting closer to the holy grail dude

[–]Sixhaunt[S] 36 points37 points  (10 children)

I ran it with two videos and have extracted 9 frames so far that I really like and that are varied from each other. I have 2 more videos to do this with, then I'll hopefully have enough for DreamBooth and can create a model for a custom person. Any suggestions on what to name her? I'll have to give her some sort of keyword name, after all.

[–]mreo 13 points14 points  (0 children)

Ema Nymton: 'Not My Name' backwards, from the 90s detective game 'Under a Killing Moon'.

[–]Fake_William_Shatner 12 points13 points  (3 children)

Name her Val Vette.

[–]malcolmrey 2 points3 points  (1 child)

i like that

[–]Fake_William_Shatner 1 point2 points  (0 children)

I was thinking of scarlet. Velvet cake. Valves. And I figure that this name could be mistaken and twisted a few different ways.

Plus, I think she's got a bit of a country accent the way the corners of her mouth press. It sounds like butter rollin' off a new stack of pancakes.

[–]velvetwool 0 points1 point  (0 children)

Mmmm nice name

[–]mreo -2 points-1 points  (0 children)

accidental duplicate comment...

[–]pepe256 0 points1 point  (0 children)

Gene-vieve

[–]cyan2k 35 points36 points  (0 children)

Man, I can't even imagine what the SD/AI art landscape will look like in 1 year, 3 years, 5 years. Amazing.

Probably banned by every country or something, haha.

[–]o-o- 0 points1 point  (1 child)

Yep, what we've all been dreaming of since 1987.

[–]LordTuranian 0 points1 point  (0 children)

Good movie.

[–]Orc_ 0 points1 point  (0 children)

It's all coming together.

[–]sheagryphon83 49 points50 points  (8 children)

Absolutely amazing; it's so smooth and lifelike. I've watched the vid several times now trying to find fault in the skin muscles and crow's feet, and I can't find any. Her crow's feet appear and disappear as they should as she talks, pulling and pushing her skin around… Simply amazing.

[–]Sixhaunt[S] 24 points25 points  (7 children)

That comes down to having a good driving video, I think. With other ones you need to be far more picky with frames. The biggest favor someone could do for the community would be to record themselves making the faces and head movements that work well with this, so that it's easy to generate models with it. It would take some experimenting to get a good driving video, though.

[–]Etonet 5 points6 points  (2 children)

What is a driving video?

[–]Sixhaunt[S] 8 points9 points  (1 child)

The video that has the expressions and motions that the picture is then animated from. Originally it was a TikToker making the facial expressions (a brunette woman with a completely different face than the video above). The Thin-Plate AI then mapped the motion from the video onto the image of the person that I created with AI. The result was 256x256, though, so I had to upsize and fix the faces after.

[–]Etonet 0 points1 point  (0 children)

I see, thanks! Very cool

[–]Pretend-Marsupial258 1 point2 points  (1 child)

There are video references on the internet for animators. Here's one I found, for example. It requires a login/account, but I bet there are other websites that don't require anything.

Edit: Stock sites like Shutterstock also have videos, but I don't know if the watermark will screw stuff up.

[–]Sixhaunt[S] 0 points1 point  (0 children)

That's a really good idea! Worth registering for if those are free. I'll check it out more today.

[–]LetterRip 0 points1 point  (1 child)

Interesting facial expressions video here,

https://www.youtube.com/watch?v=X1osDan-RZQ

[–]Sixhaunt[S] 0 points1 point  (0 children)

Oh, thank you! I was planning to put together a bunch of 2-3s clips of different facial expressions, then have it run on each clip. I just need to set up the repo for it and find a bunch of clips, but that video seems like it would have a lot of gems. The driving video for the post above came from a similar thing: I was recommended some TikToker who was changing expressions and such, and there was a good closeup shot that did consistently well, so I pulled from it.

[–]Speedwolf89 38 points39 points  (3 children)

Now THIS is what I've been sticking around in this horny teen infested subreddit for.

[–]pepe256 31 points32 points  (1 child)

You don't think this was also motivated in some way by horniness? We adults are just more subtle about it

[–]Speedwolf89 1 point2 points  (0 children)

Hahh indeed.

[–]dreamer_2142 12 points13 points  (0 children)

Honestly? This is not that bad at all. Almost all the upvoted posts are great. A few memes too.

[–]Pretty-Spot-6346 16 points17 points  (1 child)

I knew some awesome guys were gonna make it easy for us. Thank you!

[–]Sixhaunt[S] 18 points19 points  (0 children)

I edited my reply to add my Google Colab for it, so you can do it right now with just a square image and a square video clip. Hopefully someone will cannibalize my code and make a better, more efficient version before I get the chance to, but this is exactly what I used for the video above.

[–]Ooze3d 12 points13 points  (3 children)

Amazing results. We're getting very close to consistent animation, and from that point on, the sky is the limit. We're just a few years away from actual AI movies.

[–]cool-beans-yeah 1 point2 points  (2 children)

How long do you think? 5 years?

[–]Ooze3d 1 point2 points  (1 child)

The way this is going, probably much sooner than I’d consider possible. Conservatively, I’d say end of 2023 for the first few examples of actual short films with a plot (as in “not simply beautiful images edited together”). Probably still glitchy and always assisted by real footage for the movements. After that, another year to get to a point where it’s virtually indistinguishable from something shot on camera, and maybe another year where we can input what we want the subject to do and the use of actual footage is no longer needed.

But as I said, given the fact that this is all a worldwide collaborative project that’s going way faster than any other technological breakthrough I’ve witnessed or known of, I wouldn’t be surprised to see all that by the end of next year.

[–]cool-beans-yeah 0 points1 point  (0 children)

That would be wild!

[–]reddit22sd 12 points13 points  (0 children)

These are the posts I come to reddit for, excellent thinking!

[–]superluminary 12 points13 points  (1 child)

This is extremely impressive

[–]Sixhaunt[S] 9 points10 points  (0 children)

thanks! I just put an update out on how the still frames look that I'll be using for training: https://www.reddit.com/r/StableDiffusion/comments/ys5xhb/training_a_model_of_a_fictional_person_any_name/

If this all turns out well, I intend to make a whole bunch of models for various fictional people, and maybe take some commissions to turn people's creations into an SD model for them to use, if they don't want to use my public code themselves.

[–]Tax21996 8 points9 points  (0 children)

damn this one is so smooth

[–]Kaennh 8 points9 points  (6 children)

Really cool!

Since I started tampering with SD I've been obsessed with its potential to generate new animation workflows. I made a quick video (you can check it out here) using FILM + SD, but I also wanted to try TPSMM in the same way you have, to improve consistency... I'm pretty sure I will now that you've shared a notebook, so thanks for that!

A few questions:

- Does the driving video need to have some specific dimensions (other than 1:1 proportion)?
- Have you considered EbSynth as an alternative to achieve a more painterly look (I'm thinking about something similar to the Arcane style, perhaps)? Would it be possible to add it to the notebook? (Not asking you to, just asking if it's possible.)

[–]Sixhaunt[S] 1 point2 points  (4 children)

- Does the driving video need to have some specific dimensions (other than 1:1 proportion)?

No. I've used driving videos that are 410x410, 512x512, and 380x380, and they all worked fine, but that's probably because they are downsized to 256x256 first.

The animation AI I used produces 256x256 videos, so I had to upsize the results and use GFPGan to unblur the faces after. So I don't think you get any advantage from an input video larger than 256x256, but it won't prevent it from working or anything.
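Since everything gets downsized to 256x256 anyway, you can pre-square and shrink a clip yourself with ffmpeg. A sketch that just builds the command; the center-crop-to-shortest-side-then-scale filter chain is my own choice for illustration, not necessarily what the model's loader does internally:

```python
import subprocess

def square_resize_cmd(src: str, dst: str, size: int = 256) -> list[str]:
    """ffmpeg command: center-crop to a square on the shorter side,
    then scale to size x size."""
    vf = f"crop='min(iw,ih)':'min(iw,ih)',scale={size}:{size}"
    return ["ffmpeg", "-y", "-i", src, "-vf", vf, dst]

# To actually run it:
# subprocess.run(square_resize_cmd("driving_hd.mp4", "driving.mp4"), check=True)
```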

Have you considered EbSynth as an alternative to achieve a more painterly look (I'm thinking about something similar to the Arcane style, perhaps)? Would it be possible to add it to the notebook?

I've had a local version of EbSynth installed for a while now and I've gotten great results with it in the past; I just wasn't able to find a way to use it through Google Colab. Ultimately I want to be able to feed in a whole ton of images and videos and have it automatically produce a bunch of new AI "actors" for me, but it's too much effort without fully automating it.

If you're doing it manually, then using EbSynth would probably be great and might even work better in terms of not straying from the original face, since you don't need to upsize it after and fix the faces (GFPGan puts too much makeup on the person).

[–]rangoonmeathelmet 0 points1 point  (3 children)

Is it possible to change the output aspect ratio to 16:9 or are you locked into 256x256?

[–]Sixhaunt[S] 1 point2 points  (2 children)

I think it's locked. The full-body one, which is called "ted", is 340x340 or something, but it doesn't work for close-up faces.

You might be able to crop a video to a square containing the face, use this method to turn it into the other person, then stitch it back into the original video.
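The crop-process-stitch idea can be sketched with ffmpeg as well. This is a hypothetical sketch, not part of the original workflow: the x, y coordinates are wherever you took the face square from, and the overlay simply pastes the re-animated square back at the same spot:

```python
def crop_cmd(src: str, dst: str, x: int, y: int, size: int) -> list[str]:
    """Cut a size x size square around the face out of the original video."""
    return ["ffmpeg", "-y", "-i", src,
            "-vf", f"crop={size}:{size}:{x}:{y}", dst]

def stitch_cmd(original: str, face_clip: str, x: int, y: int, dst: str) -> list[str]:
    """Paste the re-animated face square back over the original footage."""
    return ["ffmpeg", "-y", "-i", original, "-i", face_clip,
            "-filter_complex", f"[0:v][1:v]overlay={x}:{y}", dst]

# Run each with subprocess.run(cmd, check=True); seams at the square's edge
# would likely need feathering/color matching in practice.
```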

[–]rangoonmeathelmet 0 points1 point  (1 child)

Got it. Thank you!

[–]Sixhaunt[S] 0 points1 point  (0 children)

I should mention that the demo they use doesn't have a perfectly square input video, so I think it crops it but still accepts it.

[–]Logseman 4 points5 points  (0 children)

This is both awe-inspiring and very scary.

[–]Seventh_Deadly_Bless 6 points7 points  (9 children)

95-97% humanlike.

Face muscles change volume from one frame to the next few. My biggest grief.

Body language hints at anxiety/fear. But she also smiles. It's not too paradoxical a message, but it does bother me.

For the pluses:

Bone structure kept all the way through, pretty proportions to her features. Aligned teeth.

Stable Diffusion is good with surface rendering, which gives her realistic, healthy skin. The saturated, vibrant, painterly/impressionistic style makes the good pop out and hides the less good.

It's scarily good.

Question: What's the animation workflow?

I know of an AI animation tool (Antidote? Not sure of the name), but it's nowhere near that capable, especially paired with Stable Diffusion.

I imagine you had to animate it manually, at least in part, almost celluloid-era style.

Which would be even more of an achievement.

[–]LetterRip 1 point2 points  (7 children)

Pretty sure it is just optical flow automatic matching (thin plate spline), they aren't doing any animation.

https://arxiv.org/abs/2203.14367

https://studentsxstudents.com/the-future-of-image-animation-thin-plate-spline-motion-90e6cf807ea0?gi=643589a1b820

And this is the model used

https://cloud.tsinghua.edu.cn/f/da8d61d012014b12a9e4/?dl=1

[–]Seventh_Deadly_Bless 0 points1 point  (6 children)

Scratching my head.

This is obviously emergent tech, but I'm wondering if it's implemented through the same PyTorch stack as Stable Diffusion.

I need to check the tech behind the Antidote thing I mentioned. Maybe it's an earlier implementation of the same tech.

What you describe is a deepfake workflow. I bet it's one of the earliest ones used to make pictures of famous people sing.

I feel like there's something I'm missing, though. I'll try to take a look tomorrow: it's getting late for me right now.

[–]LetterRip 3 points4 points  (5 children)

This is obviously emergent tech, but I'm wondering if it's implemented through the same PyTorch stack as Stable Diffusion.

Yes, it uses PyTorch (hence the '.pt' extension on the model file). I think you might not understand these words?

PyTorch is a neural network framework. Diffusion is a type of generative neural network.

What you describe is a deepfake workflow.

Nope,

Deepfakes rely on a type of neural network called an autoencoder.[5][61] These consist of an encoder, which reduces an image to a lower dimensional latent space, and a decoder, which reconstructs the image from the latent representation.[62] Deepfakes utilize this architecture by having a universal encoder which encodes a person in to the latent space.[63] The latent representation contains key features about their facial features and body posture. This can then be decoded with a model trained specifically for the target.[5] This means the target's detailed information will be superimposed on the underlying facial and body features of the original video, represented in the latent space.[5]

A popular upgrade to this architecture attaches a generative adversarial network to the decoder.[63] A GAN trains a generator, in this case the decoder, and a discriminator in an adversarial relationship.[63] The generator creates new images from the latent representation of the source material, while the discriminator attempts to determine whether or not the image is generated.[63] This causes the generator to create images that mimic reality extremely well as any defects would be caught by the discriminator.[64] Both algorithms improve constantly in a zero sum game.[63] This makes deepfakes difficult to combat as they are constantly evolving; any time a defect is determined, it can be corrected.[64]

https://en.wikipedia.org/wiki/Deepfake

Optical flow is an older technology, used for match moving (having special effects sit in the proper 3D location of a video).

https://en.wikipedia.org/wiki/Optical_flow

[–]ko0x 3 points4 points  (2 children)

Nice. I tried something like this for a music video for a song of mine roughly 2 years ago, but stopped because Colab is such a horrible, unfun workflow. Looks like I can give it another go soon.

[–]Sixhaunt[S] 3 points4 points  (1 child)

They have a Spaces page on Hugging Face if you don't want to run Thin-Plate through Google Colab. I just set one up that does it all start to finish, including upsizing the result, running the facial fixing, and packaging the frames so you can hand-pick them for training data.

The main purpose is to generate sets of images like these for training: https://www.reddit.com/r/StableDiffusion/comments/ys5xhb/training_a_model_of_a_fictional_person_any_name/

[–]ko0x 0 points1 point  (0 children)

OK thanks, I'll look into that. I hope we're getting close to running this locally, as easy to use as SD.

[–]allumfunkelnd 4 points5 points  (1 child)

This is how our quantum computer AIs will communicate with us in real time in the metaverse of the future. :-D Awesome! Thanks for sharing this and your workflow! The face of this Robo-Girl is stunning.

[–]ninjasaid13 0 points1 point  (0 children)

I think we'd be more likely to use analog computers for AI in the future, because they are much faster, though at the cost of being less accurate; that doesn't matter much in AI.

[–]pbinder 3 points4 points  (6 children)

I run SD on my desktop; is it possible to do all this locally and not through google colab?

[–]Sixhaunt[S] 5 points6 points  (5 children)

Yeah, I don't see why not.

  1. Get Thin-Plate-Spline-Motion-Model set up locally and run the motion translation (Hugging Face even lets you do this part through their web UI).
  2. Use ffmpeg to cut the video into frames.
  3. Upsize and fix the faces of the frames. You can do that directly with Stable Diffusion and the Automatic1111 web UI using the batch img2img section.
  4. Use ffmpeg to combine the fixed and upsized images into a video.
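The two ffmpeg steps in that list can be sketched as small helpers that build the commands. The 20fps default and the frame-name pattern are assumptions for illustration; match them to your own clip:

```python
def split_cmd(video: str, frames_dir: str, fps: int = 20) -> list[str]:
    """Step 2: dump the animated result into numbered PNG frames."""
    return ["ffmpeg", "-y", "-i", video, "-vf", f"fps={fps}",
            f"{frames_dir}/frame_%04d.png"]

def join_cmd(frames_dir: str, out_video: str, fps: int = 20) -> list[str]:
    """Step 4: reassemble the upscaled, face-fixed frames into a video."""
    return ["ffmpeg", "-y", "-framerate", str(fps),
            "-i", f"{frames_dir}/frame_%04d.png",
            "-c:v", "libx264", "-pix_fmt", "yuv420p", out_video]

# Run each with subprocess.run(cmd, check=True), with the frames directory
# created beforehand; step 3 happens between the two, outside ffmpeg.
```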

[–]Vivarevo -1 points0 points  (1 child)

I wonder if it's possible to run low-quality video for a live feed.

[–]Sixhaunt[S] 1 point2 points  (0 children)

I think the processing takes longer than the video's runtime, so it probably wouldn't work for that, unfortunately, although upscaling to some extent on the client side isn't unheard of already.

[–]jonesaid 0 points1 point  (2 children)

Is there a tutorial out there to set up the TPSMM locally?

[–]Sixhaunt[S] 1 point2 points  (0 children)

I think their github shows all the various ways you can use it and gives a quick tutorial

[–]NerdyRodent 1 point2 points  (0 children)

Sure is! How to Animate faces from Stable Diffusion! https://youtu.be/Z7TLukqckR0

[–]Maycrofy 1 point2 points  (0 children)

I mean, it looks like how animation would move in real life. It's very captivating.

[–]kim_en 1 point2 points  (0 children)

TF, I thought this kind of animation would only come after next year. Absolutely mind-blowing.

[–]Dart_CZ 1 point2 points  (0 children)

What is she saying? I can't make out the first part, but the last part looks like "me, please". What are your guesses, guys?

[–]Unlimitles 1 point2 points  (0 children)

One day..... someone is going to use these things to lure men to their dooms.

It's going to work....

[–]ptitrainvaloin 1 point2 points  (0 children)

Great results 😁

Here's a tip I discovered that will surely help you with the purpose you stated: if you make a custom photo template for training with Textual Inversion, the more photorealistic the results of your new template are, the fewer steps and the fewer images required (less than what is regularly suggested in the field at present) to create your own model(s) and style(s), in even higher quality.

short example of a new photorealism_template.txt (in directory stable-diffusion-webui/textual_inversion_templates) you can create :

(photo highly detailed vivid) ([name]) [filewords]

(shot medium close-up high detail vivid) ([name]) filewords

(photogenic processing hyper detailed) ([name])

Etc... add some more lines to it.

The more variations you add the better, as long as you test your prompts before adding them to your template, to be sure they produce consistently good photorealism results.

Good luck, and continue to have fun experimenting!

***Edit: input image(s) must be of high quality; otherwise, garbage in -> garbage out.

[–]InMyFavor 1 point2 points  (7 children)

This is genuinely fucking nuts

[–]Sixhaunt[S] 1 point2 points  (6 children)

[–]InMyFavor 1 point2 points  (1 child)

Yooooooo

[–]Sixhaunt[S] 2 points3 points  (0 children)

I almost have a completed model for her too which I'll release soon. Then anyone can use her for their projects since this woman doesn't actually exist and isn't a copyright issue like celebrity faces. I think people making visual novels will especially like it

[–]InMyFavor 0 points1 point  (0 children)

This is firmly on the other side of the uncanny valley.

[–]InMyFavor 0 points1 point  (2 children)

This is so crazy and borderline revolutionary and virtually no one mainstream is paying attention.

[–]Sixhaunt[S] 1 point2 points  (1 child)

It's crazy to think that this was my first try and it took less than a day to implement. I can only imagine what we'll be able to do even a few months from now.

[–]InMyFavor 0 points1 point  (0 children)

I can barely keep up as it is now. In 6 months, I have no clue.

[–]Throwaway-sum 1 point2 points  (0 children)

This is nuts!! This only came out weeks ago? It feels like we are experiencing history in the making.

[–]unrealf8 2 points3 points  (1 child)

Ahh, that’s the major question I had about sd. Can I generate a character that I can consistently continue to generate art with. Love it!

[–]Sixhaunt[S] 1 point2 points  (0 children)

check out some of the frames I pulled from this method which I'll be training with: https://www.reddit.com/r/StableDiffusion/comments/ys5xhb/training_a_model_of_a_fictional_person_any_name/

[–]Magikarpeles 3 points4 points  (1 child)

Hear me out

[–]Sixhaunt[S] 2 points3 points  (0 children)

I'm listening.

[–]HulkHunter 2 points3 points  (0 children)

Synthetic Reality becoming real.

[–]martsuia 2 points3 points  (0 children)

Looking at this feels like I’m dreaming.

[–]1Neokortex1[🍰] 1 point2 points  (0 children)

🚀🔥

[–]moahmo88 1 point2 points  (0 children)

Good job!

[–]MonoFauz 1 point2 points  (0 children)

The progress with this tech is so fast. Great job!

[–]TraditionLazy7213 0 points1 point  (0 children)

Thanks for sharing, amazing stuff

[–]JCNightcore 0 points1 point  (0 children)

This is amazing

[–]nano_peen 0 points1 point  (0 children)

Incredible consistency

[–]LeBaux 0 points1 point  (0 children)

We are all thinking it.

[–]TrevorxTravesty -2 points-1 points  (1 child)

This is going to be incredible when we're able to do this with dead actors and see them shine again 😯 I'd love to see some of my favorite people, such as Robin Williams or Bruce Lee, do stuff again 😞 I would love to make loving tributes to them.

[–]ObiWanCanShowMe 8 points9 points  (0 children)

That is not what OP is doing here. OP is generating different images (frames) of a fictional person by animating a still image of a face, so they can then make an SD model for this fictional person, thus being able to consistently generate that fictional person without variations.

Think

picture of thepersonicreated with red hair in a warrior outfit

instead of

picture of a beautiful girl with red hair in a warrior outfit

The first one gets this same face; the second is random. It's a DreamBooth model of an SD-created person.

That said, what you suggested is already possible with deepfake which is only going to get better.

[–][deleted]  (1 child)

[removed]

    [–]StableDiffusion-ModTeam[M] 1 point2 points locked comment (0 children)

    Your post/comment was removed because it contains hateful content.

    [–]jonesaid 0 points1 point  (0 children)

    I was wondering if something similar could be done using Euler a step variation to get different images of the same fictional person. I'm not sure if the face stays the same at different steps though...

    [–]omnidistancer 0 points1 point  (1 child)

    I'm implementing something along the same lines but with different models for the motion transfer and upscaling(could possibly go above 2k if everything works out ok). Very interesting to see your amazing results :)

    Do you mind share the driving video or at least some suggestion on how to get something similar? The expressions look amazing!

    [–]Sixhaunt[S] 1 point2 points  (0 children)

    It's just a short clip of a tiktoker making some facial expressions. I mentioned in the original comment the guy who gave me the clip. I ended up having to find it again myself for a higher-quality version.

    I uploaded the short clip I used from the video here though: https://filebin.net/r0ynwdeg2emc61e0

    [–][deleted] 0 points1 point  (0 children)

    Wow, this was well done.

    [–]Zyj 0 points1 point  (1 child)

    That slight smile...

    [–]Sixhaunt[S] 0 points1 point  (0 children)

    https://imgur.com/a/jfkksoh

    there's some stills if you're interested.

    [–]The_Irish_Rover26 0 points1 point  (0 children)

    Very cool.

    [–]Silly-Slacker-Person 0 points1 point  (1 child)

    I wonder if soon it will be possible to animate two characters talking at the same time

    [–]Sixhaunt[S] 1 point2 points  (0 children)

    I dont see why you cant make a face detector that then crops videos around the heads, runs that video through a similar process to what i did, then splices it back into the original video to have as many people talking as you want

    [–]vs3a 0 points1 point  (0 children)

    This reminds me of the Faestock-from-DeviantArt days.

    [–][deleted] 0 points1 point  (0 children)

    Game changer!

    [–]AlbertoUEDev 0 points1 point  (0 children)

    Ohh I was looking something like this 🤩

    [–]BinyaminDelta 0 points1 point  (0 children)

    This is the future.

    [–]LordTuranian 0 points1 point  (0 children)

    Hopefully these are the kind of graphics we will see in the next Skyrim and Fallout game.

    [–]yehiaserag 0 points1 point  (0 children)

    Respect, man, I wish you all the best. Even more respect because you are sharing with the community.

    [–]InfiniteComboReviews 0 points1 point  (0 children)

    This is awesome, but there is something very... off-putting about it. Like this is how I'd expect Skynet to try to infiltrate a human base or something.

    [–]Promptmuse 0 points1 point  (0 children)

    Wow, thanks for sharing your process.

    Every day I’m seeing something new and groundbreaking.

    [–]purplewhiteblack 0 points1 point  (0 children)

    5 years from now is going to be crazy

    [–]wrnj 0 points1 point  (1 child)

    One question: how usable is a DreamBooth model created only with training images that are all the same kind of closeup portrait, with the same background and clothing? I noticed that if I train a model only with face selfies, the output generations I get are 1:1 the kind of frames that were in the training data, with no variety whatsoever.
    Do you add some kind of full-body images of the fictional person to the DreamBooth training set? Thanks.

    [–]Sixhaunt[S] 1 point2 points  (0 children)

    The plan today is to use the 27 images to train a good model for the face, then I'll be using that to generate more photos of her. If I have difficulty getting certain shots, I can do them with the normal 1.5 model and then infill the upper body with the model of her, to get a new training image with the right composition.

    [–]widgia 0 points1 point  (0 children)

    Impressive!

    [–]GoldenHolden01 0 points1 point  (0 children)

    Holy shittttty

    [–][deleted] 0 points1 point  (1 child)

    When you say you'll "train an algorithm", what's that process actually entail?

    [–]Sixhaunt[S] 0 points1 point  (0 children)

    When you say you'll "train an algorithm" , what's that process actually entail?

    I don't think I said that anywhere, from what I can tell. I trained a model using the StableDiffusion/DreamBooth algorithm. It retrains the weights of the denoising model, and it's done by feeding it data of a specific person from various angles and with various facial expressions so it can replicate that person. What I did was find a way to use a single image to generate all the input images required to train the model.

    https://www.reddit.com/r/AIActors/comments/yssc2r/genevieve_model_progress/

    This means you can generate a consistent person in Stable Diffusion without using celebrity names, instead using a person you generated from scratch.
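The "generate all the input images from one image" idea boils down to animating the face and then sampling frames across the clip so the training set covers a spread of poses and expressions. A minimal sketch of the sampling step (the function name is illustrative; the OP's actual selection method isn't specified):

```python
def sample_frame_indices(total_frames, n_samples):
    """Pick n_samples frame indices spread evenly across a clip,
    so the training images cover the full range of poses/expressions."""
    if n_samples >= total_frames:
        return list(range(total_frames))
    step = total_frames / n_samples
    return [int(i * step) for i in range(n_samples)]
```

For example, pulling 27 evenly spaced frames from a 20 fps, 16-second clip (320 frames) gives roughly one training image every 12 frames.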

    [–]LynnSpyre 0 points1 point  (0 children)

    REALLY nice! I've done similar stuff. What tools did you use to get this? This is super smooth

    [–]gtoal 0 points1 point  (1 child)

    You know the theory that everyone has a double... basically, there are not enough faces to go around for everyone to get a unique one ;-) ... I suspect that a person can be found to match any realistic generated face, so using these to avoid litigation might not be as effective as you hope!

    [–]Sixhaunt[S] 1 point2 points  (0 children)

    They wouldn't be able to get anywhere with litigation, though. No input was ever of them, so the similarities wouldn't matter. It's already tough enough for established actors to take legal action over their likeness when it isn't explicitly them; Elliot Page tried to go after The Last of Us, for example. People have made animated films or high-quality 3D renders of people that don't exist all the time, and it's never been an issue, even when some random person finds that it looks an uncanny amount like them.

    [–]Mystvearn2 0 points1 point  (2 children)

    Wow. This is great.

    Is there a YouTube video on the step-by-step process? Also, is it possible to run this thing locally? I have a 3060, which I think could be of use. The processing time doesn't really matter to me.

    [–]Sixhaunt[S] 0 points1 point  (1 child)

    Someone reached out and wants to do a video about it, so I don't know if it's going to be a tutorial or a showcase or what, but I just have the Google Colab that I put together quickly. This was my first try at this, so it's still early on; it was done fairly lazily and it's not efficient, but you can find the link to the colab in the comments for reference. I just mashed together the demos for the different things I wanted to use, but I'm redoing the entire thing right now and I'll have a better colab out in the future. You should be able to follow the local installation steps for each part to run it locally, though.

    [–]Mystvearn2 0 points1 point  (0 children)

    Thanks. I have no coding background. I managed to install Stable Diffusion locally and to install the model based on the YouTube tutorial. If you asked me to do it again without consulting the video, I'd be lost 😂

    [–]LordTuranian 0 points1 point  (8 children)

    How did you do this? This is amazing. I want to make something like this too.

    [–]Sixhaunt[S] 1 point2 points  (7 children)

    I explained it in a comment and linked to my google colab for it, but basically:

    • use a driving video plus a character image to generate a new video of the character with the Thin-Plate Spline Motion Model
    • upscale it 2× to 512×512, then fix the faces on each frame (I used GFPGAN)
    • recombine the frames into a video
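A small gotcha in the last step: the frames have to be recombined in numeric order, and a plain string sort puts frame_10 before frame_2. A minimal sketch of the ordering (helper name is illustrative, assuming each filename contains a frame number):

```python
import re

def natural_frame_order(filenames):
    """Sort frame filenames by the first number they contain, so that
    'frame_10.png' comes after 'frame_9.png' (a plain string sort would not)."""
    def frame_number(name):
        m = re.search(r'\d+', name)
        return int(m.group()) if m else -1
    return sorted(filenames, key=frame_number)
```

Zero-padded names (frame_0001.png) avoid the issue entirely, which is why most frame-extraction tools pad by default.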

    [–]LordTuranian 0 points1 point  (6 children)

    Oh, I accidentally skipped over that comment because I can't understand a lot of the terminology; I'm new to this kind of stuff. But thanks anyway. :)

    [–]Sixhaunt[S] 1 point2 points  (4 children)

    With the Google Colab I made, you can just run the first section, which sets up the files and such, then swap out the default video and image for your own (you can see where they are located and what they are named in the "settings" section). Then you just click the run/play buttons for each section, in order, until the end. It will take some time to process, but then it will produce an mp4 file for you to download.

    [–]LordTuranian 0 points1 point  (3 children)

    How do I use the google colab on my PC? Do I just use it straight from the browser or do I have to use another program?

    [–]Sixhaunt[S] 1 point2 points  (2 children)

    The nice thing about Google Colab is that it runs on Google's servers rather than your computer. It basically spins up a virtual machine to run the code; you control it through your browser and can download files from it afterwards. When you are on the page, you can basically just click the play button next to a chunk of code and it will run that code. Do it in order, follow any instructions, and you'll get your results.

    [–]LordTuranian 1 point2 points  (1 child)

    Awesome. Thanks again.

    [–]Sixhaunt[S] 1 point2 points  (0 children)

    no problem! I'm working on a new version of the colab along with someone else. I'm excited to show it off once it's working

    [–]Sixhaunt[S] 1 point2 points  (0 children)

    A youtube channel called PromptMuse reached out to me the other day and is planning to cover this in a video soon, so it might be more digestible in that format.

    I hadn't heard of the channel before she reached out, but it's actually really cool and covers a range of topics in the AI space, especially with SD.

    [–]midihex 0 points1 point  (0 children)

    A great use of TPSMM! I'm familiar with it, so here are some thoughts for you. The default output video quality of TPS is a bit meh (it's VBR, quality=5), so this is what I settled on...

    import imageio
    from skimage import img_as_ubyte

    imageio.mimsave(
        output_video_path,
        [img_as_ubyte(frame) for frame in predictions],
        codec='libx264rgb',
        pixelformat='rgb24',
        output_params=['-crf', '0', '-s', '256x256', '-preset', 'veryslow'],
        fps=fps,
    )

    Which is lossless x264 (CRF 0).

    Also, I'm not sure a pre-upscale before GFPGAN is needed for this usage: GFPGAN upscales anywhere up to 8× and then applies the face restore, and it can also use Real-ESRGAN for the parts of the frame that GFPGAN doesn't touch.

    Saw someone mention CodeFormer - it's great for stills but falls apart with video; it can't keep coherency like GFPGAN can.

    Illustrious_Row_9971 on Reddit wrote a Gradio colab version of TPS that you can drag and drop onto. I haven't got the link atm, but it'll show up with a search, I think.

    For the final output I always render to lossless (HuffYUV or FFV1); it retains so much more detail than mp4.

    [–]Automatic-Respect-23 0 points1 point  (0 children)

    Great job!

    Can you please share the driving video?

    edit:

    Sometimes my photo doesn't fit the driving video, and the results are too poor to use for training. Do you have any suggestions?

    Thanks a lot!