What fascinates me about generating images with VQGAN+CLIP is that it CAN generate depth and drama, but only if you know how to ask for them.
"A herd of sheep grazing on a lush green hillside" alone
vs with "amazing awesome and epic" added
ai-weirdness.ghost.io/the-ar…
Because CLIP is trained on internet images and text, it associates "good" images with certain phrases.
"A herd of sheep grazing on a lush green hillside" before vs after adding "in the style of disney trending on artstation | unreal engine"
Adding "by Bob Ross" to "a herd of sheep grazing on a lush green hillside" did get CLIP+VQGAN to improve the composition dramatically, but gave all the sheep Bob Ross hair.
Jul 2, 2021 · 3:38 PM UTC
Adding "by Tim Burton" to "a herd of sheep grazing on a lush green hillside" got CLIP+VQGAN to generate this very cool-looking image. Not sure what happened to the sheep though.
I hate that one of the most effective ways to prompt CLIP+VQGAN to generate a realistic and attractive landscape is to ask for this:
"A herd of sheep grazing on a lush green hillside | dramatic atmospheric ultra high definition free desktop wallpaper"
Using the spammy "A herd of sheep grazing on a lush green hillside | dramatic atmospheric ultra high definition free desktop wallpaper" prompt as a starting point leads CLIP+VQGAN to some irritatingly gorgeous places.
Here, I added "cubist cezanne".
I had VQGAN+CLIP generate "A herd of sheep grazing on a lush green hillside | dramatic atmospheric ultra high definition free desktop wallpaper by lisa frank" and got this absolutely apocalyptic landscape.
I think those slippery purple things may be what's become of the sheep.
This experiment illustrates an interesting aspect of generating stuff with big internet-trained models: they've seen a lot of crummy examples of what you're looking for, and those are just as valid to them as the good ones.
It CAN generate the good stuff. But how do you ask for it?
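One low-tech way to explore that question is to generate a batch of prompt variants from a single base description and compare the results side by side. The snippet below is only a minimal sketch: the helper names `build_prompt` and `prompt_variants` are made up for illustration, and joining every modifier with " | " is an assumption (the prompts in this thread mix plain appended text with "|" separators). It only builds the prompt strings; pasting them into the notebook is still up to you.

```python
BASE = "A herd of sheep grazing on a lush green hillside"


def build_prompt(base, modifiers=(), sep=" | "):
    """Join a base description with optional modifier phrases.

    Hypothetical helper: the " | " separator is an assumption about how
    the prompt is written, not a documented requirement.
    """
    parts = [base.strip()] + [m.strip() for m in modifiers if m.strip()]
    return sep.join(parts)


def prompt_variants(base, modifier_sets):
    """Return the bare base prompt plus one prompt per modifier set."""
    return [build_prompt(base)] + [build_prompt(base, mods) for mods in modifier_sets]


# Modifier phrases taken from the examples in this thread.
variants = prompt_variants(BASE, [
    ["amazing awesome and epic"],
    ["in the style of disney trending on artstation", "unreal engine"],
    ["dramatic atmospheric ultra high definition free desktop wallpaper"],
])

for v in variants:
    print(v)
```

Keeping the unmodified base prompt as the first variant gives a baseline, so any change in composition or drama can be credited to the added phrase rather than to the base description.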
For more technical details on CLIP+VQGAN and other methods of steering CLIP, plus some gorgeous example images, I recommend this post by @sea_snell:
Recently my Twitter timeline has been completely taken over by artwork generated with @OpenAI's CLIP model. So I figured I'd write a blog post about it. In the blog I follow the evolution of this art scene and present some cool artwork along the way ml.berkeley.edu/blog/posts/c…
You can generate CLIP+VQGAN images yourself for free! I used a version by @RiversHaveWings inspired by @advadnoun's Big Sleep notebook.
Tutorial linked here:
Here is a tutorial on how to operate VQGAN+CLIP by Katherine Crowson! No coding knowledge necessary.
I can't guarantee I'll see all the requests we get, so if you want a surefire way to have your image generated, make it yourself! It's super fun.
docs.google.com/document/d/1…
Here's an online @runwayml demo of a much earlier AI called AttnGAN. It tries.
No, you're absolutely right, the sheep are uniformly cursed.
Part of why I chose a herd of grazing sheep is that image recognition algorithms have historically had trouble distinguishing sheep from the landscape: aiweirdness.com/post/1714519…