
[–]smoowke 352 points353 points  (43 children)

It is frustratingly impressive. However, I've noticed with the walk cycles I see in the vids, it almost subliminally switches from the left to the right foot when they cross. Happens multiple times...

[–]eugene20 149 points150 points  (29 children)

It's crazy when you spot it if you didn't the first time

[–]Spepsium 80 points81 points  (8 children)

Not spotting it the first time kinda shows how hard it is to get AI video right. This is a coherent elephant object, and at a high level it makes sense, so it's understandable why a model produces this. But it has no physical laws binding the actions within the video, so it falls short and we get weird details.

[–]lobotomy42 16 points17 points  (2 children)

What's bizarre is that it can infer enough to construct an approximation of a physical world model, a model detailed enough to include "shape of an elephant" but not detailed enough to understand "positioning of legs".

[–]GoosePotential2446 2 points3 points  (1 child)

I think it's because leaves and elephants are things which are super visual and can be described and tagged. Also, it's likely that not every video they're using as training data is based on real-world physics, so it's harder to approximate. I'd be really interested to see if they can supplement this model with some sort of physics engine.

[–]Wild_King4244 0 points1 point  (0 children)

Imagine Sora + Blender 3D.

[–]RationalDialog 2 points3 points  (4 children)

In essence the AI isn't really intelligent and doesn't understand what it's actually generating.

[–]ComprehensiveBoss815 2 points3 points  (1 child)

In essence the AI doesn't care about the same physical constraints as humans.

[–]RationalDialog 0 points1 point  (0 children)

I'm gonna say it didn't learn them and doesn't understand them.

[–]Spepsium 0 points1 point  (0 children)

In essence you will see I didn't say it was intelligent. I said it makes sense the "model" produces this.

[–]Vivarevo 0 points1 point  (0 children)

It's dreamlike

[–]pilgermann 30 points31 points  (16 children)

AI hallucinations are such a trip, because it "understands" aesthetics but not the underlying structures, so it creates these illusions that ALMOST pass the sniff test. It's really common for there to be a third arm where there should be a shadow, say, and it looks aesthetically coherent.

We really need a word for this phenomenon, as it's almost an art technique unto itself. Like trompe l'oeil, but really its own breed of optical illusion.

[–]MagiMas 6 points7 points  (9 children)

I really do wonder if this is a problem that will fix itself by making models more and more multimodal (so that they can learn from other sources how walk cycles actually work) or if we will need to find completely different architectures to really get rid of AI hallucinations.

[–]snakeproof 13 points14 points  (5 children)

I imagine future AI video generators will have some sort of game engine-esque physics simulator that mocks up a wireframe of movement before using that as a basis for the video generation.
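Something like this, as a toy sketch in Python. To be clear, it's pure speculation and every name below is invented, not how Sora actually works: stage 1 is a trivial kinematic walk cycle that hard-codes the constraint these videos keep breaking (opposite feet stay half a cycle out of phase), and stage 2, the conditioned video generation, is left out entirely.

    import math

    def walk_cycle(t, stride_hz=1.0):
        # Toy kinematic model: opposite feet are locked 180 degrees out of
        # phase, the exact constraint the generated walk cycles violate.
        phase = 2 * math.pi * stride_hz * t
        return math.sin(phase), math.sin(phase + math.pi)  # (left, right)

    def pose_track(duration_s, fps=30):
        # Stage 1 output: per-frame pose constraints that a ControlNet-style
        # conditioning pass (stage 2, not implemented here) could consume.
        return [walk_cycle(i / fps) for i in range(int(duration_s * fps))]

    if __name__ == "__main__":
        for left, right in pose_track(2.0):
            assert abs(left + right) < 1e-9  # feet can never swap mid-stride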

[–]Curious-Thanks3966 6 points7 points  (0 children)

Like some sort of ControlNet 2.0

[–]capybooya 3 points4 points  (3 children)

Someone found that an earlier Sora scene of a car driving was suspiciously similar to a specific track from a driving game. I'm wondering if this is just mimicking some very similar training material, and not representative of real-world creativity when faced with more complex prompts.

[–]Smidgen90 1 point2 points  (1 child)

ClosedAI doing some video to video behind the scenes would be disappointing but not unexpected at this point.

[–]SodiumChlorideFree 0 points1 point  (0 children)

Could also be overfitting if there weren't enough training examples for that specific situation.

[–]Which-Tomato-8646 0 points1 point  (0 children)

Either way, it's still useful for combining concepts into a video even if it's not entirely unique

[–]ASpaceOstrich[🍰] 1 point2 points  (0 children)

It'll need to understand things, which it currently can't do.

[–]SaabiMeister 1 point2 points  (0 children)

If I'm not mistaken, Sora is similar to ChatGPT in that it uses a transformer model. Transformers are impressive at guessing what comes next, but they are not architected to build an internal world model. They're in fact quite impressive at guessing given that they're purely statistical in nature, but they will never 'understand' what is really going on in a scene. JEPA-based models are needed for this, according to LeCun.
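(For illustration, here's the "guessing what comes next" loop in a bare-bones, runnable toy form. It's a stand-in for the idea, not Sora's actual spacetime-patch diffusion architecture; the point is just that nothing in the loop carries a persistent world state between steps.)

    import torch

    def generate(model, tokens, steps):
        # Greedy next-token loop: the model only ever scores "what comes
        # next"; no explicit world model persists between iterations.
        for _ in range(steps):
            logits = model(tokens)              # (seq_len, vocab_size)
            next_tok = logits[-1].argmax().reshape(1)
            tokens = torch.cat([tokens, next_tok])
        return tokens

    # Stub "model" so the sketch runs: random scores over a 10-token vocab.
    model = lambda toks: torch.randn(toks.shape[0], 10)
    print(generate(model, torch.tensor([0]), steps=5))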

[–]RelevantMetaUsername 0 points1 point  (0 children)

I think it's a bit of both. Allowing the model to learn from different sources seems to imply reworking the architecture to be able to assimilate different kinds of data.

[–]slimslider 2 points3 points  (0 children)

To me it's just like dreaming. It's all normal until you really look at the details.

[–]capybooya 5 points6 points  (1 child)

I know that AI does not work like the human mind, from listening to smarter people than me debunk the wildest claims. But seeing this I'm very much reminded of just how dreams seem to make sense in the moment, you're just not able to put your finger on exactly what is wrong...

[–]onpg 1 point2 points  (0 children)

They aren't emulating the human brain, but we are definitely borrowing some tricks we learned from how neural networks work.

[–]Graphesium 1 point2 points  (0 children)

"Almost but not quite" is pretty much how AI will remain for the forseeable future based on current tech. For any industry that requires deterministic results based on an emulated reality, today's AI systems aren't even close to making an impact.

[–]-Harebrained- 0 points1 point  (0 children)

Oh, like that Magritte painting of the lady on horseback in the forest! What's that one called... The Blank Signature, maybe? Call it that, a Blank Signature.

[–]nullvoid_techno 0 points1 point  (0 children)

So just like humans?

[–]D4rkr4in 8 points9 points  (0 children)

this is going to be the "how many fingers" test for AI videos

[–]pablo603 3 points4 points  (0 children)

Damn that's trippy lol

[–]BlakeMW 0 points1 point  (0 children)

I absolutely could not spot it on my phone, but I saw it immediately on my PC.

[–]Corsaer 11 points12 points  (1 child)

I know it's a byproduct and seen as something not working correctly, but I'm always impressed how visually smooth and subliminal these Escheresque transitions are. Like the brain and eyes just want them to work.

[–]ninjasaid13 1 point2 points  (0 children)

but I'm always impressed how visually smooth and subliminal these Escheresque transitions are

I believe this will lead to innovative camera techniques or visual effects that are difficult to achieve manually.

[–]organic_bird_posion 3 points4 points  (0 children)

I do wonder how hard it would be to fix that stuff. I'm constantly playing Photoshop Whac-A-Mole with static 2D images, which is pretty straightforward to fix. This might be stillborn because it's just easier to simulate it in Blender.

[–]cleroth 2 points3 points  (0 children)

They specifically mentioned these problems in the first announcement.

[–]ThickPlatypus_69 1 point2 points  (0 children)

Someone in another thread pointed out it looks like an Asian elephant in profile, but an African elephant in front view.

[–]RationalDialog 1 point2 points  (0 children)

I really think this is kind of a typical IT project: the last 10% takes 90% of the time. Fixing these tiny issues will probably not be easy. Also, I wonder how many of the generated videos were utter crap if this release still has such obvious problems? The walking part was already an issue in all of the initial release videos.

[–]yaosio 0 points1 point  (0 children)

In one of the first batch of videos a woman's legs rotate to switch places. It's funny.

[–]_WhoisMrBilly_ 0 points1 point  (0 children)

Wow! That's weirdly subtle, exciting, and oddly hypnotic, and yet it can't be unseen once you see it.

I feel strange about this, but am INCREDIBLY eager for the open beta on this.

[–]FeelsPepegaMan 0 points1 point  (0 children)

It’s so uncanny that I could see that being used in a horror movie in certain scenarios

[–]onpg 0 points1 point  (0 children)

A year ago they didn't even have legs, so... this progress is amazing.

[–]SkillPatient -2 points-1 points  (0 children)

The movement is so bad in Sora videos. I guess it's good for cheap ads. If people watched ads.

[–]caxco93 121 points122 points  (3 children)

Love when the legs swap position seamlessly

[–]EarthquakeBass 22 points23 points  (2 children)

“It’s a physics simulator”

[–]Captain_Pumpkinhead 13 points14 points  (1 child)

It kinda is. But since it's not "built" to be a physics simulator, it's not surprising to see it mess up physics.

[–]ConfusionSecure487 2 points3 points  (0 children)

It stretches physics, who knows, maybe that is possible? 🤔

[–]EVD27 69 points70 points  (6 children)

Generate "eLEAFant"

[–]Ramdak 11 points12 points  (4 children)

While it's not an Onlyfant...

[–]MerrySkulkofFoxes 0 points1 point  (1 child)

OnlyPachs - Elephants Being Naughty. That could potentially be created with enough compute and I wonder if there's a market for it. I'd bet there is.

[–]Smallpaul 0 points1 point  (0 children)

Maybe it is more of a human visual cortex simulator, because it seems to "choose" to violate physics in contexts where humans are less likely to notice.

[–]aldeayeah 0 points1 point  (0 children)

Camouflant

[–]Own_Engineering_5881 26 points27 points  (1 child)

amateur

<image>

[–]vanonym_ 0 points1 point  (0 children)

Cetelem!

[–]Dig-a-tall-Monster 33 points34 points  (4 children)

Everyone pointing out the legs should know that a year ago the legs wouldn't even look like legs. So instead of acting like AI videos are never going to be so good they can't be detected, how about we all start from the position of assuming they WILL be so good they can't be detected, and start working on potential solutions for that problem that will actually work.

[–]ThickPlatypus_69 0 points1 point  (3 children)

I think it's a hard limit with the current approach. We need something that actually employs logic and problem solving, and not just generates visual mimicry.

[–]Dig-a-tall-Monster 2 points3 points  (1 child)

Right but that's not really a problem. See, this is one AI system, trained to generate images from text. Think of it like the function of your brain that allows you to visualize a thought, trained in reverse from a literal lifetime of real world data. Then there's the function of your brain that enables logic, the function that enables reasoning, etc. and those are trained on your real world personal experiences as data (which includes anything you see/hear/feel/smell/taste/etc).

And we can train different AIs to mimic each one of those systems, and we have done so for several. Then all we need is an AI trained to unify the input, output, and interaction of all those systems into a coherent singular entity. Bing bang boom. Done. We already have the models capable of doing the memory recall, the information learning, the speech interaction, the image generation, the multimedia input analysis, and things like math or pattern recognition. The next thing we need is an intermediate AI that is trained to control other AI models, and uses all of them together to control and self-correct for hallucinations, and that'll basically put us right at the final step to true AGI.

[–]i860 4 points5 points  (0 children)

Right but that's not really a problem

lol dude, it's a huge problem

[–]BluePandaCafe94-6 0 points1 point  (0 children)

Exactly. I watched the video and thought the physics were all wrong. The elephant moved too quickly, it was too light on its feet, and there was no rebound or ripple with each heaving step like a real elephant would have. It felt like the AI-generated elephant had no mass, as if the AI doesn't understand that elephants are big and heavy and have more ponderous, inertia-conscious movements.

[–]dancho-garces 98 points99 points  (7 children)

My son made it, it’s a good idea

[–]Such_Drink_4621 16 points17 points  (6 children)

I made my son, it was a good idea

[–]staccodaterra101 9 points10 points  (0 children)

your son made me, was not a good idea

[–]Open_Marzipan_455 2 points3 points  (3 children)

*doubt*

[–]Such_Drink_4621 1 point2 points  (2 children)

u doubt i made my son?

[–]Open_Marzipan_455 2 points3 points  (1 child)

Yes, you did not make him, you generated him. You gave your partner a prompt (and forgot about the negative prompt in the process) and let the model do the work for you.

[–]Playful-Possession35 1 point2 points  (0 children)

Good son, my idea made it.

[–]kidelaleron [Stability Staff] 8 points9 points  (1 child)

I wonder how much computing it requires at this stage.

[–]Open_Marzipan_455 1 point2 points  (0 children)

how much? yes.

[–]Striking_Pie_3716 19 points20 points  (19 children)

It's crazy, but the legs.

[–]Spepsium 17 points18 points  (11 children)

Legs in AI video appear to be the new hands in AI photos

[–]Graphesium -1 points0 points  (9 children)

Hold up, when did AI solve hands?

[–]ifixputers 0 points1 point  (8 children)

Adetailer

[–]Graphesium 2 points3 points  (7 children)

Adetailer is just auto-inpainting. It throws more iterations at the problem, not fixing the underlying problem itself.
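(For anyone wondering, that detect-then-inpaint loop looks roughly like the sketch below. detect_hands and inpaint are placeholders here, not Adetailer's actual API; a real setup would use a YOLO-style detector and a diffusion inpainting pipeline such as diffusers' StableDiffusionInpaintPipeline.)

    from PIL import Image

    def detect_hands(image):
        # Placeholder: a real version runs an object detector and returns
        # bounding boxes as (x0, y0, x1, y1) tuples.
        return []

    def inpaint(image, mask, prompt):
        # Placeholder for a diffusion inpainting call; only the masked
        # region is regenerated, so the hand gets far more effective pixels.
        return image

    def fix_hands(image, prompt="detailed hand"):
        for box in detect_hands(image):
            mask = Image.new("L", image.size, 0)
            mask.paste(255, box)  # white = area to regenerate
            image = inpaint(image, mask, prompt)
        return image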

[–]ifixputers 0 points1 point  (6 children)

Sure seems like it fixes the underlying problem, just not in a way you want, but ok

[–]Graphesium 2 points3 points  (5 children)

SD sucks at hands and you think throwing more SD at it solves the problem? Here's an entire thread of why Adetailer doesn't fix hands.

[–]ifixputers -1 points0 points  (4 children)

Yeah, SD sucks less when you’re generating a single body part (versus an entire human body, background, foreground etc all at one time). Who’d a thunk?

Cool thread, but it works just fine. Cherry pick a Reddit thread, but ignore mountains of evidence of it working flawlessly elsewhere on the internet 😂

[–]Graphesium 2 points3 points  (3 children)

You must have a very low bar of quality for hands. Until SD can generate hands as consistently as it generates generic good looking faces, it hasn't solved anything.

[–]ifixputers 1 point2 points  (2 children)

Well guess what, my faces all come out kinda shitty. Until I use… Adetailer lol.

Maybe you suck at reading documentation, maybe your base model sucks. I don’t know. But it works great for me.

[–]Ramdak 10 points11 points  (6 children)

To the untrained eye, this is just real.

[–]5050Clown 8 points9 points  (5 children)

Untrained on the object permanence of a four-legged animal?

[–]Ramdak 4 points5 points  (3 children)

If I show this to most people, they won't notice those things. This animation is coherent enough to trick the "untrained" eye. They'll know it's fake because it's impossible, but make something realistic and they won't notice. It's already happening with images, and it was happening before AI too: image retouching, video effects and post. Unless you know what to look for, you'll buy it.

[–]Smallpaul 0 points1 point  (2 children)

You're just using a word incautiously.

"To the untrained and UNSUSPECTING eye it's real."

But if you ask someone to look closely for errors, they don't have to be "trained".

[–]Ramdak 1 point2 points  (1 child)

My apologies, English isn't my main language.

[–]Smallpaul -1 points0 points  (0 children)

It's a small mistake. Any English speaker could make it.

[–]TranscendentalObject 1 point2 points  (0 children)

This would fool a tremendous amount of people if it were just an elephant, historical object permanence or not.

[–]kwalitykontrol1 20 points21 points  (10 children)

I want to see a very simple video of a person standing and drinking a glass of water. The person holds a glass of water. They lift it. Drink from it. Put the glass back down. When they can do that I will be impressed.

[–]pilgermann 9 points10 points  (2 children)

They're really going hard on the pachyderms (mammoth video earlier), which suggests some of the non-elephant subjects aren't looking so hot. Or maybe they just like elephants. Who knows.

[–]eikons 21 points22 points  (0 children)

My guess is that elephant footage is uniquely clean learning material for the model for two reasons: elephants are more often filmed with high-quality, stabilized equipment and good lighting, but more importantly:

Elephants move rather slowly (relative to the camera frame). On 30fps video, a lot of detail gets lost when you film small animals. A combination of motion blur and our brains "filling in the gaps" does a lot of work. But for Sora, that may be a major hurdle.

If you want lots of clean footage of animal movement to display the capabilities of your video model, you can either use slow motion mouse footage which there isn't a lot of, or regular elephant footage which there is a lot of.

[–]ASpaceOstrich[🍰] 1 point2 points  (0 children)

Lot of elephant footage for it to copy from I'm guessing

[–]Fhhk 2 points3 points  (0 children)

I remember way back, like 2 weeks ago when I saw the first Sora video of people walking around and thought, yeah well, they can't do convincing lip sync. That's too hard. Then a day or two later, I saw an AI-generated video that had nearly perfect lip sync matching an AI-generated voice.

It won't take long. Seriously, give it like 6 hours and an AI-generated video of a person drinking a glass of water will pop up in your feed.

[–]Merzant 2 points3 points  (1 child)

Best I can offer is a handsome dude contemplating a glass of indeterminate liquid in slow mo as the camera dollies around him.

[–]-Sibience- 0 points1 point  (1 child)

This video at around 2:07 is as close as I've seen to that right now: https://www.youtube.com/watch?v=QoB-mpWrH20

[–]kwalitykontrol1 -1 points0 points  (0 children)

So close

[–]-Harebrained- 0 points1 point  (0 children)

In a way, I think we were all impressed by those levitating beer bottles with manchildren making vague suckling gestures as the flames grew higher. 🍼🔥

[–]Mottis86 -2 points-1 points  (0 children)

The technology is not quite there yet.

[–]Perfect-Campaign9551 5 points6 points  (2 children)

I'm not sure we should care since they said it pretty much will be released "never"...

[–]TizocWarrior 4 points5 points  (0 children)

This. At this rate, it might never see the light because it's deemed "too dangerous" to fall in the hands of the average internet user.

[–]Sinaxxr 2 points3 points  (0 children)

We should care because even if this company decides not to release it, this technology will exist anyway. Whether it's our own government or others, it will be used. Giving it to the people should be the least of our concerns.

[–]farcaller899 10 points11 points  (4 children)

I’d like to see something in a different style, maybe animated. All Sora videos I’ve seen so far are like someone filmed them with a Sony camcorder from 1992. Contrasty live-action.

[–]_-inside-_ 2 points3 points  (3 children)

There are more demos; some are Pixar-like animations. A user posted a YouTube link here in the comments, check it out. The animations are a bit crippled though.

[–]Graphesium -1 points0 points  (1 child)

Animations are crippled probably because the only company OpenAI is afraid of shamelessly ~~stealing~~ borrowing training data from is Big Daddy Disney, who has the resources to sue them into the ground.

[–]farcaller899 -1 points0 points  (0 children)

And there are relatively few animations available for scraping off YouTube, compared to live action video.

[–]farcaller899 -1 points0 points  (0 children)

That’s interesting! Especially since knowing some limitations certainly hints at other likely limitations.

[–]lannead 2 points3 points  (1 child)

Aha, camouflage! This is how Mr. Snuffleupagus used to stay hidden.

[–]-Harebrained- 0 points1 point  (0 children)

Why do elephants paint their toenails? To hide in 🍒 trees.

[–]PM__YOUR__DREAM 2 points3 points  (0 children)

In a decade AI Scribblenauts is going to be a trip to play.

[–]Tasty-Exchange-5682 1 point2 points  (0 children)

Where did you get that? Can I see more?

[–]locob 1 point2 points  (0 children)

Make a walking piñata!

[–]tonyg3d 1 point2 points  (0 children)

That doesn't look anything like an elephant. They're not green for a start.

Back to the drawing board, OpenAI. ;)

[–]iamapizza 1 point2 points  (3 children)

Where was this posted?

[–]axord 0 points1 point  (1 child)

Here's a news article about it and other recent SORA vids.

[–]RZ_1911 1 point2 points  (0 children)

Guys, relax and keep breathing steadily. Just remember the hype around DALL-E 3, and how it was downgraded in the end :)

[–]TacoRockapella 1 point2 points  (0 children)

Watch the feet closely. It’s not that flawless or impressive then.

[–]Remote-Ad-8631 1 point2 points  (0 children)

Me eagerly waiting for videos of African kids creating airplanes with bottles

[–]TheProblematicG3nius 1 point2 points  (0 children)

Mamorest

[–]miciy5 1 point2 points  (2 children)

The shadow isn't perfect but it's convincing if you don't look closely

[–]SokkaHaikuBot 0 points1 point  (1 child)

Sokka-Haiku by miciy5:

The shadow isn't

Perfect but it's convincing

If you don't look closely


Remember that one time Sokka accidentally used an extra syllable in that Haiku Battle in Ba Sing Se? That was a Sokka Haiku and you just made one.

[–]miciy5 0 points1 point  (0 children)

First time I summoned you, I think

[–]Ferriken25 4 points5 points  (7 children)

I'm not impressed by a censored tool. I'd rather wait for SD3.

[–]reality_comes 4 points5 points  (5 children)

Also censored as far as we know.

[–]indoorhatguy 0 points1 point  (4 children)

Can you elaborate?

[–]vanonym_ 5 points6 points  (3 children)

the training data has been vastly purged of NSFW images. I suggest reading the research paper for more details, section 5.3.1, Data Pre-processing:

Before training at scale, we filter our data for the following categories:

  1. Sexual content: We use NSFW-detection models to filter for explicit content.
  2. Aesthetics: We remove images for which our rating systems predict a low score.
  3. Regurgitation: We use a cluster-based deduplication method to remove perceptual and semantic duplicates from the training data.
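(Loosely, that filter chain amounts to something like the sketch below; the thresholds and scorer outputs are placeholders of my own, not the paper's actual pipeline.)

    def keep_image(nsfw_score, aesthetic_score, cluster_id, seen_clusters,
                   nsfw_max=0.5, aesthetic_min=5.0):
        # 1. Sexual content: drop anything the NSFW detector flags.
        if nsfw_score > nsfw_max:
            return False
        # 2. Aesthetics: drop images the rating model scores too low.
        if aesthetic_score < aesthetic_min:
            return False
        # 3. Regurgitation: keep one image per near-duplicate cluster.
        if cluster_id in seen_clusters:
            return False
        seen_clusters.add(cluster_id)
        return True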

[–]skztr 2 points3 points  (1 child)

(2) is why DALL-E is so frustratingly bad.

They arbitrarily decide which images are "more beautiful" and train on those, and the model learns "everything must look like a Kubrick circlejerk". Great for press releases, atrocious for actual image generation.

[–]vanonym_ 0 points1 point  (0 children)

Agree, I'm less bothered by the NSFW filter than the aesthetic filter

[–]indoorhatguy 0 points1 point  (0 children)

How does that factor in with the variety of models and LoRAs that add NSFW?

Sorry if it's a stupid question.

[–]Meebsie 1 point2 points  (0 children)

Damn just wearing your thirst on your sleeve eh? Lol

[–]Snydenthur 2 points3 points  (1 child)

It's very impressive overall, but the walking itself unfortunately looks like this is just some elephant suit worn by two people.

[–]davewashere 2 points3 points  (0 children)

Doesn't that make more sense than an actual leaf elephant?

[–]gurilagarden 2 points3 points  (0 children)

Straight up. Nothing about anything they release impresses me. When I can generate it myself, based on my prompts, then I'll be impressed. Technology is an industry rife with vaporware and manipulative advertising meant to falsely generate hype and throw off the competition.

[–]Feisty-Pay-5361 1 point2 points  (0 children)

The walk is kinda floaty; I mean, elephants don't glide across so gently. It walks more like a cat. Uncanny.

[–]netgeekmillenium 1 point2 points  (0 children)

Check out the leaf elephant our son from India made.

[–]Captain_Pumpkinhead 1 point2 points  (0 children)

Damn, Palworld 2 looks great!

[–][deleted] 1 point2 points  (1 child)

These videos are just tech demos, and require a tremendous amount of dedicated H100s to compute a single video, taking as much as an hour to generate each one. For practical use by a massive audience, OpenAI would not be able to supply the massive amount of compute needed for millions of users. I suspect the publicly released Sora will max out at 6-second generations and the quality will likely be only slightly better than Runway/Pika. Nobody will be able to reach/provide this kind of quality for a massive consumer base without a 50x improvement in GPU/computer chip technology. It will likely be many more years before this kind of video generation quality is available to a massive audience.

[–]i860 0 points1 point  (0 children)

"Shhhhh! We've got shareholders to dupe!"

[–]MonkeyDRaffy 0 points1 point  (0 children)

Gimme

[–]TheRealMoofoo 0 points1 point  (1 child)

Do you mean to suggest that this isn’t a real eleafant?

[–]_-inside-_ 0 points1 point  (0 children)

These are clearly 2 guys wearing a leaf elephant costume, a Mechanical Turk video generator engine. Why else do you think it might take hours to generate this? /s

[–]OneOneBun 0 points1 point  (0 children)

I wonder how good it will be at recreating prehistoric animals

[–]crypticsage 0 points1 point  (0 children)

Time to generate a full Pokémon battle

[–]Muggaraffin 0 points1 point  (0 children)

What, no it isn’t. That’s Lettuce, my lettuce elephant. How’d it get all the way over there. 

[–]Big_Suggestion986 0 points1 point  (0 children)

If you got it, flaunt it. right?

[–]alonginayellowboat 0 points1 point  (0 children)

Your son's def a genius

[–]BEHEMOTHpp 0 points1 point  (0 children)

My son gave clothes to his pet

Very good idea

[–]Very_Loki 0 points1 point  (0 children)

woah that's not a real elephant??

[–]seoul_slave 0 points1 point  (0 children)

The elephant ate too many leaves 🍃

[–]Ai_Nerd_ 0 points1 point  (1 child)

When stable-sora?

[–]RiffyDivine2 0 points1 point  (0 children)

When it can be massively locked down and sold would be my guess. Don't expect this to be free.

[–]IHaveAPotatoUpMyAss 0 points1 point  (0 children)

Nope, this is real life. Have you never seen an elephant made of leaves?

[–]Jules040400 0 points1 point  (0 children)

This is alarmingly good, you have to be paying attention to notice the AI-ness.

If you're just scrolling through videos on your lunch break, you wouldn't give it a second thought

[–]masterace01 0 points1 point  (0 children)

The video plus this specific track... There's something quite cinematographic about this.

It's a vibe, so to speak.

[–]Singlot 0 points1 point  (0 children)

Walks like two guys in an elephant suit.

[–]headbopper96 0 points1 point  (0 children)

No shot it’s AI generated. Duh

[–]Hambeggar 0 points1 point  (0 children)

Besides the leg switching, this looks more real and natural than a lot of top-tier CGI. WTF.

[–]Paradigmind 0 points1 point  (1 child)

It's fake.

Because elephants are not made out of leaves.

[–]umg_bhalla 0 points1 point  (0 children)

damn bro you got a phd or smthing

[–]callmepls 0 points1 point  (0 children)

When will they show us Will Smith eating spaghetti

[–]HughWattmate9001 0 points1 point  (0 children)

I dunno if it's just because of the unrealistic elephant made from leaves, but it just looks like an Unreal Engine video from about 3 years ago that has been streamed at a very low bitrate, at like 420p or something. The lighting and shadows look too "video game" like.

[–]estellesecant 0 points1 point  (0 children)

Maybe generating wireframe models with physics constraints followed by video2video would be better? Really speculative though

[–]autumnalaria 0 points1 point  (0 children)

I do feel like a lot more work is going into these than just a text prompt.

[–]jvachez 0 points1 point  (0 children)

Looks like "Cetelem". Maybe because you'll have to ask Cetelem for a loan to pay for Sora.

[–]Purple-Run6652 0 points1 point  (0 children)

It's interesting, but resembles an elephant in a leaf costume rather than realistic movie CGI. A more creative prompt might have enhanced the look they were going for.

[–]Traditional-Bunch-56 0 points1 point  (0 children)

Will it be released to the public?

[–]admi101 0 points1 point  (0 children)

Noooo, not a single leaf bugging!

[–]dank_mankey 0 points1 point  (0 children)

this technology will be great when all the elephants are extinct

[–]Dense-Orange7130 1 point2 points  (0 children)

Meh, doesn't matter how visually impressive it is, if it isn't a local model it's no good.

[–]S-Markt 0 points1 point  (3 children)

How long does this need to render?

[–]smb3d 0 points1 point  (2 children)

One of the developers said in an interview that it was a "reasonable" amount of time. He didn't go into specifics, but I'd think maybe like 15-30 minutes from the way he was talking.

[–]Curious-Thanks3966 0 points1 point  (1 child)

Probably on a stack of H100s and not on a 4090.

[–]smb3d 0 points1 point  (0 children)

For sure.

[–]BlueNux 0 points1 point  (2 children)

Static images still render six or fused fingers on a regular basis, so I'm withholding excitement.

Give me a 5-second video of a person grabbing and eating sushi with chopsticks. Heck, make him eat cake with a fork even.

We all know AI is better at drawing/animating non-human objects and creatures. It can do some seemingly complicated stuff well, but fails at the most basic subjects.

[–]lobotomy42 4 points5 points  (0 children)

We all know AI is better at drawing/animating non-human objects and creatures.

Alternatively: We all know that people are better at recognizing mistakes in images and videos of people than of animals

[–]nzodd 1 point2 points  (0 children)

make him eat cake

When we get some kind of bloody AI-driven revolution I'm gonna be blaming you buddy.

[–]Ateist 0 points1 point  (0 children)

IMHO, they are going about it wrong: they are generating videos, whereas they should be creating individual images that are modifications of the original, and interpolations between them.
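(At its simplest, that keyframe-then-interpolate idea could look like the sketch below. The naive crossfade stands in for a real learned interpolator, RIFE/FILM-style; everything here is illustrative, not anyone's actual pipeline.)

    import numpy as np

    def inbetween(a, b, n):
        # Yield n interpolated frames between keyframes a and b. A linear
        # blend; a real system would use optical flow or a learned frame
        # interpolator instead.
        for i in range(1, n + 1):
            t = i / (n + 1)
            yield ((1 - t) * a + t * b).astype(a.dtype)

    def build_video(keyframes, n_between=7):
        # Expand sparse keyframes into a full frame sequence.
        frames = []
        for a, b in zip(keyframes, keyframes[1:]):
            frames.append(a)
            frames.extend(inbetween(a, b, n_between))
        frames.append(keyframes[-1])
        return frames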

[–]Erhan24 0 points1 point  (0 children)

All posts must be Stable Diffusion related.

[–]MAXFlRE -3 points-2 points  (0 children)

The shadows are straight-up awful.