AI gymnastics horror video sends limbs flying; LeCun: video generation models don't understand physics at all
An AI-generated gymnastics video has drawn nearly one million views, and LeCun and other prominent figures have even gotten into an argument over it.
A gymnastics performance... well, why wouldn't it count as one?
Judging from the watermark in the upper right corner, the video was generated by Dream Machine (from Luma AI), which was once hailed as the "next generation" of text-to-video.
Nobody could sit still after watching it. The discussion that followed centered on a familiar question in AI video: whether AI understands the laws of physics.
LeCun spoke directly:
Video generation models don’t understand basic physics. Let alone the human body.
Pedro Domingos, a professor of computer science at the University of Washington, also shook his head after reading it:
AGI may not be as imminent as some people expect.
Abnormal bird food is outrageous
Since Sora came out, the topic of "whether AI understands the laws of physics" has attracted more and more attention.
The following "night scene of a hermit crab using a light bulb as its shell", generated by Sora, is a classic example. The interaction between the waves and the beach is rendered in fine detail, and the fine hairs on the hermit crab's legs are vivid too.
picture
Compared with real photos of similar scenes, the only obvious flaw is that the light bulb should not light up because it has no power.
picture
The same is true of Luma AI's Dream Machine, which recently generated a highly realistic first-person view of an abandoned house:
picture
As a result, many people believe that video generation models such as Sora and Luma's have grasped simple physical laws.
However, the video released this time is simply too outrageous.
Not only do the gymnast's arms and legs fly around, but seemingly impossible feats keep happening:
picture
Even this difficult mid-air somersault would make Newton angry:
picture
So much so that viewers said it was less scary than funny.
picture
It is so absurd that LeCun commented bluntly that video generation models do not understand physics.
He added that Sora and other video generation models have similar problems, and that video generation technology will undoubtedly improve over time. But:
A learning system that truly understands physics will not be generative. Birds and mammals, for example, understand physics better than any video generation system, yet none of them can generate detailed videos.
Others raised a similar point:
Even if the AI video generation model evolves well in the future and the quality of the generated videos is "perfect", does that mean it understands physics?
The views of LeCun and others immediately drew pushback from other users:
Birds and mammals do generate detailed videos; they just do so inside their brains, where the result cannot be displayed.
However, this rebuttal did not convince LeCun.
In addition, there are many people who hold opposing views.
For example, Lucas Beyer, a researcher on Google's DeepMind/Brain team, pointed out:
It's like showing an image generated by DALL·E mini a few years ago and then saying that current image generation methods are doomed to fail.
After all, the images generated by earlier text-to-image models looked like this:
picture
So why would the model generate such an outrageous video?
Some users attribute it to a lack of gymnastics performance data. Others think that blurring of body parts in the training data keeps the model from learning the structure of the human body, so it cannot keep limb movements continuous.
Video generation is computationally more complex and highly context-dependent, placing greater demands on carefully annotated training data, needs that are currently underserved.
Some time ago, SD 3 also stumbled, producing poor results when generating human bodies. Users discussed the issue at the time as well: overly strict data filtering may have mistakenly removed some harmless adult images, hurting the model's understanding of human body structure.
One More Thing
In addition to the gymnastics video generated by Luma AI's Dream Machine, Runway's Gen-3 also...
picture
The same three heads and six arms:
picture
The same mid-air levitation:
picture
Reference links:
[1] https://x.com/ylecun/status/1807497091964449266
[2] https://x.com/giffmana/status/1807511985807908926
[3] https://x.com/EricDai_BioE/status/1807540558216454281
[4] https://x.com/Grady_Booch/status/1807556807982010451