OpenAI is teasing a stunning new text-to-video model called Sora. And early examples of what it can do are blowing up the internet.
Sora is an AI model that creates realistic video from a simple text prompt. But what has everyone talking is the apparent quality of the output: Examples provided by OpenAI show Sora generating videos up to a minute long that appear incredibly realistic and smooth.
Sora isn’t available to the public yet. (For now, only a select group of “red teamers” is testing the model for safety.) But it promises to have a big impact on the world of AI video generation.
So, what’s the big deal with Sora?
I got the answer from Marketing AI Institute founder/CEO Paul Roetzer on Episode 84 of The Artificial Intelligence Show.
“We've been saying for a while now that 2024 was going to be the year of AI for video,” says Roetzer. “That certainly seems to be holding true.”
Sora’s ability to produce 60 seconds of video is an incredible leap over other leading tools like Google Lumiere (5 seconds) and Runway (16 seconds).
It’s also leapfrogging existing tools in other ways.
Sora is clearly interpreting complex prompts with a high degree of accuracy. (Though the real proof will be when we all get access to test it.)
Sora also appears to do a great job of making sure videos “accurately persist” over multiple shots and extended takes, something AI video generation tools typically struggle to do.
Surprisingly, Sora also appears to simulate aspects of the real world it hasn’t explicitly learned—an emergent capability.
“We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction,” says OpenAI.
“This isn’t the ChatGPT moment for AI video yet, but it’s a big milestone,” says Roetzer.
While Sora isn’t available yet, OpenAI said it is giving access to some visual artists, designers, and filmmakers to understand how the model will impact creative professionals.
The tool isn’t taking away jobs anytime soon. But OpenAI clearly thinks it could have an impact.
Today, Sora generates 60 seconds of stunning video from scratch, which is impressive enough...
But tomorrow it could generate entire clips or films, bringing massive creativity and disruption to video-related industries.
“Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI,” says OpenAI.
AGI, or artificial general intelligence, refers to AI that is more intelligent than humans at a broad range of tasks.
“Everything with OpenAI always comes back to AGI,” says Roetzer.