Sora, OpenAI’s new text-to-video tool, is causing both excitement and fear.

Here’s what we know about it

The maker of ChatGPT is now diving into the world of video created by artificial intelligence (AI). The latest Sora-generated video is a remarkable first-person view (FPV) drone-style sequence showing “what TED Talks might look like in 40 years,” according to OpenAI. TED Talks worked with OpenAI and the filmmaker Paul Trillo to create the 90-second clip, which, remember, was generated entirely by AI …


Sora isn’t the first to demonstrate this kind of technology. However, industry analysts point to the high quality of the videos the tool has produced so far and note that its introduction marks a significant leap for OpenAI and for text-to-video generation overall. Source: EuroNews.next

OpenAI CEO Sam Altman also took to X, the platform formerly known as Twitter, to ask social media users to send in prompt ideas.

He later shared realistically detailed videos that responded to prompts like “two golden retrievers podcasting on top of a mountain” and “a bicycle race on ocean with different animals as athletes riding the bicycles with drone camera view”.

Fred Havemeyer, head of US AI and software research at Macquarie, said Sora’s launch marks a big step forward for the industry.

“Not only can you do longer videos, I understand up to 60 seconds, but also the videos being created look more normal and seem to actually respect physics and the real world more,” Havemeyer said.

“You’re not getting as many ‘uncanny valley’ videos or fragments on the video feeds that look… unnatural.”

Prompt: Historical footage of California during the gold rush.

Prompt: The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope, dust kicks up from it’s tires, the sunlight shines on the SUV as it speeds along the dirt road, casting a warm glow over the scene. The dirt road curves gently into the distance, with no other cars or vehicles in sight. The trees on either side of the road are redwoods, with patches of greenery scattered throughout. The car is seen from the rear following the curve with ease, making it seem as if it is on a rugged drive through the rugged terrain. The dirt road itself is surrounded by steep hills and mountains, with a clear blue sky above with wispy clouds.

The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video that accurately persist characters and visual style.

The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.

The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.

Safety

We’ll be taking several important safety steps ahead of making Sora available in OpenAI’s products. We are working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who will be adversarially testing the model.

We’ll be engaging policymakers, educators and artists around the world to understand their concerns and to identify positive use cases for this new technology. Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it. That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time. Source: Sora (openai.com)

I would love to hear from you.