OpenAI Sora: One Step Away From The Matrix

The best text-to-video AI model is also… a world simulator?

Alberto Romero
7 min read · Feb 16, 2024

This article is a selection from The Algorithmic Bridge, an educational project to bridge the gap between AI and people.

Yesterday, OpenAI announced the most important AI model of 2024 so far: Sora, a state-of-the-art (SOTA) text-to-video model that can generate high-quality, high-fidelity one-minute videos at different aspect ratios and resolutions. Calling it SOTA is an understatement; Sora is miles ahead of anything else in the space. It’s general, it’s scalable, and it’s also… a world simulator?

Quick digression: Sorry, Google, Gemini 1.5 was the most important release yesterday (and perhaps of 2024), but OpenAI didn’t want to give you a single ounce of the spotlight. If Jimmy Apples is to be believed, OpenAI has had Sora ready since March (what?), which would explain why it manages to be so timely in disrupting competitors’ PR moves. I’ll do a write-up about Gemini 1.5 anyway because, although it flew under the radar, we shouldn’t ignore a 10M-token context window breakthrough.

Back to Sora. This two-part article is intended for those of you who know nothing about this AI model. It’s also for those of you who watched the cascade of generated videos that flooded the X timeline but didn’t bother to read…
