London-based AI lab Odyssey has launched a research preview of a model that transforms video into interactive worlds. Initially focused on world models for film and game production, the Odyssey team has stumbled onto what may be an entirely new entertainment medium.
The interactive video generated by Odyssey’s AI model responds to input in real time. You can interact with it using a keyboard, phone, controller, or eventually voice commands. The people at Odyssey are billing it as an “early version of the Holodeck.”
The underlying AI can generate a realistic video frame every 40 milliseconds. That means when you press a button or make a gesture, the video responds almost instantly, creating the illusion that you are actually affecting this digital world.
“Today’s experience feels like exploring a glitchy dream. It’s raw, unstable, but definitely new,” Odyssey says. We are not talking about polished, AAA-game-quality visuals here, at least not yet.
Not standard video technology
Let’s get a little technical. What makes this AI-generated interactive video technology different from, say, a standard video game or CGI? It all comes down to what Odyssey calls a “world model.”
Unlike traditional video models that generate an entire clip at once, world models work frame by frame, predicting what should come next based on the current state and any user input. It is similar to how large language models predict the next word in a sequence, but far more complicated, because the outputs are high-resolution video frames rather than words.
“A world model is, at its core, an action-conditioned dynamics model,” says Odyssey. Each time you interact, the model takes the current state, your action, and the history of what has happened, and generates the next video frame accordingly.
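To make that loop concrete, here is a minimal Python sketch of an action-conditioned rollout. It is not Odyssey’s model: the “frame” is a tiny grid with one bright pixel, and `predict_next_frame` is a hypothetical hand-written rule standing in for a learned network. All names here (`make_frame`, `predict_next_frame`, `rollout`, the `context` size) are assumptions made for illustration. What it does show is the shape of the interaction: current state, action, and history go in, one new frame comes out, and that frame becomes the next state.

```python
from collections import deque

SIZE = 5  # toy frame is a SIZE x SIZE grid
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def make_frame(row, col):
    """Render a toy frame with a single bright pixel at (row, col)."""
    return tuple(
        tuple(1 if (r, c) == (row, col) else 0 for c in range(SIZE))
        for r in range(SIZE)
    )


def find_pixel(frame):
    """Return the (row, col) of the bright pixel in a toy frame."""
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value:
                return r, c
    raise ValueError("frame has no bright pixel")


def predict_next_frame(frame, action, history):
    """Hypothetical stand-in for a learned model: next frame from state, action, history."""
    move = MOVES.get(action)
    if move is None:
        # No recognised input this tick: continue the most recent movement in the history.
        past_moves = [MOVES[a] for _, a in history if a in MOVES]
        move = past_moves[-1] if past_moves else (0, 0)
    r, c = find_pixel(frame)
    r = min(max(r + move[0], 0), SIZE - 1)  # clamp to the frame edges
    c = min(max(c + move[1], 0), SIZE - 1)
    return make_frame(r, c)


def rollout(start_frame, actions, context=8):
    """Generate one frame per action, feeding each output back in as the new state."""
    history = deque(maxlen=context)  # bounded window of (frame, action) pairs
    frame = start_frame
    frames = [frame]
    for action in actions:
        next_frame = predict_next_frame(frame, action, history)
        history.append((frame, action))
        frame = next_frame
        frames.append(frame)
    return frames


frames = rollout(make_frame(2, 2), ["right", "none", "up"])
print(find_pixel(frames[-1]))
```

A real world model would replace the hand-written rule with a neural network and the grid with high-resolution video, but the loop around it would look broadly similar.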
The result feels more organic and unpredictable than a traditional game. There is no pre-programmed logic saying “If the player does X, Y happens.” Instead, the AI is making its best guess at what should happen next, based on what it has learned from watching countless videos.
Odyssey tackles historic challenges with AI-generated video
Building something like this is no walk in the park. One of the biggest hurdles with AI-generated interactive video is keeping it stable over time. When each frame is generated from the previous ones, small errors compound quickly (a phenomenon AI researchers call “drift”).
To tackle this, Odyssey has used what it calls a “narrow distribution model.” Essentially, the AI is pre-trained on general video footage and then fine-tuned on a smaller set of environments. This trade-off means less variety but greater stability, so it does not all dissolve into a strange mess.
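The way small errors compound in frame-by-frame generation can be illustrated with a toy simulation. The sketch below is not based on Odyssey’s system: the “world” is a single number that should never change, and `model_step` is a hypothetical learned model that overshoots by an assumed 1% on every step. It compares the error when each prediction starts from the true state with the error when the model is fed its own previous output, which is the situation an interactive world model is in.

```python
def true_step(x):
    """Ground-truth dynamics for the toy world: the state never changes."""
    return x


def model_step(x, bias=0.01):
    """Hypothetical learned model: nearly right, but 1% too large on each step."""
    return x * (1.0 + bias)


def one_step_errors(x0, steps):
    """Error when every prediction starts from the true previous state."""
    errors, x_true = [], x0
    for _ in range(steps):
        errors.append(abs(model_step(x_true) - true_step(x_true)))
        x_true = true_step(x_true)
    return errors


def rollout_errors(x0, steps):
    """Error when each prediction is fed back in as the next input."""
    errors, x_true, x_pred = [], x0, x0
    for _ in range(steps):
        x_true = true_step(x_true)
        x_pred = model_step(x_pred)
        errors.append(abs(x_pred - x_true))
    return errors


one_step = one_step_errors(1.0, 100)
rolled = rollout_errors(1.0, 100)
print(f"one-step error at step 100: {one_step[-1]:.4f}")
print(f"rollout error at step 100:  {rolled[-1]:.4f}")
```

In this toy setup the one-step error stays at about 0.01 throughout, while the fed-back error grows with every step. Restricting the model to a narrower set of environments is one way to keep the per-step error, and therefore the accumulated error, smaller.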
The company says it is already making “fast progress” on its next-generation model.
Running all of this flashy AI technology in real time is not cheap. Currently, the infrastructure powering this experience relies on clusters of H100 GPUs spread across the US and EU, and costs between £0.80 and £1.60 ($1-2) per user-hour.
That may sound expensive for streaming video, but it is very cheap compared to producing traditional game or film content. Odyssey also expects these costs to fall further as its models become more efficient.
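For a rough sense of scale, the figures quoted above can be combined into a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that a frame is generated every 40 milliseconds for the whole hour and that the quoted per-user-hour cost is spread evenly across those frames. Neither assumption comes from Odyssey, and the function names are made up for this example.

```python
FRAME_INTERVAL_MS = 40                    # one frame every 40 ms, as quoted above
COST_PER_USER_HOUR_GBP = (0.80, 1.60)     # quoted range, in pounds


def frames_per_second(interval_ms):
    """Frame rate implied by a fixed interval between frames."""
    return 1000 / interval_ms


def frames_per_hour(interval_ms):
    """Frames generated in one hour at a constant frame interval (an assumption)."""
    return 3600 * frames_per_second(interval_ms)


def cost_per_frame(cost_per_hour, interval_ms):
    """Hourly cost spread evenly over every frame generated in that hour."""
    return cost_per_hour / frames_per_hour(interval_ms)


low, high = (cost_per_frame(c, FRAME_INTERVAL_MS) for c in COST_PER_USER_HOUR_GBP)
print(f"{frames_per_second(FRAME_INTERVAL_MS):.0f} frames per second")
print(f"{frames_per_hour(FRAME_INTERVAL_MS):.0f} frames per user-hour")
print(f"roughly £{low:.7f} to £{high:.7f} per frame")
```

Under those assumptions, 40 milliseconds per frame works out to 25 frames per second, or 90,000 frames per user-hour, so each individual frame costs a tiny fraction of a penny.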
Interactive video: The next storytelling medium?
Throughout history, new technologies have given rise to new forms of storytelling, from cave paintings to books, photography, radio, film, and video games. Odyssey believes AI-generated interactive video is the next step in this evolution.
If they’re right, we might be looking at a prototype of something that will change entertainment, education, advertising, and more. Imagine training videos where you can practice the skills being taught, or travel experiences where you can explore destinations from your sofa.
The research preview available now is clearly only a small step towards this vision, and is more of a proof of concept than a finished product. But it is an interesting glimpse of what might be possible when AI-generated worlds become interactive playgrounds rather than just passive experiences.
You can try out the research preview here.