Netflix – yes, Netflix – jumps on the AI bandwagon with video editor
Video-language model revises how objects interact when things get removed from a scene
A new Netflix model promises to rewrite the way we make movies. Just imagine this. As the director of the multimillion-dollar epic Car Crash III: Suddenest Impact, you've just finished filming the finale where your star, Cruz Control, drives straight into an onrushing semi.
The collision is spectacular. Cruz's car – operated remotely – explodes on impact, scattering debris across the highway. It's glorious. You high-five Cruz, moping beside you at the camera monitor station as his lucrative franchise career concludes, and head to the craft services truck.
Your producer, Maya Cash, grabs you by the shoulder. "You're not going to want to hear this," she says. "But what if Cruz just drives away into the sunset? What if he doesn't die after all?"
You pause and look at her over the rims of your Balenciaga sunglasses. "They're going to fund number four after all?"
Netflix's VOID model was made for that moment. Instead of reshooting the scene or redoing it entirely with computer graphics, you can just transform the crash footage into an open road denouement.
VOID stands for Video Object and Interaction Deletion. It's a VLM (vision-language model) that can not only erase objects from a scene but can also inpaint how remaining objects in the scene should behave without the influence of whatever was excised.
It can turn, for example, a head-on collision between two vehicles into a scene of a single vehicle driving down the road by removing one and generating video depicting the physically plausible path of the remaining vehicle. Post-impact debris, smoke, and flames – all erased and replaced with pristine pavement.
The video model's creators – Saman Motamed (Netflix/Sofia University), William Harvey (Netflix), Benjamin Klein (Netflix), Luc Van Gool (Sofia University), Zhuoning Yuan (Netflix), and Ta-Ying Cheng (Netflix) – describe VOID in a preprint paper as "a video object removal framework designed to perform physically-plausible inpainting in these complex scenarios."
The same counterfactual reasoning applies beyond car crashes. Given a scene of a person jumping into a pool and splashing water on the ground, VOID could remove that person and generate video in which the pool appears undisturbed, with no splash in the pool or on the ground.
VOID isn't limited to Netflix productions. The company has made its model available on Hugging Face, where anyone can download it.
There are other tools for altering video, such as Runway, Generative Omnimatte, DiffuEraser, ROSE, MiniMax-Remover, and ProPainter. The Netflix boffins, however, claim VOID outperforms these alternatives substantially. Based on a survey of 25 people across multiple scenarios, VOID was preferred 64.8 percent of the time, with Runway coming in a distant second at 18.4 percent.
"Through extensive evaluations against inpainting and text-guided video model baselines on synthetic and real-world data, we show that VOID excels at modeling complex dynamics which can follow on from object removal," the authors claim.
Whether the world really needs more convincing video manipulation is another question. ®