LIVE: Leveraging Image Manipulation Priors for Instruction-based Video Editing
arXiv cs.CV / 4/21/2026
Key Points
- The paper introduces LIVE, a joint training framework that combines large-scale, high-quality image editing data with video datasets to improve instruction-based video editing, sidestepping the high cost of annotating video editing data.
- To bridge the mismatch between static images and dynamic video, LIVE applies a frame-wise token noise strategy and leverages pretrained video generative models to produce plausible temporal changes.
- It uses dataset cleaning plus an automated data pipeline and a two-stage training approach to gradually “anneal” video-editing abilities.
- The authors build a new evaluation benchmark of 60+ difficult tasks that are common in image editing but underrepresented in existing video datasets, and report state-of-the-art results through comparisons and ablations.
- The source code is planned to be publicly released, enabling further research and replication.
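The frame-wise token noise idea in the second bullet can be sketched as follows: a static image's latent tokens are replicated across frames, and each frame is perturbed with independent noise so the resulting "clip" exhibits frame-to-frame variation for the video model to learn from. This is a hedged, minimal illustration; the function name, tensor shapes, and noise scale are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def framewise_token_noise(image_tokens, num_frames, rng, sigma=0.1):
    """Turn one image's token grid into a pseudo-video clip.

    image_tokens: (num_tokens, dim) latent tokens of a single image.
    Returns: (num_frames, num_tokens, dim) with independent per-frame noise,
    so identical frames are no longer exactly identical.
    (Illustrative sketch only; sigma and shapes are assumptions.)
    """
    # Replicate the static image across the temporal axis.
    video = np.repeat(image_tokens[None, :, :], num_frames, axis=0)
    # Add independent Gaussian noise per frame to mimic temporal variation.
    noise = rng.normal(0.0, sigma, size=video.shape)
    return video + noise

# Example: a 16-frame pseudo-clip from a 256-token image latent.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(256, 64))
clip = framewise_token_noise(tokens, num_frames=16, rng=rng)
```

Because each frame receives its own noise sample, consecutive frames differ slightly, which is the property a pretrained video model needs in order to treat the augmented image data as temporally plausible input.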