OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning

arXiv cs.CV / 3/26/2026


Key Points

  • The paper proposes OmniWeaving, an open-source, omni-level video generation model that aims to unify multiple tasks through free-form multimodal composition and reasoning over inputs such as text, multiple images, and video.
  • OmniWeaving is trained on a large-scale pretraining dataset built to cover compositional and reasoning-augmented scenarios, enabling the model to temporally bind interleaved multimodal signals into coherent video outputs.
  • The authors position the model as an “intelligent agent” that infers complex user intentions to support more sophisticated video creation workflows.
  • They introduce IntelligentVBench, a new benchmark intended to rigorously evaluate next-level intelligent unified video generation performance.
  • The authors report state-of-the-art results among open-source unified video generation models, with the code and model planned for public release.

Abstract

While proprietary systems such as Seedance-2.0 have achieved remarkable success in omni-capable video generation, open-source alternatives significantly lag behind. Most academic models remain heavily fragmented, and the few existing efforts toward unified video generation still struggle to seamlessly integrate diverse tasks within a single framework. To bridge this gap, we propose OmniWeaving, an omni-level video generation model featuring powerful multimodal composition and reasoning-informed capabilities. By leveraging a massive-scale pretraining dataset that encompasses diverse compositional and reasoning-augmented scenarios, OmniWeaving learns to temporally bind interleaved text, multi-image, and video inputs while acting as an intelligent agent to infer complex user intentions for sophisticated video creation. Furthermore, we introduce IntelligentVBench, the first comprehensive benchmark designed to rigorously assess next-level intelligent unified video generation. Extensive experiments demonstrate that OmniWeaving achieves SoTA performance among open-source unified models. The codes and model will be made publicly available soon. Project Page: https://omniweaving.github.io.