WorldMark: A Unified Benchmark Suite for Interactive Video World Models
arXiv cs.CV / 4/24/2026
Key Points
- The paper introduces WorldMark, a unified benchmark suite designed to enable fair cross-model comparisons for interactive image-to-video world models by using standardized scenes, trajectories, and a common control interface.
- It includes a shared action-mapping layer that translates a WASD-style action vocabulary into each model’s native controls, allowing apples-to-apples evaluation across six major models.
- WorldMark provides a hierarchical set of 500 test cases spanning first/third-person views, photorealistic and stylized scenes, and three difficulty tiers (Easy to Hard) with 20–60 second sequences.
- The accompanying modular evaluation toolkit measures Visual Quality, Control Alignment, and World Consistency. The authors plan to release all data, evaluation code, and model outputs.
- The authors also launch World Model Arena (warena.ai), which hosts live, side-by-side online battles between models and a public leaderboard.
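The shared action-mapping layer described above might look roughly like the following Python sketch. This is an illustration only, not the authors' implementation: the model names (`keyboard_model`, `velocity_model`), the native control formats, and the function `map_trajectory` are all hypothetical, and the four-key action set is an assumption based on the "WASD-style" description.

```python
from typing import Callable, Dict, List

# Shared WASD-style vocabulary. The exact action set used by WorldMark
# is not given in this summary; these four keys are an assumption.
SHARED_ACTIONS = ("W", "A", "S", "D")


def to_keyboard_model(action: str) -> dict:
    """Hypothetical adapter for a model that takes named key presses."""
    keys = {"W": "forward", "A": "left", "S": "backward", "D": "right"}
    return {"key": keys[action]}


def to_velocity_model(action: str) -> dict:
    """Hypothetical adapter for a model that takes planar velocities."""
    velocities = {
        "W": (0.0, 1.0),
        "A": (-1.0, 0.0),
        "S": (0.0, -1.0),
        "D": (1.0, 0.0),
    }
    vx, vz = velocities[action]
    return {"vx": vx, "vz": vz}


# Registry of per-model adapters (both model names are made up).
ADAPTERS: Dict[str, Callable[[str], dict]] = {
    "keyboard_model": to_keyboard_model,
    "velocity_model": to_velocity_model,
}


def map_trajectory(model_name: str, trajectory: List[str]) -> List[dict]:
    """Translate a shared-vocabulary trajectory into one model's native controls."""
    unknown = [a for a in trajectory if a not in SHARED_ACTIONS]
    if unknown:
        raise ValueError(f"actions outside the shared vocabulary: {unknown}")
    adapter = ADAPTERS[model_name]
    return [adapter(a) for a in trajectory]


if __name__ == "__main__":
    # The same trajectory is replayed against both hypothetical models.
    shared = ["W", "W", "D"]
    for name in ADAPTERS:
        print(name, map_trajectory(name, shared))
```

Keeping the trajectory in the shared vocabulary and translating only at the model boundary is what lets every model be driven by identical inputs, which is the property a fair cross-model comparison depends on.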