CANVAS: Continuity-Aware Narratives via Visual Agentic Storyboarding

arXiv cs.CL / 4/16/2026

Key Points

  • The paper introduces CANVAS, a multi-agent framework for long-form visual storytelling that explicitly plans shot-to-shot continuity.
  • CANVAS targets common generative-model failures by enforcing character continuity, using persistent background anchors, and performing location-aware scene planning to reduce abrupt transitions.
  • The authors evaluate CANVAS on existing storyboard benchmarks ST-BENCH and ViStoryBench and propose a new, harder benchmark called HardContinuityBench focused on long-range narrative consistency.
  • Results show CANVAS outperforms the strongest baselines, with reported gains of 21.6% in background continuity, 9.6% in character consistency, and 7.6% in props consistency.

Abstract

Long-form visual storytelling requires maintaining continuity across shots, including consistent characters, stable environments, and smooth scene transitions. While existing generative models can produce strong individual frames, they fail to preserve such continuity, leading to appearance changes, inconsistent backgrounds, and abrupt scene shifts. We introduce CANVAS (Continuity-Aware Narratives via Visual Agentic Storyboarding), a multi-agent framework that explicitly plans visual continuity in multi-shot narratives. CANVAS enforces coherence through character continuity, persistent background anchors, and location-aware scene planning for smooth transitions within the same setting. We evaluate CANVAS on two storyboard generation benchmarks, ST-BENCH and ViStoryBench, and introduce a new challenging benchmark, HardContinuityBench, for long-range narrative consistency. CANVAS consistently outperforms the best-performing baseline, improving background continuity by 21.6%, character consistency by 9.6%, and props consistency by 7.6%.
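
The abstract does not describe how CANVAS is implemented, so the following is only an illustrative sketch of what "persistent background anchors" and "location-aware scene planning" could look like as bookkeeping. It is not the authors' method. Every name here (`Shot`, `ContinuityPlan`, `plan_continuity`) is hypothetical, and the assumption that the first shot at a location or with a character serves as the reference for later shots is ours, not the paper's.

```python
from dataclasses import dataclass, field


@dataclass
class Shot:
    """One storyboard shot (hypothetical structure, not from the paper)."""
    index: int
    location: str
    characters: list[str]


@dataclass
class ContinuityPlan:
    """Tracks which earlier shot each location or character should match."""
    # location name -> index of the first shot set there (assumed anchor rule)
    background_anchors: dict[str, int] = field(default_factory=dict)
    # character name -> index of the first shot the character appears in
    character_refs: dict[str, int] = field(default_factory=dict)


def plan_continuity(shots: list[Shot]) -> list[dict]:
    """Attach a background anchor and character references to every shot.

    A downstream image generator (not modeled here) could condition on
    these references so that repeated locations and characters stay
    visually consistent.
    """
    plan = ContinuityPlan()
    conditioned = []
    for shot in shots:
        # Reuse the existing anchor for this location, or register this shot.
        bg_anchor = plan.background_anchors.setdefault(shot.location, shot.index)
        # Same rule per character: the first appearance becomes the reference.
        char_refs = {
            name: plan.character_refs.setdefault(name, shot.index)
            for name in shot.characters
        }
        conditioned.append(
            {
                "shot": shot.index,
                "background_anchor": bg_anchor,
                "character_refs": char_refs,
                # True when this shot introduces a location for the first time.
                "new_location": bg_anchor == shot.index,
            }
        )
    return conditioned


if __name__ == "__main__":
    story = [
        Shot(0, "kitchen", ["Ana"]),
        Shot(1, "kitchen", ["Ana", "Ben"]),
        Shot(2, "garden", ["Ben"]),
        Shot(3, "kitchen", ["Ana"]),
    ]
    for entry in plan_continuity(story):
        print(entry)
```

In this sketch, a shot that returns to an earlier location after an intervening scene (shot 3 above) still points back to the original anchor, which is the kind of long-range consistency the HardContinuityBench description suggests the paper is targeting.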