Co-Director: Agentic Generative Video Storytelling

arXiv cs.AI / 4/29/2026


Key Points

  • The paper introduces Co-Director, a hierarchical multi-agent framework that treats generative video storytelling as a global optimization problem rather than a chain of loosely connected prompting steps.
  • To maintain semantic coherence across generated frames and scenes, it combines a multi-armed bandit for global creative-direction exploration with a local multimodal self-refinement loop to reduce identity drift and improve sequence-level consistency.
  • The authors report that Co-Director significantly outperforms existing agentic pipeline baselines, addressing issues like semantic drift and cascading failures common in independent, handcrafted prompting.
  • For evaluation, they release GenAD-Bench, a 400-scenario dataset featuring fictional products intended for personalized advertising use cases.
  • The authors claim the approach generalizes beyond the advertising scenarios they tested to broader cinematic narrative generation.
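To make the global exploration step in the key points concrete, here is a minimal sketch of a multi-armed bandit choosing among creative directions. It uses the standard UCB1 rule, which is an assumption: the summary does not say which bandit algorithm the paper uses. The direction names, the `TRUE_QUALITY` table, and `simulated_reward` are hypothetical stand-ins for scoring a generated video.

```python
import math
import random


def ucb1_select(counts, totals, t, c=2.0):
    """Return the index of the arm to pull next under the UCB1 rule."""
    # Pull every arm once before any confidence bound is computed.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Mean reward plus an exploration bonus that shrinks as an arm is pulled more.
    scores = [
        totals[a] / counts[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(directions, reward_fn, rounds):
    """Explore creative directions; return the best one by mean reward and the pull counts."""
    counts = [0] * len(directions)
    totals = [0.0] * len(directions)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        counts[arm] += 1
        totals[arm] += reward_fn(directions[arm])
    means = [totals[a] / counts[a] if counts[a] else 0.0 for a in range(len(directions))]
    best = max(range(len(directions)), key=means.__getitem__)
    return directions[best], counts


# Hypothetical creative directions with hidden success probabilities (illustration only).
TRUE_QUALITY = {"humorous": 0.3, "cinematic": 0.8, "documentary": 0.5}
rng = random.Random(0)


def simulated_reward(direction):
    # Stand-in for a real evaluator: a Bernoulli draw from the hidden quality.
    return 1.0 if rng.random() < TRUE_QUALITY[direction] else 0.0


best, pulls = run_bandit(list(TRUE_QUALITY), simulated_reward, rounds=300)
print(best, pulls)
```

In a real pipeline the reward would come from an evaluator scoring the finished video, and each pull would be far more expensive than in this toy, which is the usual motivation for a sample-efficient rule such as UCB1.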

Abstract

While diffusion models generate high-fidelity video clips, transforming them into coherent storytelling engines remains challenging. Current agentic pipelines automate this via chained modules but suffer from semantic drift and cascading failures due to independent, handcrafted prompting. We present Co-Director, a hierarchical multi-agent framework formalizing video storytelling as a global optimization problem. To ensure semantic coherence, we introduce hierarchical parameterization: a multi-armed bandit globally identifies promising creative directions, while a local multimodal self-refinement loop mitigates identity drift and ensures sequence-level consistency. This balances the exploration of novel narrative strategies with the exploitation of effective creative configurations. For evaluation, we introduce GenAD-Bench, a 400-scenario dataset of fictional products for personalized advertising. Experiments demonstrate that Co-Director significantly outperforms state-of-the-art baselines, offering a principled approach that seamlessly generalizes to broader cinematic narratives. Project Page: https://co-director-agent.github.io/
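The local self-refinement loop described in the abstract can be pictured as a critique-and-revise cycle that stops once a shot is consistent enough. The sketch below is illustrative only and is not the authors' implementation: the `Shot` type, the `consistency` score, the threshold, and the toy critic and reviser are all hypothetical placeholders for a multimodal critic and a video generator.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    prompt: str
    consistency: float  # hypothetical identity-consistency score in [0, 1]


def refine_shot(shot, critique_fn, revise_fn, threshold=0.9, max_iters=4):
    """Critique and revise a shot until it meets the threshold or the budget runs out."""
    history = [shot]
    for _ in range(max_iters):
        if shot.consistency >= threshold:
            break  # consistent enough; stop refining
        feedback = critique_fn(shot)
        shot = revise_fn(shot, feedback)
        history.append(shot)
    return shot, history


def toy_critique(shot):
    # Stand-in for a multimodal critic comparing the shot with a reference image.
    return "keep the product label and color identical to the reference"


def toy_revise(shot, feedback):
    # Stand-in for regeneration: each revision closes half of the remaining gap.
    improved = shot.consistency + 0.5 * (1.0 - shot.consistency)
    return Shot(prompt=f"{shot.prompt} | {feedback}", consistency=improved)


final, history = refine_shot(Shot("hero shot of the bottle", 0.4), toy_critique, toy_revise)
print(len(history), final.consistency)
```

Capping the loop with `max_iters` is a design choice for the sketch: it bounds generation cost when the critic never becomes satisfied.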