Co-Director: Agentic Generative Video Storytelling
arXiv cs.AI / 4/29/2026
Key Points
- The paper introduces Co-Director, a hierarchical multi-agent framework that treats generative video storytelling as a global optimization problem rather than a chain of loosely connected prompting steps.
- To maintain semantic coherence across generated frames and scenes, it combines a multi-armed bandit for global creative-direction exploration with a local multimodal self-refinement loop to reduce identity drift and improve sequence-level consistency.
- The authors report that Co-Director significantly outperforms existing agentic pipeline baselines, addressing issues like semantic drift and cascading failures common in independent, handcrafted prompting.
- For evaluation, they release GenAD-Bench, a benchmark of 400 scenarios built around fictional products and aimed at personalized advertising use cases.
- The work claims the approach generalizes beyond the tested settings, aiming to support broader cinematic narrative generation.
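
To make the second key point concrete, here is a minimal Python sketch of how a global bandit over creative directions could wrap a local refinement loop. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say which bandit algorithm Co-Director uses, so UCB1 is swapped in here, and the direction names, quality scores, and the functions `generate_draft`, `refine`, `ucb1_pick`, and `run` are all hypothetical stand-ins for a real video generator and multimodal critic.

```python
import math
import random

# Hypothetical creative directions with made-up mean draft quality in [0, 1].
DIRECTIONS = {"humorous": 0.3, "minimalist": 0.5, "cinematic": 0.8}


def generate_draft(direction, rng):
    """Stand-in for a video generator: returns a noisy quality score."""
    noisy = DIRECTIONS[direction] + rng.uniform(-0.1, 0.1)
    return min(1.0, max(0.0, noisy))


def refine(score, threshold=0.9, max_rounds=2, gain=0.05):
    """Local loop: 'critique and regenerate' until good enough or out of budget."""
    rounds = 0
    while score < threshold and rounds < max_rounds:
        score = min(1.0, score + gain)  # placeholder for a real critique step
        rounds += 1
    return score


def ucb1_pick(counts, totals, step):
    """UCB1 (an assumed choice): try every arm once, then maximize mean + bonus."""
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm

    def upper_bound(arm):
        mean = totals[arm] / counts[arm]
        return mean + math.sqrt(2.0 * math.log(step) / counts[arm])

    return max(range(len(counts)), key=upper_bound)


def run(steps=500, seed=0):
    """Global loop: pick a direction, generate, refine locally, record the reward."""
    rng = random.Random(seed)
    names = list(DIRECTIONS)
    counts = [0] * len(names)
    totals = [0.0] * len(names)
    for step in range(1, steps + 1):
        arm = ucb1_pick(counts, totals, step)
        reward = refine(generate_draft(names[arm], rng))
        counts[arm] += 1
        totals[arm] += reward
    most_pulled = max(range(len(names)), key=counts.__getitem__)
    return names[most_pulled], dict(zip(names, counts))


if __name__ == "__main__":
    best, pulls = run()
    print("most explored direction:", best)
    print("pulls per direction:", pulls)
```

The point of the split is that the bandit only sees the refined score, so a direction is judged by what it can become after local repair rather than by its first draft. In a real system the reward would come from a multimodal evaluator instead of the synthetic numbers used here.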


