DATAREEL: Automated Data-Driven Video Story Generation with Animations

arXiv cs.AI / 4/29/2026


Key Points

  • The paper introduces DataReel, a new benchmark (328 real-world stories) for evaluating automated data-driven video story generation that combines structured data, chart visuals, and narration transcripts.
  • It addresses a key gap in the field: the lack of rigorous benchmarks for evaluating models on animated, chart-centric video storytelling.
  • The authors propose a multi-agent framework that decomposes the task into planning, generation, and verification stages to better mirror human storytelling workflows.
  • Experiments indicate the multi-agent approach outperforms direct prompting baselines across both automatic and human evaluations, but it still struggles with coordinating animation, narration, and visual emphasis.
  • The benchmark and framework are released publicly on GitHub for use by researchers and developers.

Abstract

Data videos are a powerful medium for visual, data-driven storytelling, combining animated, chart-centric visualizations with synchronized narration. Widely used in journalism, education, and public communication, they help audiences understand complex data through clear and engaging visual explanations. Despite their growing impact, generating data-driven video stories remains challenging: it requires careful coordination of visual encoding, temporal progression, and narration, as well as substantial expertise in visualization design, animation, and video-editing tools. Recent advances in large language models offer new opportunities to automate this process; however, there is currently no benchmark for rigorously evaluating models on animated, visualization-based video storytelling. To address this gap, we introduce DataReel, a benchmark for automated data-driven video story generation comprising 328 real-world stories. Each story pairs structured data, a chart visualization, and a narration transcript, enabling systematic evaluation of models' abilities to generate animated data video stories. We further propose a multi-agent framework that decomposes the task into planning, generation, and verification stages, mirroring key aspects of the human storytelling process. Experiments show that this multi-agent approach outperforms direct prompting baselines under both automatic and human evaluations, while revealing persistent challenges in coordinating animation, narration, and visual emphasis. We release DataReel at https://github.com/vis-nlp/DataReel.
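
The abstract describes the planning, generation, and verification stages only at a high level, so the sketch below is purely illustrative: the agent prompts, the `call_llm` placeholder, and the `StoryInputs` fields are assumptions and are not taken from the DataReel paper or repository. It only shows one way such a staged pipeline might be wired together, with the verifier's feedback looped back into generation.

```python
# Hypothetical sketch of a planning -> generation -> verification pipeline.
# Prompts, agent roles, and `call_llm` are illustrative assumptions, not the
# actual DataReel implementation.
from dataclasses import dataclass


@dataclass
class StoryInputs:
    table_csv: str   # structured data backing the story
    chart_spec: str  # description or spec of the chart visualization
    transcript: str  # narration transcript to align with


def call_llm(prompt: str) -> str:
    """Placeholder for whatever chat-completion API the system would use."""
    raise NotImplementedError


def plan(inputs: StoryInputs) -> str:
    # Planning agent: outline scenes, the data shown in each, and narration beats.
    return call_llm(
        "Plan an animated data video as a numbered list of scenes.\n"
        f"Data:\n{inputs.table_csv}\nChart:\n{inputs.chart_spec}\n"
        f"Narration:\n{inputs.transcript}"
    )


def generate(plan_text: str, inputs: StoryInputs) -> str:
    # Generation agent: turn the scene plan into a concrete animation spec or code.
    return call_llm(
        "Write animation code that realizes this scene plan, keeping the chart "
        f"and narration in sync.\nPlan:\n{plan_text}\nChart:\n{inputs.chart_spec}"
    )


def verify(artifact: str, plan_text: str) -> tuple[bool, str]:
    # Verification agent: check the generated artifact against the plan.
    report = call_llm(
        "Check whether this output follows the plan; list problems or reply OK.\n"
        f"Plan:\n{plan_text}\nOutput:\n{artifact}"
    )
    return report.strip().upper().startswith("OK"), report


def run_pipeline(inputs: StoryInputs, max_rounds: int = 2) -> str:
    plan_text = plan(inputs)
    artifact = generate(plan_text, inputs)
    for _ in range(max_rounds):
        ok, report = verify(artifact, plan_text)
        if ok:
            break
        # Feed the verifier's issues back into the generation stage.
        artifact = generate(plan_text + "\nFix these issues:\n" + report, inputs)
    return artifact
```

Under this reading, each stage is a separate LLM call with its own prompt, which is one common way to mirror how a human author drafts an outline, produces the animation, and then reviews it before publishing.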