Script-to-Slide Grounding: Grounding Script Sentences to Slide Objects for Automatic Instructional Video Generation
arXiv cs.CV / 3/19/2026
Key Points
- The paper defines Script-to-Slide Grounding (S2SG) as the task of grounding script sentences to their corresponding slide objects to enable automated instructional video generation.
- It proposes Text-S2SG, a method that leverages a large language model (LLM) to ground text objects within slides.
- Experiments report an F1-score of 0.924 for the proposed method.
- By formalizing the slide-based video editing process as a computable task, the work aims to pave the way for automated educational video creation.
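
To make the task and the reported metric concrete, here is a minimal sketch of how a grounding could be represented and scored with F1. It is an illustration only: the pair-based representation, the `grounding_f1` helper, and the toy slide data are assumptions, not the paper's data format, method, or evaluation protocol.

```python
# Illustrative sketch only: the representation and scoring below are assumptions,
# not the paper's actual data format or evaluation protocol.

# A grounding is modeled as a set of (sentence index, slide object id) pairs.
Pair = tuple[int, str]


def grounding_f1(predicted: set[Pair], gold: set[Pair]) -> float:
    """Pair-level F1 between predicted and gold groundings."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy slide with three text objects and a three-sentence script.
slide_objects = {
    "title": "Gradient Descent",
    "bullet1": "Start from an initial guess",
    "bullet2": "Step against the gradient",
}
script = [
    "Today we look at gradient descent.",
    "We begin with an initial guess for the parameters.",
    "Then we repeatedly step in the direction opposite to the gradient.",
]

# Gold grounding: each sentence points at the object it talks about.
gold = {(0, "title"), (1, "bullet1"), (2, "bullet2")}

# A hypothetical system output that gets the last sentence wrong.
predicted = {(0, "title"), (1, "bullet1"), (2, "bullet1")}

score = grounding_f1(predicted, gold)
print(f"F1 = {score:.3f}")
```

In this toy case two of the three predicted pairs match the gold grounding, so precision and recall are both 2/3. Whether the paper scores individual pairs, whole sentences, or something else is not stated in this summary.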