A Multi-Agent Feedback System for Detecting and Describing News Events in Satellite Imagery
arXiv cs.CV / 4/15/2026
Key Points
- The paper argues that although bi-temporal change captioning datasets exist, multi-temporal satellite event captioning datasets (at least two images per sequence) are lacking, largely because searching for events and labeling image sequences is costly.
- It introduces SkyScraper, an iterative multi-agent workflow that geocodes news articles and then synthesizes captions for matching multi-temporal satellite imagery.
- Experiments indicate SkyScraper can find about 5× more events than traditional geocoding methods, suggesting that agentic feedback helps surface relevant new events.
- The authors apply the system to a large corpus of global news and curate a new dataset with 5,000 multi-temporal captioning sequences.
- The work positions automated imagery-event linkage and captioning as a support tool for journalism and reporting by identifying relevant satellite evidence for news events.
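
To make the idea of agentic feedback concrete, here is a minimal Python sketch of a propose-check-retry loop. It is not the authors' implementation: the summary does not describe SkyScraper's agents, prompts, or data sources, so every name here (`locate_event`, `propose`, `check`, `GAZETTEER`, `VISIBLE`) and all place names and coordinates are invented for illustration. The "agents" are plain functions standing in for whatever models the real system uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    place: str
    lat: float
    lon: float


@dataclass
class Result:
    candidate: Optional[Candidate]
    rounds: int
    feedback: List[str] = field(default_factory=list)


def locate_event(
    article: str,
    propose: Callable[[str, List[str]], Optional[Candidate]],
    check: Callable[[str, Candidate], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Result:
    """Propose a location, check it, and feed rejections back into the next proposal."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        candidate = propose(article, feedback)
        if candidate is None:  # nothing left to try
            return Result(None, round_no, feedback)
        ok, critique = check(article, candidate)
        if ok:
            return Result(candidate, round_no, feedback)
        feedback.append(critique)  # the next proposal sees this rejection
    return Result(None, max_rounds, feedback)


# Toy stand-ins (all hypothetical): a tiny gazetteer with made-up coordinates,
# and a set of places where the event is assumed to be visible in imagery.
GAZETTEER = {"Northtown": (10.0, 20.0), "Harbor Bay": (11.5, 21.5)}
VISIBLE = {"Harbor Bay"}


def propose(article: str, feedback: List[str]) -> Optional[Candidate]:
    """Pick the first gazetteer place named in the article that was not rejected."""
    for name, (lat, lon) in GAZETTEER.items():
        if name in article and name not in feedback:
            return Candidate(name, lat, lon)
    return None


def check(article: str, candidate: Candidate) -> Tuple[bool, str]:
    """Stand-in for an imagery check; the critique is the rejected place name."""
    return candidate.place in VISIBLE, candidate.place


article = "Officials in Northtown said flooding damaged the port at Harbor Bay."
result = locate_event(article, propose, check)
single_pass = locate_event(article, propose, check, max_rounds=1)
```

In this toy setup, the single-pass call stops at the first place name in the article and finds nothing, while the feedback loop rejects that location and succeeds on the second round. That mirrors, in a very simplified way, why an iterative workflow might surface more events than one-shot geocoding.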