From Codebooks to VLMs: Evaluating Automated Visual Discourse Analysis for Climate Change on Social Media
arXiv cs.CV / 4/24/2026
💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Models & Research
Key Points
- The paper proposes a framework for using computer vision and vision-language models to analyze climate-change discourse on social media images at scale.
- It benchmarks six promptable VLMs and 15 zero-shot CLIP-like models on two X (Twitter) datasets, covering five annotation dimensions such as climate actions, consequences, and image context.
- Gemini-3.1-flash-lite achieves the best overall performance across categories and both datasets, with relatively small performance gaps versus some moderately sized open-weight models.
- The authors argue that distribution-level evaluation can recover population trends even when per-image accuracy is only moderate, enabling scalable discourse analysis.
- Chain-of-thought prompting is reported to hurt performance, whereas prompts tailored to each annotation dimension improve results; the authors release tweet IDs, labels, and code for reproducibility.
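The distribution-level claim can be illustrated with a small simulation (a sketch only, not the paper's actual method or label scheme): a classifier that is right on only ~70% of individual images still yields an aggregate label distribution from which the population distribution can be recovered, here under the simplifying assumption that errors are uniform across the other classes so the confusion structure is known and invertible. The label names and accuracy value below are hypothetical.

```python
import random

random.seed(0)

# Hypothetical 3-way label scheme (stand-in for the paper's annotation
# dimensions) and the true population distribution we hope to recover.
labels = ["action", "consequence", "other"]
true_dist = [0.5, 0.3, 0.2]

# Simulated noisy classifier: correct with probability ACC, otherwise it
# picks one of the remaining labels uniformly at random.
ACC = 0.7

def classify(y):
    if random.random() < ACC:
        return y
    return random.choice([l for l in labels if l != y])

n = 100_000
truths = random.choices(labels, weights=true_dist, k=n)
preds = [classify(y) for y in truths]

# Per-image accuracy is only moderate...
acc = sum(t == p for t, p in zip(truths, preds)) / n

# ...and the raw predicted distribution is biased toward uniform:
# observed_i = offset + scale * true_i, with
#   offset = (1 - ACC) / (K - 1)   and   scale = ACC - offset   for K classes.
pred_dist = [preds.count(l) / n for l in labels]

# Inverting that known relationship corrects the aggregate estimate,
# even though individual predictions stay unreliable.
K = len(labels)
offset = (1 - ACC) / (K - 1)
corrected = [(o - offset) / (ACC - offset) for o in pred_dist]

print(f"per-image accuracy: {acc:.3f}")
print("corrected distribution:",
      {l: round(c, 3) for l, c in zip(labels, corrected)})
```

The corrected estimates land within sampling noise of the true shares, which is the intuition behind recovering population trends from moderately accurate per-image labels.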