PaveBench: A Versatile Benchmark for Pavement Distress Perception and Interactive Vision-Language Analysis
arXiv cs.CV / 4/6/2026
Key Points
- The paper introduces PaveBench, a large-scale benchmark that extends pavement distress perception beyond standard computer vision tasks to cover quantitative analysis and explanation.
- PaveBench unifies four tasks (classification, object detection, semantic segmentation, and vision-language question answering) under standardized task definitions and evaluation protocols.
- It provides real-world highway inspection images with extensive visual annotations, plus a curated hard-distractor subset to assess robustness against confusing cases.
- The work also proposes PaveVQA, a real-image vision-language QA dataset that supports single-turn and multi-turn, expert-corrected interactions for recognition, localization, quantitative estimation, and maintenance reasoning.
- The authors evaluate state-of-the-art approaches and present a simple agent-augmented VQA framework that uses domain-specific models as tools alongside vision-language models. The datasets are released on Hugging Face.
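
The agent-augmented idea in the last point can be illustrated with a minimal sketch. It is not the authors' implementation: every name here (`ToolResult`, `distress_area_ratio`, `area_tool`, `build_prompt`, `answer`) is hypothetical, the segmentation mask is assumed to be already available as a binary grid, and the vision-language model is stood in for by a plain Python callable. The sketch only shows the general pattern of running a domain-specific tool first and passing its structured output to the language model together with the question.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical representation: binary mask, 1 = distress pixel, 0 = background.
Mask = List[List[int]]


@dataclass
class ToolResult:
    """Structured output of one domain-specific tool (illustrative only)."""
    name: str
    summary: str


def distress_area_ratio(mask: Mask) -> float:
    """Return the fraction of pixels marked as distress in a binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in mask) / total


def area_tool(mask: Mask) -> ToolResult:
    """Wrap the area measurement as a tool the agent can call."""
    ratio = distress_area_ratio(mask)
    return ToolResult("area_ratio", f"distress covers {ratio:.1%} of the image")


def build_prompt(question: str, results: List[ToolResult]) -> str:
    """Combine tool outputs and the user question into one text prompt."""
    facts = "\n".join(f"- {r.name}: {r.summary}" for r in results)
    return f"Tool outputs:\n{facts}\nQuestion: {question}"


def answer(
    question: str,
    mask: Mask,
    tools: List[Callable[[Mask], ToolResult]],
    vlm: Callable[[str], str],
) -> str:
    """Run every tool, then hand the augmented prompt to the language model."""
    results = [tool(mask) for tool in tools]
    return vlm(build_prompt(question, results))


if __name__ == "__main__":
    # Stand-in for a real vision-language model: it just echoes the prompt.
    echo_vlm = lambda prompt: prompt
    toy_mask = [[1, 0], [0, 0]]
    print(answer("How severe is the cracking?", toy_mask, [area_tool], echo_vlm))
```

In a real system the mask would come from a segmentation model and `vlm` would call an actual vision-language model with the image attached; the point of the pattern is that quantitative facts are measured by a specialized tool rather than estimated by the language model.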