DistortBench: Benchmarking Vision Language Models on Image Distortion Identification
arXiv cs.CV · April 23, 2026
Key Points
- The paper introduces DistortBench, a diagnostic no-reference benchmark to test how well vision-language models (VLMs) identify image distortion type and severity.
- DistortBench comprises 13,500 four-choice questions covering 27 distortion types, grouped into six perceptual categories and each rendered at five severity levels; 25 distortions follow KADID-10k calibrations, and two rotation distortions are newly added.
- The authors evaluate 18 VLMs (17 open-weight models from five families plus one proprietary model) and find that even the best model reaches only 61.9% accuracy, against a human majority-vote baseline of 65.7% (a minimal scoring sketch follows this list).
- Analysis shows limited and non-monotonic gains with model scale, accuracy drops from base to thinking variants in most “base–thinking” pairs, and family-specific responses to increasing distortion severity.
- The authors position DistortBench as a diagnostic tool to measure, and ultimately improve, VLMs’ low-level visual perception.
- Despite their strength on high-level multimodal tasks, current VLMs struggle with low-level distortion perception, marking a clear target for future model improvement.
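To make the evaluation protocol concrete, here is a minimal Python sketch of scoring a VLM on four-choice items of the kind described above, including the per-severity breakdown relevant to the severity-response analysis. The `DistortItem` record, its field names, and the `ask_model` callable are illustrative assumptions; the paper’s actual data schema and prompting setup are not specified in this summary.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical layout for one benchmark item; DistortBench's real schema
# is not given in this summary, so all field names here are assumptions.
@dataclass
class DistortItem:
    image_path: str   # distorted image shown to the model
    question: str     # e.g. "Which distortion does this image contain?"
    options: list[str]  # exactly four answer choices, A-D
    answer: str       # ground-truth choice letter, "A".."D"
    distortion: str   # one of the 27 distortion types
    category: str     # one of the six perceptual categories
    severity: int     # 1 (mild) .. 5 (severe)

def evaluate(items: list[DistortItem], ask_model) -> dict:
    """Score a VLM on four-choice items. ask_model is any callable that
    takes (image_path, question, options) and returns a choice letter."""
    correct = 0
    by_severity = defaultdict(lambda: [0, 0])  # severity -> [correct, total]
    for item in items:
        pred = ask_model(item.image_path, item.question, item.options)
        hit = pred.strip().upper().startswith(item.answer)
        correct += hit
        by_severity[item.severity][0] += hit
        by_severity[item.severity][1] += 1
    return {
        "accuracy": correct / len(items),
        "per_severity": {s: c / t for s, (c, t) in sorted(by_severity.items())},
    }

# Usage with a trivial placeholder "model" that always answers "A";
# a real run would wrap an actual VLM inference call here.
if __name__ == "__main__":
    demo = [
        DistortItem("img_001.png", "Which distortion is present?",
                    ["Gaussian blur", "JPEG artifacts", "Color shift", "Rotation"],
                    "A", "gaussian_blur", "blur", 3),
    ]
    print(evaluate(demo, lambda img, q, opts: "A"))
```

A per-severity accuracy breakdown like the one returned here is one way to surface the family-specific severity-response behaviors the key points mention; the authors’ exact analysis method may differ.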