ME-IQA: Memory-Enhanced Image Quality Assessment via Re-Ranking

arXiv cs.CV · March 24, 2026


Key Points

  • The paper introduces ME-IQA, a plug-and-play, test-time memory-enhanced re-ranking framework for image quality assessment using reasoning-enabled vision-language models (VLMs).
  • It builds a memory bank and retrieves semantically and perceptually aligned neighbors based on reasoning summaries, enabling more informative comparisons during inference (a retrieval sketch follows this list).
  • ME-IQA treats the VLM as a probabilistic comparator that produces pairwise preference probabilities, then fuses this ordinal evidence with the original scalar score under Thurstone's Case V model (a worked fusion sketch follows the Abstract below).
  • A gated reflection step and memory consolidation improve future decisions and mitigate discrete score collapse, yielding denser, distortion-sensitive predictions.
  • Experiments on multiple IQA benchmarks report consistent gains over strong reasoning-induced VLM baselines, non-reasoning IQA methods, and other test-time scaling approaches.
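
The summary does not include implementation details for the memory bank, so the following is only a rough illustration of the retrieval step, under the assumption that reasoning summaries are embedded as vectors and neighbors are ranked by cosine similarity. The function name `retrieve_neighbors` and the embedding setup are illustrative, not taken from the paper.

```python
import numpy as np

def retrieve_neighbors(query_emb, memory_embs, k=5):
    """Return indices of the k memory entries whose reasoning-summary
    embeddings are closest (cosine similarity) to the query's.

    query_emb  : (d,)   embedding of the query image's reasoning summary
    memory_embs: (n, d) embeddings of reasoning summaries in the memory bank
    """
    q = query_emb / np.linalg.norm(query_emb)
    m = memory_embs / np.linalg.norm(memory_embs, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity to every stored entry
    return np.argsort(-sims)[:k]      # top-k most aligned neighbors
```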

Abstract

Reasoning-induced vision-language models (VLMs) advance image quality assessment (IQA) with textual reasoning, yet their scalar scores often lack sensitivity and collapse to a few values, so-called discrete collapse. We introduce ME-IQA, a plug-and-play, test-time memory-enhanced re-ranking framework. It (i) builds a memory bank and retrieves semantically and perceptually aligned neighbors using reasoning summaries, (ii) reframes the VLM as a probabilistic comparator to obtain pairwise preference probabilities and fuse this ordinal evidence with the initial score under Thurstone's Case V model, and (iii) performs gated reflection and consolidates memory to improve future decisions. This yields denser, distortion-sensitive predictions and mitigates discrete collapse. Experiments across multiple IQA benchmarks show consistent gains over strong reasoning-induced VLM baselines, existing non-reasoning IQA methods, and test-time scaling alternatives.
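
To make step (ii) concrete: under Thurstone's Case V assumptions (Gaussian quality scores with equal, unit discriminal dispersion), a preference probability p corresponds to a score gap Φ⁻¹(p). Below is a minimal sketch of how the ordinal evidence might be fused with the initial scalar score; the fusion weight `alpha` and the simplification of holding neighbor scores fixed are assumptions, not details from the paper.

```python
import numpy as np
from scipy.stats import norm

def thurstone_case_v_refine(initial_score, pref_probs, neighbor_scores, alpha=0.5):
    """Fuse a VLM's scalar score with ordinal evidence from pairwise
    comparisons against memory-bank neighbors (Thurstone Case V).

    initial_score  : scalar quality score for the query image
    pref_probs     : p_i = P(query preferred over neighbor i), from the VLM
    neighbor_scores: stored quality scores of the retrieved neighbors
    alpha          : fusion weight between scalar and ordinal estimates
    """
    p = np.clip(np.asarray(pref_probs, float), 1e-4, 1 - 1e-4)
    # Case V with unit dispersion: P(i > j) = Phi(mu_i - mu_j), so each
    # comparison yields a noisy estimate of the query-neighbor score gap.
    gaps = norm.ppf(p)                        # inverse-probit score differences
    ordinal_estimates = np.asarray(neighbor_scores, float) + gaps
    ordinal_score = ordinal_estimates.mean()  # least-squares fit for one unknown
    return alpha * initial_score + (1 - alpha) * ordinal_score
```

Because only the query's score is treated as unknown, the least-squares solution over all comparisons reduces to the mean of the per-neighbor estimates; continuous gaps recovered from many pairwise probabilities are what spread the otherwise collapsed scalar predictions over a denser range.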
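The mechanics of step (iii), gated reflection and memory consolidation, are not specified in this summary. One plausible reading, sketched under the assumption that the gate triggers on a large disagreement between the initial and fused scores, is shown below; the threshold `tau` and the dict-based memory entry are hypothetical.

```python
def reflect_and_consolidate(initial, fused, memory, entry, tau=0.5):
    """Gate a reflection step on large score revisions, then store the
    outcome so future queries retrieve better-calibrated neighbors.
    `tau` and the memory-entry layout are illustrative choices."""
    if abs(fused - initial) > tau:   # gate: reflect only on substantial shifts
        entry["reflection"] = "score revised after pairwise re-ranking"
    entry["score"] = fused           # consolidated score for future retrieval
    memory.append(entry)
    return memory
```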