Causal Disentanglement for Full-Reference Image Quality Assessment

arXiv cs.CV / 4/24/2026

📰 News · Models & Research

Key Points

  • The paper introduces a new full-reference image quality assessment (FR-IQA) paradigm that uses causal inference and decoupled representation learning rather than the common pairwise feature comparison approach.
  • It disentangles degradation and content representations by leveraging content invariance between the reference and distorted images, and uses a masking-inspired module to extract degradation features that are causally influenced by content (see the sketch after this list).
  • Quality prediction is performed from the resulting degradation representations via supervised regression or label-free dimensionality reduction, supporting multiple training regimes.
  • Experiments show competitive results on standard IQA benchmarks under fully supervised, few-label, and label-free settings, and the method improves cross-domain generalization to non-standard image types where data is scarce (e.g., underwater, medical, radiographic, neutron, and screen-content images).
  • The authors emphasize scenario-specific training and prediction without labeled IQA data, aiming to outperform existing training-free FR-IQA approaches in cross-domain scenarios.
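
To make the two-encoder idea concrete, here is a minimal sketch (not the authors' code) of the disentanglement and masking steps described above: a content encoder shared between the reference and distorted images, a degradation encoder, and a masking module that gates degradation features by content. The simple convolutional backbones, module names, and the MSE content-invariance loss are all hypothetical placeholders chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy feature extractor standing in for whatever backbone the paper uses."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, out_dim),
        )

    def forward(self, x):
        return self.net(x)

class MaskingModule(nn.Module):
    """Gates degradation features with a content-conditioned mask,
    loosely mimicking the visual masking effect the paper invokes."""
    def __init__(self, dim=128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, f_deg, f_content):
        # Content modulates which degradation components are visible.
        return f_deg * self.gate(f_content)

content_enc, degrade_enc, mask = Encoder(), Encoder(), MaskingModule()

ref = torch.randn(4, 3, 224, 224)    # reference images
dist = torch.randn(4, 3, 224, 224)   # distorted images

c_ref, c_dist = content_enc(ref), content_enc(dist)
d_dist = mask(degrade_enc(dist), c_dist)  # content-influenced degradation features

# Content invariance: reference and distorted images share content, so their
# content features should agree; a loss like this drives the decoupling.
loss_invariance = F.mse_loss(c_ref, c_dist)
```

In this reading, the invariance loss pushes content information out of the degradation branch, while the gate models the causal influence of content on perceived degradation; the paper's actual losses and architecture may differ.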

Abstract

Existing deep network-based full-reference image quality assessment (FR-IQA) models typically work by performing pairwise comparisons of deep features from the reference and distorted images. In this paper, we approach this problem from a different perspective and propose a novel FR-IQA paradigm based on causal inference and decoupled representation learning. Unlike typical feature comparison-based FR-IQA models, our approach formulates degradation estimation as a causal disentanglement process guided by intervention on latent representations. First, we decouple degradation and content representations by exploiting the content invariance between the reference and distorted images. Second, inspired by the human visual masking effect, we design a masking module to model the causal relationship between image content and degradation features, thereby extracting content-influenced degradation features from distorted images. Finally, quality scores are predicted from these degradation features using either supervised regression or label-free dimensionality reduction. Extensive experiments demonstrate that our method achieves highly competitive performance on standard IQA benchmarks across fully supervised, few-label, and label-free settings. Furthermore, we evaluate the approach on diverse non-standard image domains with scarce data, including underwater, radiographic, medical, neutron, and screen-content images. Benefiting from its ability to perform scenario-specific training and prediction without labeled IQA data, our method exhibits superior cross-domain generalization compared to existing training-free FR-IQA models.
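
The label-free prediction step mentioned in the abstract can be illustrated with a hedged sketch: given degradation features for a set of images, reduce them to a single dimension without any MOS labels. Plain PCA is used here as one plausible dimensionality-reduction choice; the paper's exact method, feature dimensionality, and score calibration may differ.

```python
import numpy as np
from sklearn.decomposition import PCA

feats = np.random.randn(500, 128)  # degradation features, one row per image (placeholder)
score = PCA(n_components=1).fit_transform(feats).ravel()
# The sign and scale of the 1-D projection are arbitrary; in practice it would
# be aligned with perceptual quality, e.g., via a monotonic rescaling.
```
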