
MedQ-UNI: Toward Unified Medical Image Quality Assessment and Restoration via Vision-Language Modeling

arXiv cs.CV / 3/20/2026

📰 News · Models & Research

Key Points

  • MedQ-UNI proposes a unified vision-language model that jointly addresses medical image quality assessment and restoration across multiple imaging modalities and degradation types.
  • It follows an assess-then-restore paradigm built on a multimodal autoregressive dual-expert architecture with shared attention: a quality assessment expert first produces a structured language description of the degradation, which the restoration expert then conditions on to perform targeted restoration (see the sketch after this list).
  • The authors assemble ~50K paired samples across three modalities and five restoration tasks with quality annotations for joint Med-IQA and Med-IR training, plus a 2K-sample evaluation benchmark.
  • Experiments show a single MedQ-UNI model achieves state-of-the-art restoration across all tasks without task-specific adaptation and generates superior degradation descriptions, improving restoration fidelity and interpretability.
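
The shared-attention, dual-expert routing summarized above can be pictured with a short sketch. The block below is a minimal, hypothetical PyTorch rendering, assuming a simplified token interface: one attention module is reused by both experts, each expert keeps its own feed-forward branch, and the restoration pass conditions on the quality expert's output. Module names, dimensions, and the two-pass routing are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of an assess-then-restore dual-expert block with shared
# attention. Everything here (names, dims, routing) is an assumption.
import torch
import torch.nn as nn

class SharedAttentionDualExpert(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # One attention module shared by both experts.
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        # Expert-specific feed-forward branches.
        self.iqa_ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.ir_ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, tokens, expert):
        # Shared self-attention over the multimodal token sequence.
        h = self.norm(tokens)
        attn_out, _ = self.attn(h, h, h)
        h = tokens + attn_out
        # Route through the expert selected for this decoding pass.
        ffn = self.iqa_ffn if expert == "iqa" else self.ir_ffn
        return h + ffn(h)

def assess_then_restore(block, image_tokens):
    # Pass 1: the quality expert reads image tokens; in the real model it
    # would autoregressively emit a structured degradation description
    # (stubbed here as feature tokens).
    desc_tokens = block(image_tokens, expert="iqa")
    # Pass 2: the restoration expert conditions on image + description tokens.
    conditioned = torch.cat([image_tokens, desc_tokens], dim=1)
    return block(conditioned, expert="ir")

x = torch.randn(2, 64, 256)          # batch of 2, 64 image tokens, dim 256
restored = assess_then_restore(SharedAttentionDualExpert(), x)
print(restored.shape)                # torch.Size([2, 128, 256])
```

The point of the sketch is the weight sharing: both experts see the same attention maps, so quality cues and restoration features live in one representation space rather than two separate models.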

Abstract

Existing medical image restoration (Med-IR) methods are typically modality-specific or degradation-specific, failing to generalize across the heterogeneous degradations encountered in clinical practice. We argue this limitation stems from the isolation of Med-IR from medical image quality assessment (Med-IQA), as restoration models without explicit quality understanding struggle to adapt to diverse degradation types across modalities. To address these challenges, we propose MedQ-UNI, a unified vision-language model that follows an assess-then-restore paradigm, explicitly leveraging Med-IQA to guide Med-IR across arbitrary modalities and degradation types. MedQ-UNI adopts a multimodal autoregressive dual-expert architecture with shared attention: a quality assessment expert first identifies degradation issues through structured natural language descriptions, and a restoration expert then conditions on these descriptions to perform targeted image restoration. To support this paradigm, we construct a large-scale dataset of approximately 50K paired samples spanning three imaging modalities and five restoration tasks, each annotated with structured quality descriptions for joint Med-IQA and Med-IR training, along with a 2K-sample benchmark for evaluation. Extensive experiments demonstrate that a single MedQ-UNI model, without any task-specific adaptation, achieves state-of-the-art restoration performance across all tasks while generating superior descriptions, confirming that explicit quality understanding meaningfully improves restoration fidelity and interpretability.
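
As a rough illustration of what one entry in the ~50K paired corpus might look like, the sketch below defines a hypothetical sample record: a degraded/clean image pair tagged with modality, restoration task, and the structured quality description used to supervise the assessment expert. All field names, the example modality/task strings, and the file paths are invented for illustration; the abstract does not specify the dataset's actual schema.

```python
# Hypothetical schema for one MedQ-UNI training pair (names assumed).
from dataclasses import dataclass

@dataclass
class MedQSample:
    modality: str             # one of three imaging modalities (names assumed here)
    task: str                 # one of five restoration tasks, e.g. "denoising"
    degraded_path: str        # input image exhibiting the degradation
    clean_path: str           # restoration target
    quality_description: str  # structured natural-language degradation report

sample = MedQSample(
    modality="MRI",
    task="denoising",
    degraded_path="images/0001_degraded.png",
    clean_path="images/0001_clean.png",
    quality_description="Moderate noise; fine anatomical detail obscured.",
)
print(sample.task)
```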