AI Navigate

Pixel-level Counterfactual Contrastive Learning for Medical Image Segmentation

arXiv cs.CV / 3/19/2026

Key Points

  • A new pipeline combines counterfactual generation with dense contrastive learning through Dual-View (DVD-CL) and Multi-View (MVD-CL) methods, targeting pixel-level medical image segmentation with reduced reliance on manual annotations.
  • The authors introduce CHRO-map, a Color-coded High Resolution Overlay visualization to interpret the segmentation and learned representations.
  • Experiments show annotation-free DVD-CL outperforms other dense contrastive methods, and supervised variants using silver-standard labels outperform training on silver-standard data alone, achieving about 94% DSC on challenging datasets.
  • The approach enhances robustness to acquisition and pathological variations, suggesting improved generalization in medical imaging tasks.
  • The work demonstrates how dense, pixel-level contrastive learning can be extended with counterfactuals and weak supervision to improve segmentation performance.
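
For readers unfamiliar with the metric, DSC is the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), where A and B are the predicted and reference masks. The following is a minimal sketch for flat binary masks; the function name and the empty-mask convention are illustrative assumptions, not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice_coefficient(pred, target))  # 2 * 2 / (3 + 3), about 0.667
```

A score of 1.0 means the masks overlap perfectly, so a DSC of about 94% corresponds to a value near 0.94 on this scale.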

Abstract

Image segmentation relies on large annotated datasets, which are expensive and slow to produce. Silver-standard (AI-generated) labels are easier to obtain, but they risk introducing bias. Self-supervised learning, needing only images, has become key for pre-training. Recent work combining contrastive learning with counterfactual generation improves representation learning for classification but does not readily extend to pixel-level tasks. We propose a pipeline combining counterfactual generation with dense contrastive learning via Dual-View (DVD-CL) and Multi-View (MVD-CL) methods, along with supervised variants that utilize available silver-standard annotations. A new visualisation algorithm, the Color-coded High Resolution Overlay map (CHRO-map), is also introduced. Experiments show annotation-free DVD-CL outperforms other dense contrastive learning methods, while supervised variants using silver-standard labels outperform training on the silver-standard labeled data directly, achieving ~94% DSC on challenging data. These results highlight that pixel-level contrastive learning, enhanced by counterfactuals and silver-standard annotations, improves robustness to acquisition and pathological variations.
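
The abstract does not state the loss used, so the following is only a rough sketch of what a dual-view, pixel-level contrastive objective could look like. It assumes an InfoNCE-style loss and assumes that pixel i of the original image and pixel i of its counterfactual refer to the same anatomical location. The function names, the temperature value, and the toy embeddings are hypothetical and are not the authors' implementation.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm, leaving zero vectors unchanged."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dense_info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss over per-pixel embeddings of two views.

    view_a, view_b: lists of per-pixel embedding vectors. Index i in both
    lists is assumed to be the same location (the positive pair); every
    other pixel of view_b serves as a negative for pixel i of view_a.
    """
    if not view_a or len(view_a) != len(view_b):
        raise ValueError("views must be non-empty and the same length")
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    losses = []
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every pixel in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: the "counterfactual" is a slightly perturbed copy.
original = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
counterfactual = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
aligned = dense_info_nce(original, counterfactual)
misaligned = dense_info_nce(original, counterfactual[::-1])
print(aligned < misaligned)
```

In this sketch the loss is lower when corresponding pixels have similar embeddings across the two views, which is the behaviour a dense contrastive objective is meant to reward. A multi-view variant would presumably compare more than two views, but the abstract gives no details on how.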