Learning to Segment using Summary Statistics and Weak Supervision

arXiv cs.CV / 5/6/2026


Key Points

  • The paper proposes training image segmentation models when only limited summary statistics from medical annotations are available (e.g., annotated region area), rather than full pixel-wise labels.
  • It finds that summary statistics alone are insufficient for accurate segmentation, but performance improves markedly when weak supervision is added via a few pixels within the target region.
  • The method introduces a new loss function that jointly optimizes image reconstruction quality, agreement with the provided summary statistics, and overlap with the weak pixel-level supervisory signal.
  • Experiments on natural images, breast-cancer ultrasound data, and kidney-tumor CT scans show the approach can reduce reliance on expensive manual pixel annotations while still achieving useful segmentation results.

Abstract

Medical experts often manually segment images to obtain diagnostic statistics and discard the resulting annotations. We aim to train segmentation models to alleviate this burden, but constrained to the retained summary statistics (e.g., the area of the annotated region). Empirical results suggest that statistics alone are insufficient for this task, but adding weak information in the form of a few pixels within the area of interest significantly improves performance. We use a novel loss function that combines terms for image reconstruction quality, matching to summary statistics, and overlap between the predicted foreground and the weak supervisory signal. Experiments on standard image, ultrasound (breast cancer), and Computed Tomography (CT) scan (kidney tumors) data demonstrate the utility and potential of the approach.
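The three-term loss described above can be sketched in code. The sketch below is illustrative only: the term forms (mean-squared-error reconstruction, relative area deviation, missed fraction of weakly annotated pixels) and the weights `w_rec`, `w_stat`, `w_weak` are assumptions, not the paper's exact formulation.

```python
import numpy as np

def combined_loss(pred_mask, image, recon, target_area, weak_pixels,
                  w_rec=1.0, w_stat=1.0, w_weak=1.0):
    """Hypothetical combined loss with three terms (illustrative, not the
    paper's exact loss): reconstruction quality, agreement with a retained
    summary statistic (region area), and overlap with weak pixel labels."""
    # Reconstruction term: mean squared error between the image and its
    # reconstruction.
    rec_term = np.mean((image - recon) ** 2)
    # Statistic term: relative deviation of the predicted foreground area
    # from the retained summary statistic (area in pixels).
    pred_area = pred_mask.sum()
    stat_term = abs(pred_area - target_area) / max(target_area, 1)
    # Weak-supervision term: fraction of the few annotated foreground
    # pixels that the predicted mask misses (1 - overlap).
    rows, cols = weak_pixels
    weak_term = 1.0 - pred_mask[rows, cols].mean()
    return w_rec * rec_term + w_stat * stat_term + w_weak * weak_term
```

A perfect prediction (exact reconstruction, matching area, all weak pixels covered) yields zero loss; each violated constraint adds its weighted penalty.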