Exploring the Impact of Skin Color on Skin Lesion Segmentation

arXiv cs.CV / 4/1/2026

Key Points

  • The paper studies whether skin tone affects AI-based skin lesion segmentation, an important preprocessing step for downstream skin cancer analysis.
  • It evaluates three segmentation architectures (UNet, DeepLabV3 with a ResNet50 backbone, and DINOv2) on HAM10000 and ISIC2017, and tests fairness using both discrete skin-tone groupings and continuous pixel-wise ITA (Individual Typology Angle) distributions.
  • Using Wasserstein distances between within-image ITA distributions for skin-only, lesion-only, and whole-image regions, the authors quantify lesion-skin contrast and relate it to segmentation error across multiple metrics.
  • Global skin tone metrics (e.g., Fitzpatrick grouping or mean ITA) show only a weak association with segmentation quality within the range of skin tones represented in these datasets.
  • The models’ largest segmentation errors are consistently linked to low lesion-skin contrast, suggesting boundary ambiguity is a primary failure mode and that contrast-aware, distribution-based audit signals are more informative than discrete skin-tone categories.
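
The contrast measure described in the points above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes 8-bit sRGB input with a D65 white point, uses the standard definition ITA = arctan((L* − 50) / b*) × 180/π, and computes the 1-D Wasserstein-1 distance between the lesion and skin ITA samples. The function names (`srgb_to_lab`, `ita_degrees`, `wasserstein_1d`, `lesion_skin_contrast`) and the flat-list image layout are hypothetical choices made for this example.

```python
import math

def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (assumes a D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)

def ita_degrees(r, g, b):
    """ITA in degrees for one pixel.

    atan2 equals arctan((L* - 50) / b*) whenever b* > 0 (typical for skin)
    and avoids a division by zero when b* == 0.
    """
    l_star, _, b_star = srgb_to_lab(r, g, b)
    return math.degrees(math.atan2(l_star - 50.0, b_star))

def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two 1-D empirical samples.

    Integrates the absolute difference between the two empirical CDFs.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    i = j = 0
    total = 0.0
    for k in range(len(points) - 1):
        p = points[k]
        while i < len(xs) and xs[i] <= p:
            i += 1
        while j < len(ys) and ys[j] <= p:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (points[k + 1] - p)
    return total

def lesion_skin_contrast(pixels, mask):
    """Distance between lesion and skin ITA distributions within one image.

    pixels: flat list of (r, g, b) tuples; mask: flat list, truthy = lesion.
    """
    lesion = [ita_degrees(*p) for p, m in zip(pixels, mask) if m]
    skin = [ita_degrees(*p) for p, m in zip(pixels, mask) if not m]
    return wasserstein_1d(lesion, skin)
```

Under these assumptions, a lesion whose ITA distribution nearly overlaps that of the surrounding skin yields a distance close to zero, which corresponds to the low-contrast regime the summary associates with the largest segmentation errors.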

Abstract

Skin cancer, particularly melanoma, remains a major cause of morbidity and mortality, making early detection critical. AI-driven dermatology systems often rely on skin lesion segmentation as a preprocessing step to delineate the lesion from surrounding skin and support downstream analysis. While fairness concerns regarding skin tone have been widely studied for lesion classification, the influence of skin tone on the segmentation stage remains under-quantified and is frequently assessed using coarse, discrete skin tone categories. In this work, we evaluate three strong segmentation architectures (UNet, DeepLabV3 with a ResNet50 backbone, and DINOv2) on two public dermoscopic datasets (HAM10000 and ISIC2017) and introduce a continuous pigment and contrast analysis that treats pixel-wise ITA values as distributions. Using Wasserstein distances between within-image distributions for skin-only, lesion-only, and whole-image regions, we quantify lesion-skin contrast and relate it to segmentation performance across multiple metrics. Within the range represented in these datasets, global skin tone metrics (Fitzpatrick grouping or mean ITA) show weak association with segmentation quality. In contrast, low lesion-skin contrast is consistently associated with larger segmentation errors across models, indicating that boundary ambiguity and low contrast are key drivers of failure. These findings suggest that fairness improvements in dermoscopic segmentation should prioritize robust handling of low-contrast lesions, and that distribution-based pigment measures provide a more informative audit signal than discrete skin-tone categories.
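
One way to relate a per-image contrast score to segmentation quality, as the abstract describes, is to compute an overlap metric per image and then a rank correlation across images. The sketch below is an assumption-laden illustration: the abstract does not say which metrics or statistics the authors used, so Dice, IoU, and Spearman correlation are stand-ins, and all function names are hypothetical.

```python
import math

def dice(pred, truth):
    """Dice coefficient for two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if total == 0 else 2.0 * inter / total

def iou(pred, truth):
    """Intersection over union for two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return 1.0 if union == 0 else inter / union

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

Given one contrast score and one Dice score per image, `spearman(contrasts, dice_scores)` summarizes how strongly segmentation quality tracks contrast. The same call with mean ITA in place of contrast gives the comparison against a global skin tone measure.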