Effect of Input Resolution on Retinal Vessel Segmentation Performance: An Empirical Study Across Five Datasets

arXiv cs.CV / 4/6/2026



Key Points

The study shows that resizing fundus images to meet GPU constraints can irreversibly erase thin retinal vessels by turning them into subpixel structures before they reach the segmentation network.
Experiments training a baseline U-Net across five datasets (DRIVE, STARE, CHASE_DB1, HRF, FIVES) with varying downsampling ratios reveal dataset-dependent effects on thin-vessel detection.
For high-resolution datasets (HRF, FIVES), thin-vessel sensitivity improves as images are downsampled toward the encoder’s effective operating range, peaking at processed widths between 256 and 876 pixels.
For low-to-mid-resolution datasets (DRIVE, STARE, CHASE_DB1), thin-vessel sensitivity is highest at or near native resolution and degrades with any downsampling.
The authors introduce a width-stratified sensitivity metric and show that Dice can remain relatively stable even when thin-vessel sensitivity drops by up to 15.8 percentage points (on DRIVE), so Dice alone is insufficient for evaluating microvascular segmentation.
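
To make the first key point concrete, the minimal sketch below computes how wide a vessel remains after a uniform resize. The function name `processed_width_px` and the 3-pixel example vessel are illustrative assumptions, not values from the paper; only the native widths (565 and 3504) and the processed widths (256 and 876) are taken from the study.

```python
def processed_width_px(vessel_width_px, native_image_width, processed_image_width):
    """Width of a vessel, in pixels, after a uniform resize of the whole image."""
    scale = processed_image_width / native_image_width
    return vessel_width_px * scale


# Hypothetical vessel that is 3 pixels wide at native resolution.
EXAMPLE_VESSEL_PX = 3

for native_width in (565, 3504):
    for processed_width in (256, 876):
        if processed_width > native_width:
            continue  # skip upsampling; only downsampling is of interest here
        width = processed_width_px(EXAMPLE_VESSEL_PX, native_width, processed_width)
        label = "subpixel" if width < 1 else "resolved"
        print(f"{native_width} -> {processed_width}: {width:.2f} px ({label})")
```

Under these assumptions, the same vessel stays wider than one pixel when a 565-pixel-wide image is reduced to 256, but falls below one pixel at both processed widths for a 3504-pixel-wide image.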

Abstract

Most deep learning pipelines for retinal vessel segmentation resize fundus images to satisfy GPU memory constraints and enable uniform batch processing. However, the impact of this resizing on thin-vessel detection remains underexplored. When high-resolution images are downsampled, thin vessels are reduced to subpixel structures, causing irreversible information loss even before the data enters the network. Standard volumetric metrics such as the Dice score do not capture this loss because thick-vessel pixels dominate the evaluation. We investigated this effect by training a baseline U-Net at multiple downsampling ratios across five fundus datasets (DRIVE, STARE, CHASE_DB1, HRF, and FIVES) with native widths ranging from 565 to 3504 pixels, keeping all other settings fixed. We introduce a width-stratified sensitivity metric that evaluates thin (half-width <3 pixels), medium (3 to 7 pixels), and thick (>7 pixels) vessel detection separately, using native-resolution width estimates derived from a Euclidean distance transform. Results show that for high-resolution datasets (HRF, FIVES), thin-vessel sensitivity improves monotonically as images are downsampled toward the encoder's effective operating range, peaking at processed widths between 256 and 876 pixels. For low-to-mid-resolution datasets (DRIVE, STARE, CHASE_DB1), thin-vessel sensitivity is highest at or near native resolution and degrades with any downsampling. Across all five datasets, aggressive downsampling reduced thin-vessel sensitivity by up to 15.8 percentage points (DRIVE) while Dice remained relatively stable, confirming that Dice alone is insufficient for evaluating microvascular segmentation.
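
The width-stratified metric described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the abstract only states that half-widths come from a Euclidean distance transform, so the per-pixel assignment used here (each vessel pixel takes the radius of the largest inscribed disk that covers it) is an assumption, as are all function names. The sketch also assumes the prediction has already been brought back to the native-resolution grid of the ground truth, and it uses a brute-force distance transform that is only practical for tiny masks.

```python
import math


def distance_to_background(mask):
    """Brute-force Euclidean distance from each vessel pixel to the nearest
    background pixel (assumes the mask contains at least one background pixel)."""
    h, w = len(mask), len(mask[0])
    bg = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    return [[min(math.hypot(r - br, c - bc) for br, bc in bg) if mask[r][c] else 0.0
             for c in range(w)] for r in range(h)]


def half_width_map(mask):
    """Assumed width assignment: radius of the largest inscribed disk covering the pixel."""
    dist = distance_to_background(mask)
    fg = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    return {(r, c): max(dist[qr][qc] for qr, qc in fg
                        if math.hypot(r - qr, c - qc) < dist[qr][qc])
            for r, c in fg}


def stratum(half_width):
    """Thresholds from the abstract: thin < 3, medium 3 to 7, thick > 7 pixels."""
    if half_width < 3:
        return "thin"
    return "medium" if half_width <= 7 else "thick"


def stratified_sensitivity(gt, pred):
    """Sensitivity (TP / (TP + FN)) per width stratum; None if a stratum is empty."""
    hits = {"thin": 0, "medium": 0, "thick": 0}
    totals = dict.fromkeys(hits, 0)
    for (r, c), hw in half_width_map(gt).items():
        s = stratum(hw)
        totals[s] += 1
        hits[s] += 1 if pred[r][c] else 0
    return {s: (hits[s] / totals[s] if totals[s] else None) for s in hits}


def dice(gt, pred):
    """Standard Dice overlap between two binary masks."""
    pairs = [(g, p) for g_row, p_row in zip(gt, pred) for g, p in zip(g_row, p_row)]
    tp = sum(1 for g, p in pairs if g and p)
    total = sum(1 for g, _ in pairs if g) + sum(1 for _, p in pairs if p)
    return 2 * tp / total if total else 1.0


# Toy masks: a 15-row thick band plus a 1-pixel thin line in the ground truth;
# the hypothetical prediction finds the band but misses the line entirely.
H, W = 24, 20
gt = [[1 if (2 <= r <= 16 or r == 20) else 0 for _ in range(W)] for r in range(H)]
pred = [[1 if 2 <= r <= 16 else 0 for _ in range(W)] for r in range(H)]

print(stratified_sensitivity(gt, pred))
print(round(dice(gt, pred), 3))
```

On this toy example the thin line is missed completely, so thin sensitivity is 0.0 and thick sensitivity is 1.0, yet Dice is still about 0.97. That is the same kind of masking effect the abstract attributes to thick-vessel pixels dominating the score. For real fundus images, a library routine such as `scipy.ndimage.distance_transform_edt` would be the usual replacement for the brute-force distance computation.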