Abstract
Carotid intima-media thickness (CIMT) measured from B-mode ultrasound is an established vascular biomarker for atherosclerosis and cardiovascular risk stratification. Although a wide range of computerized methods have been proposed for carotid boundary delineation and CIMT estimation, robust and transferable deep models that jointly address segmentation and measurement remain underexplored, particularly in the era of vision foundation models. Motivated by recent advances in adapting DINOv3 to medical segmentation and exploiting DINOv3 in test-time optimization pipelines, we investigate a DINOv3-based framework for carotid intima-media complex segmentation and subsequent CIMT measurement on the Carotid Ultrasound Boundary Study (CUBS) v1 dataset. Our pipeline predicts the intima-media band at a fixed image resolution, extracts upper and lower boundaries column-wise, corrects for image resizing using the per-image calibration factor provided by CUBS, and reports CIMT in physical units. Across three patient-level test splits, our method achieved a mean test Dice of 0.7739 \pm 0.0037 and IoU of 0.6384 \pm 0.0044. The mean CIMT absolute error was 181.16 \pm 11.57 \mum, with a mean Pearson correlation of 0.480 \pm 0.259. In a held-out validation subset (n=28), test-time threshold calibration reduced the mean absolute CIMT error from 141.0 \mum at the default threshold to 101.1 \mum at the measurement-optimized threshold, while simultaneously reducing systematic bias toward zero. Relative to the error ranges reported in the original CUBS benchmark for classical computerized methods, these results place a DINOv3-based approach within the clinically relevant \sim0.1 mm measurement regime. Together, our findings support the feasibility of using vision foundation models for interpretable, calibration-aware CIMT measurement.