Exploring Prompt Alignment with Clinical Factors in Zero-Shot Segmentation VLMs for NSCLC Tumor Segmentation
arXiv cs.CV / 5/5/2026
Key Points
- The study investigates which prompt dimensions most strongly control the spatial behavior of a zero-shot vision-language model (VoxTell) for NSCLC gross tumor volume segmentation.
- Through sub-prompt decomposition, perturbation robustness tests, specificity ladders, and cross-case prompt swaps, the authors find anatomical location is the dominant alignment driver, with location changes often causing catastrophic segmentation failures.
- Irrelevant prompts reliably yield empty segmentations, while increasing prompt specificity generally improves performance, although diagnosis-only prompts behave differently.
- Cross-case prompt swaps show patient-specific conditioning, with matched cases achieving much higher Dice scores than mismatched ones, suggesting the model encodes case-specific spatial context.
- VoxTell’s fully zero-shot mean Dice score is statistically indistinguishable from that of nnU-Net, and it outperforms other zero-shot baselines. The paper argues that evaluation should include prompt-dimension alignment in addition to Dice.
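
To make the cross-case prompt swap concrete, here is a minimal sketch of how a matched-versus-mismatched Dice comparison could be computed. It is not the authors' code: the `segment` callable is a hypothetical stand-in for a prompt-conditioned model such as VoxTell, the function names are illustrative, and scoring two empty masks as 1.0 is an assumed convention rather than anything stated in the summary.

```python
import numpy as np

def dice(pred, target):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

def swap_matrix(segment, images, prompts, masks):
    """scores[i, j] = Dice for image i segmented with the prompt of case j.

    `segment(image, prompt)` is a hypothetical callable returning a binary mask.
    """
    n = len(images)
    scores = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            scores[i, j] = dice(segment(images[i], prompts[j]), masks[i])
    return scores

def matched_vs_mismatched(scores):
    """Mean Dice on the diagonal (own prompt) versus off-diagonal (swapped prompt)."""
    diag = np.eye(scores.shape[0], dtype=bool)
    return scores[diag].mean(), scores[~diag].mean()
```

A large gap between the two means returned by `matched_vs_mismatched` is the kind of signal the summary describes as patient-specific conditioning. The same `dice` helper could be reused to score the perturbation and specificity-ladder experiments, given the corresponding prompts.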