CTSCAN: Evaluation Leakage in Chest CT Segmentation and a Reproducible Patient-Disjoint Benchmark
arXiv cs.CV / 4/20/2026
Key Points
- The paper argues that reported chest CT segmentation results are often inflated because train and test splits accidentally share slices from the same patient study.
- It introduces CTSCAN, a reproducible multi-source benchmark and research stack that specifically evaluates models under patient-disjoint (case-disjoint) conditions.
- Using the same FPN + EfficientNet-B0 baseline across a multi-seed sweep, the study shows large performance drops when switching from slice-mixed to case-disjoint evaluation (foreground Dice: 0.6665 → 0.2066; foreground IoU: 0.5031 → 0.1181).
- The authors quantify the impact of eliminating patient reuse as a 0.4599 absolute (69.00% relative) decrease in foreground Dice and a 0.3850 absolute (76.52% relative) decrease in foreground IoU.
- CTSCAN includes deterministic split manifests, weak-supervision controls, scripted multi-seed protocol sweeps, and reproducible figure generation to support fair future comparisons.
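
The gap between the two evaluation regimes described above can be illustrated with a minimal sketch. This is not the CTSCAN code: the function names (`slice_mixed_split`, `case_disjoint_split`, `shared_cases`, `dice_and_iou`, `drops`), the toy dataset of 10 cases with 20 slices each, and the 20% test fraction are all assumptions made for illustration. It shows how a slice-level shuffle lets the same case appear on both sides of the split, how grouping by case ID prevents that, and how the absolute and relative drops follow from the reported foreground Dice and IoU values.

```python
import random
from collections import defaultdict


def slice_mixed_split(slices, test_fraction, seed):
    """Shuffle individual slices; one case can end up in both train and test."""
    rng = random.Random(seed)
    shuffled = list(slices)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def case_disjoint_split(slices, test_fraction, seed):
    """Shuffle case IDs, then send every slice of a case to exactly one side."""
    by_case = defaultdict(list)
    for case_id, slice_idx in slices:
        by_case[case_id].append((case_id, slice_idx))
    case_ids = sorted(by_case)  # sort first so the result depends only on the seed
    rng = random.Random(seed)
    rng.shuffle(case_ids)
    n_test = max(1, int(round(len(case_ids) * test_fraction)))
    test_cases = set(case_ids[:n_test])
    train = [s for c in case_ids if c not in test_cases for s in by_case[c]]
    test = [s for c in case_ids if c in test_cases for s in by_case[c]]
    return train, test


def shared_cases(train, test):
    """Case IDs that appear on both sides of a split (leakage if non-empty)."""
    return {c for c, _ in train} & {c for c, _ in test}


def dice_and_iou(pred, target):
    """Foreground Dice and IoU for two flat binary masks (lists of 0/1)."""
    inter = sum(p * t for p, t in zip(pred, target))
    p_sum, t_sum = sum(pred), sum(target)
    union = p_sum + t_sum - inter
    dice = 2 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


def drops(before, after):
    """Absolute and relative decrease of a metric."""
    absolute = before - after
    return absolute, absolute / before


# Hypothetical toy dataset: 10 cases, 20 slices each, as (case_id, slice_index).
toy_slices = [(f"case{c:02d}", i) for c in range(10) for i in range(20)]

mixed_train, mixed_test = slice_mixed_split(toy_slices, 0.2, seed=0)
clean_train, clean_test = case_disjoint_split(toy_slices, 0.2, seed=0)

print("cases shared by slice-mixed split:", len(shared_cases(mixed_train, mixed_test)))
print("cases shared by case-disjoint split:", len(shared_cases(clean_train, clean_test)))

# Drops recomputed from the rounded values quoted in the summary; the last
# digit may differ from figures the authors computed on unrounded metrics.
dice_abs, dice_rel = drops(0.6665, 0.2066)
iou_abs, iou_rel = drops(0.5031, 0.1181)
print(f"Dice drop: {dice_abs:.4f} absolute, {dice_rel:.2%} relative")
print(f"IoU drop:  {iou_abs:.4f} absolute, {iou_rel:.2%} relative")
```

Fixing the seed and sorting the case IDs before shuffling makes the split reproducible, which is the same idea as a deterministic split manifest, though the benchmark's actual manifest format is not described in this summary.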