CT-to-X-ray Distillation Under Tiny Paired Cohorts: An Evidence-Bounded Reproducible Pilot Study

arXiv cs.CV / 4/1/2026


Key Points

  • The study investigates whether CT images can serve as training-time-only supervision for distilling a binary disease/no-disease chest X-ray classifier, so that CT is not required at inference.
  • Using patient-level paired data and a teacher–student distillation setup, the authors find that a stripped-down plain cross-modal logit-KD baseline outperforms the more complex JDCNet variant on a small four-image validation subset.
  • Eight Monte Carlo patient-level resampling experiments suggest results are sensitive to dataset splits, with late fusion achieving the best mean accuracy while different strategies perform best for macro-F1 and balanced accuracy.
  • Stronger mechanism controls (attention transfer and feature hints) do not reliably restore a robust cross-modality advantage, highlighting likely failure modes in the cross-modality transfer.
  • The paper’s main contribution is presented as a reproducible, evidence-bounded pilot protocol that clarifies the task definition, instability in rankings, and minimum requirements for future credible CT-to-X-ray claims rather than a new validated architecture.
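The "plain cross-modal logit-KD" control mentioned above is not spelled out in this summary; a minimal sketch of the standard temperature-scaled logit distillation loss it presumably follows (the temperature `T`, mixing weight `alpha`, and NumPy formulation are assumptions, not details from the paper):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax over the last axis.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logit_kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Plain logit-KD: alpha * CE(student, labels)
    + (1 - alpha) * T^2 * KL(teacher_T || student_T).
    Here the teacher would be the CT branch and the student the X-ray branch."""
    p_t = softmax(teacher_logits, T)                     # soft targets from the teacher
    log_p_s = np.log(softmax(student_logits, T) + 1e-12)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - log_p_s), axis=-1).mean()
    log_p_hard = np.log(softmax(student_logits, 1.0) + 1e-12)
    ce = -log_p_hard[np.arange(len(labels)), labels].mean()
    return alpha * ce + (1.0 - alpha) * (T ** 2) * kl
```

The `T ** 2` factor keeps the soft-target gradient magnitude comparable across temperatures, which is the usual convention in logit distillation.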

Abstract

Chest X-ray and computed tomography (CT) provide complementary views of thoracic disease, yet most computer-aided diagnosis models are trained and deployed within a single imaging modality. The concrete question studied here is narrower and deployment-oriented: on a patient-level paired chest cohort, can CT act as training-only supervision for a binary disease versus non-disease X-ray classifier without requiring CT at inference time? We study this setting as a cross-modality teacher–student distillation problem and use JDCNet as an executable pilot scaffold rather than as a validated superior architecture. On the original patient-level paired split from a public paired chest imaging cohort, a stripped-down plain cross-modal logit-KD control attains the highest mean result on the four-image validation subset (0.875 accuracy and 0.714 macro-F1), whereas the full module-augmented JDCNet variant remains at 0.750 accuracy and 0.429 macro-F1. To test whether that ranking is a split artifact, we additionally run eight patient-level Monte Carlo resamples with same-case comparisons, stronger mechanism controls based on attention transfer and feature hints, and imbalance-sensitive analyses. Under this resampled protocol, late fusion attains the highest mean accuracy (0.885), same-modality distillation attains the highest mean macro-F1 (0.554) and balanced accuracy (0.660), the plain cross-modal control drops to 0.500 mean balanced accuracy, and neither attention transfer nor feature hints recover a robust cross-modality advantage. The contribution of this study is therefore not a validated CT-to-X-ray architecture, but a reproducible and evidence-bounded pilot protocol that makes the exact task definition, failure modes, ranking instability, and the minimum requirements for future credible CT-to-X-ray transfer claims explicit.
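The abstract's "patient-level Monte Carlo resamples" hinge on splitting by patient rather than by image, so that paired CT/X-ray images from one patient never leak across train and test. A minimal sketch of such a split (the `test_frac`, seed handling, and toy cohort layout are illustrative assumptions, not the paper's protocol):

```python
import random

def patient_level_split(records, test_frac=0.25, seed=0):
    """Monte Carlo resample split by patient ID.

    `records` maps patient_id -> list of that patient's image identifiers,
    so all of a patient's images land on the same side of the split.
    """
    rng = random.Random(seed)
    patients = sorted(records)
    rng.shuffle(patients)
    n_test = max(1, round(test_frac * len(patients)))
    test_patients = set(patients[:n_test])
    train = [img for p in patients[n_test:] for img in records[p]]
    test = [img for p in test_patients for img in records[p]]
    return train, test

# Toy cohort: 12 hypothetical patients, each with a paired X-ray and CT.
cohort = {f"p{i}": [f"p{i}_xray", f"p{i}_ct"] for i in range(12)}

# Repeated resamples with different seeds, as in a Monte Carlo protocol.
splits = [patient_level_split(cohort, seed=s) for s in range(8)]
```

Ranking models across many such resamples, rather than on one fixed split, is what exposes the split sensitivity the paper reports.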