Opportunistic Bone-Loss Screening from Routine Knee Radiographs Using a Multi-Task Deep Learning Framework with Sensitivity-Constrained Threshold Optimization

arXiv cs.CV / 4/23/2026

Key Points

  • The study introduces STR-Net, a multi-task deep learning framework that screens for osteoporosis/osteopenia using routine single-channel knee X-rays, avoiding additional imaging or patient visits.
  • STR-Net uses a shared convolutional backbone with routed task-specific heads for three outputs: binary normal vs. bone loss, severity classification (osteopenia vs. osteoporosis), and weakly coupled T-score regression (optionally with clinical variables).
  • A sensitivity-constrained threshold optimization strategy enforces a minimum sensitivity of 0.86, so that screening prioritizes detecting bone loss.
  • On the held-out test set (n=224), binary screening reached an AUROC of 0.933, sensitivity of 0.904, specificity of 0.773, and AUPRC of 0.956; severity sub-classification reached an AUROC of 0.898.
  • In a pilot subset (n=31), the T-score regression branch showed a Pearson correlation of 0.801 with DXA-measured T-scores (MAE 0.279, RMSE 0.347); the authors state that prospective clinical validation is needed before deployment.
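
The sensitivity-constrained threshold selection mentioned in the key points can be illustrated with a minimal sketch. The summary does not state which objective the authors optimize under the constraint, so this example assumes one common choice: among all candidate thresholds whose sensitivity meets the floor, keep the one with the highest specificity. The function name `select_threshold` and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def select_threshold(
    scores: List[float],
    labels: List[int],
    min_sensitivity: float = 0.86,
) -> Tuple[float, float, float]:
    """Pick a decision threshold under a sensitivity floor.

    Assumption (not from the paper): among thresholds with
    sensitivity >= min_sensitivity, maximize specificity.
    A case is called positive (bone loss) when score >= threshold.
    Returns (threshold, sensitivity, specificity).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best = None
    # Every distinct score is a candidate threshold.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens = tp / positives
        spec = tn / negatives
        # Keep only thresholds that satisfy the sensitivity constraint,
        # and among those prefer strictly higher specificity.
        if sens >= min_sensitivity and (best is None or spec > best[2]):
            best = (t, sens, spec)

    if best is None:
        raise ValueError("no threshold satisfies the sensitivity constraint")
    return best


# Toy validation scores (illustrative values only, not study data).
val_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
val_labels = [0, 0, 1, 0, 1, 1, 1, 1]
threshold, sensitivity, specificity = select_threshold(val_scores, val_labels)
```

In a setup like this, the threshold would typically be chosen on the validation split and then applied unchanged to the test split, so that the reported test sensitivity and specificity are not tuned on test data.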

Abstract

Background: Osteoporosis and osteopenia are often undiagnosed until fragility fractures occur. Dual-energy X-ray absorptiometry (DXA) is the reference standard for bone mineral density (BMD) assessment, but access remains limited. Knee radiographs are obtained at high volume for osteoarthritis evaluation and may offer an opportunity for opportunistic bone-loss screening.

Objective: To develop and evaluate a multi-task deep learning system for opportunistic bone-loss screening from routine knee radiographs without additional imaging or patient visits.

Methods: We developed STR-Net, a multi-task framework for single-channel grayscale knee radiographs. The model includes a shared backbone, global average pooling feature aggregation, a shared neck, and a task-aware representation routing module connected to three task-specific heads: binary screening (Normal vs. Bone Loss), severity sub-classification (Osteopenia vs. Osteoporosis), and weakly coupled T-score regression with optional clinical variables. A sensitivity-constrained threshold optimization strategy (minimum sensitivity >= 0.86) was applied. The dataset included 1,570 knee radiographs, split at the patient level into training (n=1,120), validation (n=226), and test (n=224) sets.

Results: On the held-out test set, STR-Net achieved an AUROC of 0.933, sensitivity of 0.904, specificity of 0.773, and AUPRC of 0.956 for binary screening. Severity sub-classification achieved an AUROC of 0.898. The T-score regression branch showed a Pearson correlation of 0.801 with DXA-measured T-scores in a pilot subset (n=31), with MAE of 0.279 and RMSE of 0.347.

Conclusions: STR-Net enables single-pass bone-loss screening, severity stratification, and quantitative T-score estimation from routine knee radiographs. Prospective clinical validation is needed before deployment.
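
For readers unfamiliar with the agreement metrics reported for the T-score branch, the sketch below shows how Pearson correlation, MAE, and RMSE are conventionally computed from paired predicted and reference values. It uses the standard textbook definitions; the function name `regression_agreement` and the sample values are hypothetical, and nothing here reproduces the authors' evaluation code or data.

```python
import math
from typing import Sequence, Tuple


def regression_agreement(
    predicted: Sequence[float],
    reference: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (pearson_r, mae, rmse) for paired predictions and references."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n

    # Pearson r: covariance normalized by the product of the spreads.
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("Pearson r is undefined for constant input")
    pearson_r = cov / math.sqrt(var_p * var_r)

    # MAE: mean absolute difference between prediction and reference.
    mae = sum(abs(p - r) for p, r in zip(predicted, reference)) / n
    # RMSE: square root of the mean squared difference.
    rmse = math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)
    return pearson_r, mae, rmse


# Illustrative T-score pairs only (not study data).
predicted_t = [-1.0, -2.0, -3.0]
dxa_t = [-1.5, -2.5, -3.5]
r, mae, rmse = regression_agreement(predicted_t, dxa_t)
```

Correlation and error capture different things: the toy pairs above are perfectly correlated yet carry a constant offset of 0.5, which is why a result such as the one in the abstract is reported with both Pearson r and MAE/RMSE.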