Learning Coarse-to-Fine Osteoarthritis Representations under Noisy Hierarchical Labels

arXiv cs.CV / 5/4/2026


Key Points

  • The paper studies knee osteoarthritis (OA) assessment using a clinically motivated hierarchy: a coarse binary OA presence label and a fine-grained Kellgren–Lawrence (KL) severity grade.
  • Instead of treating OA presence and KL severity as independent tasks, the authors test whether this label hierarchy can act as a representation-level supervisory prior.
  • They employ a simple dual-head neural network (shared encoder with two task-specific heads) as a probe, comparing single-task and dual-head training across multiple 3D backbones under the same evaluation protocol.
  • The study finds that dual-head supervision yields backbone-dependent gains, most clearly in KL-related metrics for selected backbones.
  • Beyond accuracy, the gains are accompanied by a more ordered coarse-to-fine latent organization and, for responsive backbones, stronger alignment of model saliency with cartilage regions, suggesting a useful inductive bias for severity grading under noisy labels.
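
As a rough illustration of the dual-head setup in the points above, the sketch below sums a cross-entropy term for the coarse OA head and one for the fine KL head, both computed on logits that would come from a shared encoder. It is not the authors' implementation. The function names, the `kl_weight` balancing factor, and the helper that derives OA presence from a KL grade using the common radiographic convention of KL >= 2 are all assumptions made for illustration; the summary does not say how the paper weights the two losses or defines the coarse label.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class index."""
    return -math.log(softmax(logits)[target])


def oa_label_from_kl(kl_grade, threshold=2):
    """Hypothetical helper: coarse OA label from a KL grade (0-4).

    Assumes the common convention that KL >= 2 counts as OA;
    the paper's own definition may differ.
    """
    return 1 if kl_grade >= threshold else 0


def dual_head_loss(oa_logits, kl_logits, oa_label, kl_grade, kl_weight=1.0):
    """Sum of the coarse (OA) and fine (KL) cross-entropy terms.

    oa_logits: 2 logits from the OA head.
    kl_logits: 5 logits from the KL head (grades 0-4).
    kl_weight: assumed balancing factor, not taken from the paper.
    """
    coarse = cross_entropy(oa_logits, oa_label)
    fine = cross_entropy(kl_logits, kl_grade)
    return coarse + kl_weight * fine


# Example: one sample with KL grade 3, coarse label derived from it.
kl = 3
loss = dual_head_loss([0.2, 1.1], [0.0, 0.1, 0.3, 1.5, 0.2],
                      oa_label_from_kl(kl), kl)
```

Because both terms back-propagate into the same encoder in a real network, the coarse label constrains the features that the fine head sees, which is the sense in which the hierarchy acts as a representation-level prior.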

Abstract

Knee osteoarthritis (OA) assessment involves a natural but often underused label hierarchy: a coarse binary OA decision and a fine-grained Kellgren–Lawrence (KL) severity grade. Existing deep learning studies commonly treat these targets as separate classification problems, either reducing OA assessment to disease presence or directly optimizing noisy ordinal KL labels. In this work, we ask whether this clinical hierarchy can serve as a representation-level supervisory prior. Rather than introducing a complex architecture, we use a deliberately simple dual-head model with a shared encoder and two task-specific heads as a probe of hierarchical supervision. We compare single-OA, single-KL, and dual-head training across multiple 3D backbones under the same test protocol. Beyond standard classification metrics, we perform paired statistical comparisons, analyze latent severity-axis geometry, and examine saliency overlap with cartilage regions. The results show that dual-head supervision produces backbone-dependent gains, with clear improvements in KL-related metrics for selected backbones. More importantly, the gains are accompanied by a more ordered coarse-to-fine latent organization and, for responsive backbones, stronger anatomical alignment of saliency with cartilage. These findings suggest that even simple hierarchical dual-head supervision can reshape disease representations under noisy coarse/fine labels, providing a useful inductive bias for OA diagnosis and severity grading.
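
The abstract mentions saliency overlap with cartilage regions without giving the exact metric. One plausible way to quantify it, shown here only as a sketch, is the fraction of total saliency magnitude that falls inside a binary cartilage mask. The function name `saliency_mass_in_mask` and the choice of this particular ratio are assumptions; the paper may use a different overlap measure (for example, a thresholded Dice score).

```python
def saliency_mass_in_mask(saliency, mask):
    """Fraction of absolute saliency that lies inside the mask.

    saliency: flat list of per-voxel saliency values.
    mask: flat list of 0/1 flags marking cartilage voxels.
    Returns a value in [0, 1]; 0.0 if the saliency map is all zeros.
    """
    if len(saliency) != len(mask):
        raise ValueError("saliency and mask must have the same length")
    weights = [abs(s) for s in saliency]
    total = sum(weights)
    if total == 0:
        return 0.0
    inside = sum(w for w, m in zip(weights, mask) if m)
    return inside / total


# Toy example: 4 voxels, two of them labeled as cartilage.
score = saliency_mass_in_mask([1.0, 3.0, 0.0, 0.0], [0, 1, 1, 0])
```

Under this measure, a higher score for the dual-head model than for a single-task model on the same scans would correspond to the "stronger anatomical alignment" the abstract reports for responsive backbones.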