AI Navigate

CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring

arXiv cs.AI / March 20, 2026

📰 News · Models & Research

Key Points

  • The paper attributes the inconsistency of OOD detection across architectures and datasets to a shared limitation: logit-based methods rely only on classifier confidence, while feature-based methods entangle confidence with training-distribution membership in the full feature space.
  • They show that penultimate features decompose into two orthogonal subspaces: a classifier-aligned subspace (confidence) and a residual subspace (what the classifier discards), with the residual carrying a class-specific in-distribution membership signal invisible to logit-based methods.
  • CORE (COnfidence + REsidual) scores the two subspaces independently and combines them via normalized summation, leveraging their orthogonality to achieve more robust detection.
  • Across five architectures and five benchmark configurations, CORE achieves competitive or state-of-the-art results, ranking first in three of five settings and delivering the highest grand average AUROC with negligible computational overhead.
  • The approach improves reliability of OOD detection in real deployments by mitigating architecture- and dataset-specific failure modes without substantial extra cost.

Abstract

Out-of-distribution (OOD) detection is essential for deploying deep learning models reliably, yet no single method performs consistently across architectures and datasets -- a scorer that leads on one benchmark often falters on another. We attribute this inconsistency to a shared structural limitation: logit-based methods see only the classifier's confidence signal, while feature-based methods attempt to measure membership in the training distribution but do so in the full feature space where confidence and membership are entangled, inheriting architecture-sensitive failure modes. We observe that penultimate features naturally decompose into two orthogonal subspaces: a classifier-aligned component encoding confidence, and a residual the classifier discards. We discover that this residual carries a class-specific directional signature for in-distribution data -- a membership signal invisible to logit-based methods and entangled with noise in feature-based methods. We propose CORE (COnfidence + REsidual), which disentangles the two signals by scoring each subspace independently and combines them via normalized summation. Because the two signals are orthogonal by construction, their failure modes are approximately independent, producing robust detection where either view alone is unreliable. CORE achieves competitive or state-of-the-art performance across five architectures and five benchmark configurations, ranking first in three of five settings and achieving the highest grand average AUROC with negligible computational overhead.
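The abstract specifies the structure of CORE (orthogonal decomposition of penultimate features, one score per subspace, normalized summation) but not the exact branch scorers. The NumPy sketch below therefore makes labeled assumptions: the energy score as the confidence branch, maximum cosine similarity to per-class mean residual directions as the residual branch, and z-score normalization computed on an in-distribution calibration split. The helper names (`fit_core`, `core_score`) and the toy data are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def _branch_scores(z, W, P, class_dirs):
    """Score the classifier-aligned and residual subspaces separately."""
    # Confidence branch: energy score on the logits (an assumed logit-based choice;
    # the residual lies in W's null space, so it contributes nothing to the logits).
    logits = z @ W.T
    m = logits.max(axis=1, keepdims=True)
    conf = (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()
    # Residual branch: best cosine match to a class-specific residual direction,
    # probing the class-specific directional signature the abstract describes.
    resid = z - z @ P
    rn = resid / (np.linalg.norm(resid, axis=1, keepdims=True) + 1e-12)
    res = (rn @ class_dirs.T).max(axis=1)
    return conf, res

def fit_core(id_feats, id_labels, W):
    """Estimate per-class residual directions and calibration stats from ID data."""
    P = np.linalg.pinv(W) @ W                      # projector onto W's row space
    resid = id_feats - id_feats @ P                # component the classifier discards
    dirs = np.stack([resid[id_labels == c].mean(0) for c in range(W.shape[0])])
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True) + 1e-12
    conf, res = _branch_scores(id_feats, W, P, dirs)
    stats = (conf.mean(), conf.std() + 1e-12, res.mean(), res.std() + 1e-12)
    return P, dirs, stats

def core_score(z, W, P, class_dirs, stats):
    """Normalized summation of both branches (higher = more in-distribution)."""
    conf, res = _branch_scores(z, W, P, class_dirs)
    mc, sc, mr, sr = stats
    return (conf - mc) / sc + (res - mr) / sr

# Toy demo: 3 classes, 8-dim features; ID residuals carry a class signature.
C, d, n = 3, 8, 200
W = rng.normal(size=(C, d))
null_basis = np.linalg.svd(W)[2][C:]               # orthonormal basis of W's null space
labels = rng.integers(0, C, size=n)
id_feats = (4 * W[labels] / np.linalg.norm(W, axis=1)[labels, None]
            + 3 * null_basis[labels]               # class-specific residual direction
            + 0.1 * rng.normal(size=(n, d)))
ood_feats = rng.normal(size=(n, d))                # generic OOD: no such structure

P, dirs, stats = fit_core(id_feats, labels, W)
id_scores = core_score(id_feats, W, P, dirs, stats)
ood_scores = core_score(ood_feats, W, P, dirs, stats)
```

Because the residual sits in the null space of `W`, the two branch scores draw on disjoint components of the feature vector, which is the orthogonality the paper credits for the approximately independent failure modes.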