Silhouette Loss: Differentiable Global Structure Learning for Deep Representations

arXiv cs.AI · April 13, 2026


Key Points

  • The paper proposes Soft Silhouette Loss, a differentiable objective inspired by the classical silhouette coefficient, to improve embedding geometry by encouraging intra-class compactness and inter-class separation beyond standard cross-entropy (CE).
  • Unlike pairwise or proxy-based metric learning methods, Soft Silhouette Loss compares each sample against all classes in the batch to capture a batch-level notion of global structure while staying lightweight.
  • Soft Silhouette Loss can be combined directly with CE and is also compatible with supervised contrastive learning (SupCon), enabling a hybrid loss that optimizes both local pairwise consistency and global cluster structure.
  • Experiments across seven datasets show consistent gains: CE + Soft Silhouette Loss outperforms CE and other metric-learning baselines, the hybrid approach beats SupCon alone, and the best configuration raises average top-1 accuracy from 36.71% (CE) and 37.85% (SupCon2) to 39.08%, with substantially lower computational overhead than more complex metric-learning setups.
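To make the batch-level silhouette idea concrete, here is a minimal NumPy sketch. It is an illustrative reconstruction, not the paper's exact formulation: it computes the classical silhouette terms per sample (mean intra-class distance a(i), distance to competing classes b(i)) and assumes the "soft" part replaces the hard min over competing classes with a temperature-weighted soft-min so the objective is differentiable; the temperature `tau` and the Euclidean metric are assumptions.

```python
import numpy as np

def soft_silhouette_loss(embeddings, labels, tau=1.0):
    """Silhouette-style batch loss (illustrative sketch, not the paper's exact loss).

    For each sample i:
      a(i) = mean distance to other same-class samples in the batch
      b(i) = soft-min over mean distances to each competing class
             (the soft-min makes the classical hard min differentiable)
      s(i) = (b(i) - a(i)) / max(a(i), b(i))        # silhouette in [-1, 1]
    Returns 1 - mean(s), so minimizing pushes silhouettes toward 1.
    The batch must contain at least two classes.
    """
    labels = np.asarray(labels)
    n = len(labels)
    # Pairwise Euclidean distances within the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    classes = np.unique(labels)
    sil = np.empty(n)
    for i in range(n):
        same = labels == labels[i]
        same[i] = False  # exclude the sample itself from a(i)
        a = dist[i, same].mean() if same.any() else 0.0
        # Mean distance from i to each competing class in the batch.
        b_per_class = np.array([dist[i, labels == c].mean()
                                for c in classes if c != labels[i]])
        # Soft-min: temperature-scaled weights favor the nearest competing class.
        w = np.exp(-b_per_class / tau)
        b = float(w @ b_per_class / w.sum())
        sil[i] = (b - a) / max(a, b, 1e-12)
    return 1.0 - sil.mean()
```

On a batch with tight, well-separated classes the loss approaches 0; on overlapping classes it approaches 1, which matches the intent of encouraging intra-class compactness and inter-class separation in a single batch-level term.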

Abstract

Learning discriminative representations is a central goal of supervised deep learning. While cross-entropy (CE) remains the dominant objective for classification, it does not explicitly enforce desirable geometric properties in the embedding space, such as intra-class compactness and inter-class separation. Existing metric learning approaches, including supervised contrastive learning (SupCon) and proxy-based methods, address this limitation by operating on pairwise or proxy-based relationships, but often increase computational cost and complexity. In this work, we introduce Soft Silhouette Loss, a novel differentiable objective inspired by the classical silhouette coefficient from clustering analysis. Unlike pairwise objectives, our formulation evaluates each sample against all classes in the batch, providing a batch-level notion of global structure. The proposed loss directly encourages samples to be closer to their own class than to competing classes, while remaining lightweight. Soft Silhouette Loss can be seamlessly combined with cross-entropy, and is also complementary to supervised contrastive learning. We propose a hybrid objective that integrates them, jointly optimizing local pairwise consistency and global cluster structure. Extensive experiments on seven diverse datasets demonstrate that: (i) augmenting CE with Soft Silhouette Loss consistently improves over CE and other metric learning baselines; (ii) the hybrid formulation outperforms SupCon alone; and (iii) the combined method achieves the best performance, improving average top-1 accuracy from 36.71% (CE) and 37.85% (SupCon2) to 39.08%, while incurring substantially lower computational overhead. These results suggest that classical clustering principles can be reinterpreted as differentiable objectives for deep learning, enabling efficient optimization of both local and global structure in representation spaces.
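The hybrid objective described in the abstract combines cross-entropy with a geometry-shaping term. A minimal sketch of such a weighted combination is below; the weight `lam`, the function names, and the exact division of labor between terms are illustrative assumptions, not the paper's recipe (the paper may weight or schedule the terms differently).

```python
import numpy as np

def cross_entropy(logits, labels):
    # Numerically stable softmax cross-entropy, averaged over the batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def hybrid_objective(logits, embeddings, labels, aux_loss_fn, lam=0.1):
    # CE drives classification; the auxiliary term (e.g. a soft silhouette
    # or SupCon loss passed in as aux_loss_fn) shapes the embedding space.
    # lam is an illustrative weight, not a value from the paper.
    return cross_entropy(logits, labels) + lam * aux_loss_fn(embeddings, labels)
```

Keeping the auxiliary term pluggable mirrors the paper's claim that Soft Silhouette Loss is complementary both to CE and to SupCon: the same training loop can swap or stack geometry terms without changing the classification head.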