Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples

arXiv cs.LG / April 27, 2026


Key Points

  • Neuron labeling methods that rely on highly activating examples can produce overly broad or misleading descriptions because they may latch onto incidental dominant visual factors.
  • The paper extends contrastive explanation ideas to neuron-level labeling by generating candidate labels with vision-language models (VLMs) using contrastive image sets and then assigning labels via a CLIP-like pipeline.
  • It proposes Contrastive Semantic Projection (CSP), integrating contrastive examples into the SemanticLens-style scoring and selection process to improve how labels are chosen.
  • Experiments—including a melanoma detection case study—show that contrastive labeling increases both faithfulness to the neuron’s true semantics and semantic granularity compared with state-of-the-art approaches.
  • The authors argue that contrastive examples are a simple, underused ingredient that can materially strengthen interpretability and analysis pipelines for deep networks.

Abstract

Neuron labeling assigns textual descriptions to internal units of deep networks. Existing approaches typically rely on highly activating examples, often yielding broad or misleading labels by focusing on dominant but incidental visual factors. Prior work such as FALCON introduced contrastive examples -- inputs that are semantically similar to activating examples but elicit low activations -- to sharpen explanations, but it primarily addresses subspace-level interpretability rather than scalable neuron-level labeling. We revisit contrastive explanations for neuron-level labeling in two stages: (1) candidate label generation with vision-language models (VLMs) and (2) label assignment with CLIP-like encoders. First, we show that providing contrastive image sets to VLMs yields candidate labels that are more specific and more faithful. Second, we introduce Contrastive Semantic Projection (CSP), an extension of SemanticLens that incorporates contrastive examples directly into its CLIP-based scoring and selection pipeline. Across extensive experiments and a case study on melanoma detection, contrastive labeling improves both faithfulness and semantic granularity over state-of-the-art baselines. Our results demonstrate that contrastive examples are a simple yet powerful and currently underutilized component of neuron labeling and analysis pipelines.
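To make the second stage concrete, here is a minimal sketch of contrastive label scoring with CLIP-like embeddings. The intuition: a faithful label should match the neuron's activating images well but match the contrastive images (semantically similar inputs that do not activate the neuron) poorly. All function names, example labels, and toy embedding vectors below are illustrative assumptions, not the paper's actual implementation; in practice the embeddings would come from a CLIP-style text and image encoder.

```python
import numpy as np

def normalize(x):
    """L2-normalize rows so that dot products equal cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def contrastive_label_score(label_emb, act_embs, con_embs):
    """Score a candidate label: mean similarity to activating images
    minus mean similarity to contrastive images (all rows L2-normalized)."""
    return float((act_embs @ label_emb).mean() - (con_embs @ label_emb).mean())

def pick_label(candidate_embs, labels, act_embs, con_embs):
    """Assign the candidate label with the highest contrastive score."""
    scores = [contrastive_label_score(e, act_embs, con_embs)
              for e in candidate_embs]
    return labels[int(np.argmax(scores))], scores

# Toy 3-d stand-ins for CLIP embeddings (hypothetical, for illustration only).
labels = ["lesion border irregularity", "skin texture"]
candidate_embs = normalize(np.array([[1.0, 0.1, 0.0],
                                     [0.1, 1.0, 0.0]]))
act_embs = normalize(np.array([[0.9, 0.2, 0.1],    # images the neuron fires on
                               [1.0, 0.0, 0.2]]))
con_embs = normalize(np.array([[0.2, 1.0, 0.0],    # similar images, low activation
                               [0.1, 0.9, 0.3]]))

best, scores = pick_label(candidate_embs, labels, act_embs, con_embs)
print(best)  # → lesion border irregularity
```

A plain activation-based scorer would use only the activating set; subtracting similarity to the contrastive set is what penalizes overly broad labels that describe both image sets equally well.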