Beyond Activation Alignment: The Geometry of Neural Sensitivity

arXiv cs.LG / 5/6/2026

Key Points

  • The paper shows that commonly used activation-alignment metrics (RSA, CCA, CKA) may miss differences in how networks use local stimulus evidence, because global agreement between linear readouts does not imply similarity in sensitivity to small perturbations.
  • It introduces a complementary framework that summarizes neural representations via local decodable information, using Fisher information and local representation geometry to characterize expected discriminability for perturbations within a chosen stimulus-coordinate subspace.
  • Each representation is summarized by the expected projected pullback/Fisher metric over that subspace. This operator induces a “second-moment” family of local discrimination tasks, for which it is a minimal, complete dataset-level summary of expected discriminability.
  • It compares the regularized summaries using a log-spectral distance on the manifold of symmetric positive definite (SPD) matrices, yielding the Spectral Riemannian Alignment Score (S-RAS) and a uniform multiplicative certificate over the corresponding family of lifted task values.
  • Experiments demonstrate that the framework can match corresponding layers across independently trained neural networks, enable transferable class-conditional probing, distinguish standard vs. robust training behaviors, and detect stimulus-coordinate family effects in mouse visual cortex data.
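
The quantities named in the points above can be written down in one standard form. This is a sketch under stated assumptions, not the paper's own notation: the representation f, its Jacobian J_f, the orthonormal subspace basis U, the isotropic Gaussian noise variance σ², the ridge weight λ, and the use of the affine-invariant distance are all assumptions made for illustration, and the paper's exact definitions and regularization may differ.

```latex
% Assumed setup: f : R^d -> R^m differentiable, U in R^{d x k} with U^T U = I_k,
% observations f(x + U\delta) + noise, noise ~ N(0, \sigma^2 I_m).
\begin{align}
  F(x) &= \frac{1}{\sigma^{2}}\, U^{\top} J_f(x)^{\top} J_f(x)\, U
       && \text{(Fisher information about } \delta \text{ at } x\text{)} \\
  G &= \mathbb{E}_{x}\!\left[ F(x) \right] + \lambda I_k
       && \text{(regularized dataset-level summary, SPD)} \\
  d(G_A, G_B) &= \left\lVert \log\!\left( G_A^{-1/2}\, G_B\, G_A^{-1/2} \right) \right\rVert_F
       = \Bigl( \sum_{i=1}^{k} \log^{2} \mu_i \Bigr)^{1/2}
       && \text{(} \mu_i \text{: eigenvalues of } G_A^{-1} G_B \text{)}
\end{align}
% A multiplicative bound follows from the log-spectrum:
% if \max_i |\log \mu_i| \le \varepsilon, then for every v in R^k
\begin{equation}
  e^{-\varepsilon}\, v^{\top} G_A v \;\le\; v^{\top} G_B v \;\le\; e^{\varepsilon}\, v^{\top} G_A v .
\end{equation}
```

Under these assumptions, the last inequality shows why a small log-spectral gap bounds every quadratic form of one summary by the other up to a multiplicative factor, which is the general shape of guarantee the summary describes as a certificate.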

Abstract

Activation-alignment measures such as Representational Similarity Analysis (RSA), Canonical Correlation Analysis (CCA), and Centered Kernel Alignment (CKA) are widely used to compare biological and artificial neural representations. Recent theoretical work interprets many of these methods as assessing agreement between optimal linear readouts over broad families of global tasks. However, agreement at the level of global readouts does not determine how a system uses local stimulus evidence. Specifically, representations may align in activation space yet differ in their sensitivity to small perturbations. To address this challenge, we introduce a complementary framework based on local decodable information, which focuses on a representation's ability, under noise, to discriminate small perturbations within a specified stimulus-coordinate subspace. Building on Fisher information and local representation geometry, we summarize each representation using the expected projected pullback/Fisher metric over that subspace. This formulation induces a second-moment family of local discrimination tasks, for which the resulting operator provides a minimal, complete dataset-level summary of expected discriminability. We compare these regularized signatures using a log-spectral distance on the manifold of symmetric positive definite (SPD) matrices, yielding the Spectral Riemannian Alignment Score (S-RAS) and a uniform multiplicative certificate over the corresponding family of lifted task values. Empirically, this framework enables the recovery of corresponding layers across independently trained artificial neural networks, supports transferable class-conditional probes, reveals controlled dissociations between standard and robust training, and uncovers stimulus-coordinate family effects across mouse visual cortex using the Allen Brain Observatory static gratings dataset.
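
To make the pipeline described in the abstract concrete, the following NumPy sketch builds a regularized expected projected metric from per-stimulus Jacobians and compares two such summaries with a log-spectral distance. It is an illustration under assumptions, not the authors' implementation: the function names, the isotropic Gaussian noise model, the ridge regularizer, and the choice of the affine-invariant SPD distance are all hypothetical choices made here, and the Jacobians are random stand-ins for a real network.

```python
import numpy as np

# Sketch only. Assumptions (not taken from the paper): isotropic Gaussian
# noise with variance noise_var, a ridge term for regularization, and the
# affine-invariant distance as the "log-spectral" distance. All names are
# hypothetical.

def expected_projected_metric(jacobians, U, noise_var=1.0, ridge=1e-3):
    """Average U^T J^T J U / noise_var over stimuli, then add ridge * I."""
    k = U.shape[1]
    G = np.zeros((k, k))
    for J in jacobians:
        JU = J @ U                      # Jacobian restricted to the subspace
        G += JU.T @ JU / noise_var
    G /= len(jacobians)
    return G + ridge * np.eye(k)        # ridge keeps G strictly SPD

def log_spectrum(A, B):
    """Logs of the eigenvalues of A^{-1/2} B A^{-1/2} for SPD A and B."""
    w, V = np.linalg.eigh(A)
    A_inv_sqrt = (V / np.sqrt(w)) @ V.T  # V diag(w^{-1/2}) V^T
    M = A_inv_sqrt @ B @ A_inv_sqrt
    M = (M + M.T) / 2.0                  # symmetrize against rounding error
    return np.log(np.linalg.eigvalsh(M))

def log_spectral_distance(A, B):
    """Euclidean norm of the log-spectrum (affine-invariant SPD distance)."""
    return float(np.linalg.norm(log_spectrum(A, B)))

# Toy data: random Jacobians standing in for three "networks".
rng = np.random.default_rng(0)
d, m, k, n = 6, 10, 3, 50
U, _ = np.linalg.qr(rng.standard_normal((d, k)))   # orthonormal basis, shape (d, k)
jac_A = [rng.standard_normal((m, d)) for _ in range(n)]
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))   # rotation of activation space
jac_B = [Q @ J for J in jac_A]                     # same local sensitivity as A
jac_C = [2.0 * J for J in jac_A]                   # twice as sensitive as A

G_A = expected_projected_metric(jac_A, U)
G_B = expected_projected_metric(jac_B, U)
G_C = expected_projected_metric(jac_C, U)

print("d(A, B):", log_spectral_distance(G_A, G_B))  # zero up to rounding
print("d(A, C):", log_spectral_distance(G_A, G_C))  # clearly positive
```

In this toy setup, rotating the activation space leaves the summary unchanged, while rescaling the Jacobians changes it. That mirrors the abstract's point that the comparison targets sensitivity to small perturbations, not the activation coordinates themselves.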