Measuring the Representational Alignment of Neural Systems in Superposition
arXiv cs.LG / 4/2/2026
Key Points
- The paper shows that common neural representational alignment metrics (e.g., RSA, CKA, and linear regression) can be systematically deflated when networks encode features in superposition rather than dedicating one neuron per feature.
- It argues the misalignment scores primarily reflect differences in the systems’ superposition (projection/mixing) matrices rather than differences in the underlying latent features.
- Under partial feature overlap, this bias can even invert expected rankings, making systems that share fewer features appear more aligned than systems that share more.
- The authors emphasize that superposition does not necessarily destroy information: when the underlying features are sparse, compressed sensing allows them to be recovered from the mixed activations.
- They conclude that accurate comparison of neural systems in superposition requires extracting and aligning the underlying features instead of comparing raw activation mixtures.
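The deflation effect described in the first three points can be illustrated with a small simulation. The sketch below is not the paper's experiment; it assumes a toy setup in which two systems encode the *same* sparse latent features through different random mixing matrices (superposition, with fewer neurons than features), and compares linear CKA on the raw activations versus on the shared features. The function name `linear_cka` and all dimensions are illustrative choices.

```python
import numpy as np

def linear_cka(X, Y):
    # Centered linear CKA: ||Xc^T Yc||_F^2 / (||Xc^T Xc||_F * ||Yc^T Yc||_F)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    num = np.linalg.norm(Xc.T @ Yc, "fro") ** 2
    den = np.linalg.norm(Xc.T @ Xc, "fro") * np.linalg.norm(Yc.T @ Yc, "fro")
    return num / den

rng = np.random.default_rng(0)
n, k, d = 2000, 50, 20                  # samples, latent features, neurons (d < k => superposition)
Z = rng.normal(size=(n, k)) * (rng.random((n, k)) < 0.1)  # shared sparse features (~10% active)
W1 = rng.normal(size=(k, d))            # system 1's mixing matrix
W2 = rng.normal(size=(k, d))            # system 2's mixing matrix (same features, different mix)

cka_raw = linear_cka(Z @ W1, Z @ W2)    # deflated, despite identical latent features
cka_feat = linear_cka(Z, Z)             # = 1.0 on the true shared features
print(cka_raw, cka_feat)
```

Both systems carry exactly the same feature information, yet CKA on the raw activation mixtures comes out well below 1: the score reflects the mismatch between `W1` and `W2`, not any difference in what is represented.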
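The compressed-sensing point, that sparse features survive superposition and can be extracted before comparison, can also be sketched. This is a generic L1-recovery demonstration (iterative soft-thresholding, ISTA), not the paper's method; the function name `ista`, the regularization strength, and all dimensions are illustrative assumptions.

```python
import numpy as np

def ista(x, W, lam=0.05, steps=500):
    """Recover a sparse latent vector z from mixed activations x = z @ W
    via L1-regularized least squares (iterative soft-thresholding)."""
    L = np.linalg.norm(W, 2) ** 2               # Lipschitz constant of the gradient
    z = np.zeros(W.shape[0])
    for _ in range(steps):
        grad = (z @ W - x) @ W.T                # gradient of 0.5 * ||z @ W - x||^2
        z = z - grad / L                        # gradient step
        z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return z

rng = np.random.default_rng(1)
k, d = 50, 20                                   # 50 latent features, 20 neurons
W = rng.normal(size=(k, d)) / np.sqrt(d)        # mixing (superposition) matrix
z_true = np.zeros(k)
z_true[rng.choice(k, size=3, replace=False)] = rng.normal(size=3)  # 3-sparse features
x = z_true @ W                                  # observed mixed activations
z_hat = ista(x, W)                              # approximate sparse recovery
```

Even though `x` has fewer dimensions than there are features, the sparse latent vector is recovered almost exactly, which is what makes "extract the features first, then align" feasible in principle.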