Semantic Sections: An Atlas-Native Feature Ontology for Obstructed Representation Spaces

arXiv cs.LG / 3/24/2026


Key Points

  • The paper argues that treating interpretability “features” as single global directions or shared latent coordinates can fail when representations are obstructed and locally coherent meanings do not globally assemble.
  • It introduces “semantic sections,” an atlas-native ontology: a transport-compatible family of local feature representatives defined over a context atlas, with formal properties that distinguish globalizable meanings from holonomy-obstructed (“twisted”) ones.
  • The authors prove that tree-supported propagation is pathwise realizable and identify cycle consistency as the key criterion for true globalization of local semantics.
  • They propose a discovery-and-certification pipeline using seeded propagation, synchronization on overlaps, defect-based pruning, cycle-aware taxonomy, and deduplication, and apply it to layer-16 atlases for Llama 3.2 3B Instruct, Qwen 2.5 3B Instruct, and Gemma 2 2B IT.
  • Empirically, semantic identity cannot be reliably recovered by raw global-vector similarity, while section-based identity recovery is perfect on certified supports, supporting semantic sections as a better feature ontology in obstructed regimes.
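
The cycle-consistency criterion in the points above can be made concrete with a toy sketch. It is not the paper's pipeline: it assumes three charts, uses 2D rotations as stand-ins for transport maps, and every name in it (`rotation`, `propagate`, `cycle_defect`, `classify`, the tolerance `tol`) is hypothetical. Propagation along a tree path always succeeds because no loop is ever closed. Only when the path returns to its starting chart can the transported representative disagree with the seed.

```python
import math

def rotation(theta):
    """2x2 rotation matrix as nested tuples; a toy stand-in for a transport map."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def apply(m, v):
    """Apply a 2x2 matrix to a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def propagate(seed, path, transport):
    """Transport a local representative along a sequence of chart indices."""
    v = seed
    for a, b in zip(path, path[1:]):
        v = apply(transport[(a, b)], v)
    return v

def cycle_defect(seed, cycle, transport):
    """Distance between the seed and its image after one trip around a closed path."""
    v = propagate(seed, cycle, transport)
    return math.hypot(v[0] - seed[0], v[1] - seed[1])

def classify(seed, cycle, transport, tol=1e-9):
    """Label a section by whether transport around the cycle returns the seed."""
    if cycle_defect(seed, cycle, transport) <= tol:
        return "globalizable"
    return "twisted"

# Assumed toy atlases: charts 0, 1, 2 joined by edges 0->1, 1->2, 2->0.
flat = {(0, 1): rotation(0.4), (1, 2): rotation(0.5), (2, 0): rotation(-0.9)}
twisted = {(0, 1): rotation(0.4), (1, 2): rotation(0.5), (2, 0): rotation(0.3)}

seed = (1.0, 0.0)
cycle = [0, 1, 2, 0]

print(classify(seed, cycle, flat))     # rotations compose to the identity
print(classify(seed, cycle, twisted))  # net rotation of 1.2 rad around the loop
```

In the `flat` atlas the three rotations cancel, so the seed returns to itself and the section extends consistently over the cycle. In the `twisted` atlas every edge is still a valid local transport, yet the loop leaves a net rotation, which is the toy analogue of a holonomy obstruction.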

Abstract

Recent interpretability work often treats a feature as a single global direction, dictionary atom, or latent coordinate shared across contexts. We argue that this ontology can fail in obstructed representation spaces, where locally coherent meanings need not assemble into one globally consistent feature. We introduce an atlas-native replacement object, the semantic section: a transport-compatible family of local feature representatives defined over a context atlas. We formalize semantic sections, prove that tree-supported propagation is always pathwise realizable, and show that cycle consistency is the key criterion for genuine globalization. This yields a distinction between tree-local, globalizable, and twisted sections, with twisted sections capturing locally coherent but holonomy-obstructed meanings. We then develop a discovery-and-certification pipeline based on seeded propagation, synchronization across overlaps, defect-based pruning, cycle-aware taxonomy, and deduplication. Across layer-16 atlases for Llama 3.2 3B Instruct, Qwen 2.5 3B Instruct, and Gemma 2 2B IT, we find nontrivial populations of semantic sections, including cycle-supported globalizable and twisted regimes after deduplication. Most importantly, semantic identity is not recovered by raw global-vector similarity. Even certified globalizable sections show low cross-chart signed cosine similarity, and raw similarity baselines recover only a small fraction of true within-section pairs, often collapsing at moderate thresholds. By contrast, section-based identity recovery is perfect on certified supports. These results support semantic sections as a better feature ontology in obstructed regimes.
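
The claim that raw global-vector similarity fails to recover semantic identity, even for globalizable sections, can be illustrated with a minimal sketch. The setup is assumed, not taken from the paper: two sections `A` and `B` live over three charts, transport between charts is a 2D rotation with made-up angles, and the helper names (`transport`, `raw_similarity`, `section_similarity`, `recovered_fraction`) and the 0.5 threshold are hypothetical. Raw similarity compares local representatives in their own chart coordinates, while the section-based comparison first transports one representative into the other's chart.

```python
import math
from itertools import combinations

def rotate(v, theta):
    """Rotate a 2D vector by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def cosine(u, v):
    """Signed cosine similarity of two 2D vectors."""
    return (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))

# Assumed transport angles from chart 0 to each chart (cycle-consistent by construction).
ANGLE_FROM_CHART0 = {0: 0.0, 1: 1.5, 2: 2.9}

def transport(v, src, dst):
    """Move a local representative from chart src to chart dst."""
    return rotate(v, ANGLE_FROM_CHART0[dst] - ANGLE_FROM_CHART0[src])

# Two toy sections, each given by a seed in chart 0 and transported to every chart.
seeds = {"A": (1.0, 0.0), "B": (0.0, 1.0)}
reps = {(name, chart): transport(seed, 0, chart)
        for name, seed in seeds.items()
        for chart in ANGLE_FROM_CHART0}

def raw_similarity(p, q):
    """Compare representatives directly, ignoring which chart each lives in."""
    return cosine(reps[p], reps[q])

def section_similarity(p, q):
    """Transport p's representative into q's chart before comparing."""
    moved = transport(reps[p], p[1], q[1])
    return cosine(moved, reps[q])

def recovered_fraction(similarity, threshold=0.5):
    """Fraction of true within-section pairs whose similarity clears the threshold."""
    true_pairs = [(p, q) for p, q in combinations(reps, 2) if p[0] == q[0]]
    hits = sum(similarity(p, q) >= threshold for p, q in true_pairs)
    return hits / len(true_pairs)

print(recovered_fraction(raw_similarity))      # no true pair clears the threshold
print(recovered_fraction(section_similarity))  # every true pair clears it
print(raw_similarity(("A", 1), ("B", 0)))      # a cross-section pair that looks near-identical
```

With these assumed angles, raw cosine misses all six within-section pairs and rates one cross-section pair as nearly identical, while the transport-aware comparison matches every true pair. The toy only shows the mechanism; the fractions reported in the paper come from its own atlases and certification procedure.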