H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models

arXiv cs.CL / 5/5/2026


Key Points

  • The paper introduces H-probes, a set of linear probe methods designed to extract hierarchical information—such as depth and pairwise distances—from language model latent representations.
  • Experiments on synthetic tree-traversal tasks show that H-probes can reliably identify the subspaces that encode the hierarchical structure needed to solve the tasks.
  • Ablation studies indicate that the hierarchy-containing subspaces are low-dimensional and causally important for strong task performance, with some generalization both in-domain and out-of-domain.
  • The authors also find weaker but analogous hierarchical structure in real-world hierarchical reasoning settings, including mathematical reasoning traces, suggesting hierarchy is represented beyond surface syntax.
  • Overall, the results suggest that language models encode hierarchical structure at deeper levels of abstraction, potentially including aspects of the reasoning process itself.

Abstract

Representing and navigating hierarchy is a fundamental primitive of reasoning. Large language models have demonstrated proficiency in a wide variety of tasks requiring hierarchical reasoning, but there is limited analysis of how the models geometrically represent the latent constructions necessary for such thinking. To this end, we develop H-probes, a collection of linear probes that extract hierarchical structure, specifically depth and pairwise distance, from latent representations. In synthetic tree-traversal tasks, the H-probes robustly find the subspaces containing the hierarchical structure needed to complete the tasks; in comprehensive ablation experiments, we further show that these hierarchy-containing subspaces are low-dimensional, causally important for high task performance, and generalize both within- and out-of-domain. We also find analogous, though weaker, hierarchical structure in real-world hierarchical contexts such as mathematical reasoning traces. These results demonstrate that models represent hierarchy not only at the level of syntax and concepts, but at deeper levels of abstraction, including the reasoning process itself.
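To make the probing idea concrete, here is a minimal, self-contained sketch of what a depth-style linear probe looks like in general. This is an illustration under assumed data, not the paper's code: the hidden states, the encoding direction `w_true`, and the ridge penalty are all hypothetical, chosen so that depth is linearly decodable from a low-dimensional subspace, as the paper reports.

```python
import numpy as np

# Illustrative sketch (not the paper's implementation).
# Assume each token has a hidden state h in R^d and a scalar "depth" label
# (its depth in a tree). A depth probe is a linear map fit to recover
# depth from the hidden states.

rng = np.random.default_rng(0)
d, n = 64, 500                      # hidden size, number of tokens

# Simulate hidden states where depth lives along one fixed direction
# w_true (a 1-D "hierarchy subspace"), plus small isotropic noise.
w_true = rng.normal(size=d)
w_true /= np.linalg.norm(w_true)
depth = rng.integers(0, 8, size=n).astype(float)
H = np.outer(depth, w_true) + 0.1 * rng.normal(size=(n, d))

# Fit a ridge-regularized linear probe: depth ≈ H @ w
lam = 1e-3
w = np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ depth)

pred = H @ w
r2 = 1 - np.sum((depth - pred) ** 2) / np.sum((depth - depth.mean()) ** 2)
print(f"probe R^2: {r2:.3f}")       # near 1.0 when depth is linearly decodable
```

In this toy setup the probe's weight vector recovers `w_true` up to scale, which mirrors the paper's finding that the relevant structure concentrates in a low-dimensional subspace; a pairwise-distance probe would be fit analogously on pairs of hidden states.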