Linear Representations of Hierarchical Concepts in Language Models

arXiv cs.CL / 4/10/2026


Key Points

  • The paper studies whether language models internally encode hierarchical relations between concepts (e.g., Japan ⊂ Eastern Asia ⊂ Asia) and how this encoding manifests in their representations.
  • It extends “Linear Relational Concepts” by training a linear transformation for each hierarchical depth and semantic domain, then comparing these transformations to characterize representational differences tied to hierarchy (see the sketch after this list).
  • Experiments show that, within a given domain, hierarchical relations can be linearly recovered from model representations, including for multi-token entities and across layers.
  • The analysis finds that hierarchy information resides in a relatively low-dimensional subspace, which is often domain-specific, yet the learned hierarchy representation is highly similar across those domain-specific subspaces.
  • Overall, the authors argue that concept hierarchies in the tested models are captured via highly interpretable linear representations, with results supported by both in-domain generalization and cross-domain transfer evaluations.
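
The summary above does not spell out the training objective, so the following is a minimal sketch of one plausible variant: fitting an affine map from child-entity hidden states to parent-category hidden states via ridge-regularized least squares, one map per (domain, depth) pair. The function name `fit_hierarchy_map` and the regularization choice are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch: fit one affine map per (domain, depth) pair that sends
# child-entity hidden states to parent-category hidden states. Ridge-regularized
# least squares is an assumption here; the paper's exact objective may differ.
import numpy as np

def fit_hierarchy_map(child_reps: np.ndarray, parent_reps: np.ndarray, reg: float = 1e-3):
    """Fit parent ≈ W @ child + b for a single (domain, depth) cell.

    child_reps:  (n_pairs, d) hidden states of child entities (e.g., "Japan").
    parent_reps: (n_pairs, d) hidden states of their parents (e.g., "Eastern Asia").
    Returns W of shape (d, d) and b of shape (d,).
    """
    n, d = child_reps.shape
    # Append a constant feature so the bias b is learned jointly with W.
    X = np.hstack([child_reps, np.ones((n, 1))])            # (n, d + 1)
    # Solve the ridge normal equations: (X^T X + reg * I) A = X^T Y.
    A = np.linalg.solve(X.T @ X + reg * np.eye(d + 1), X.T @ parent_reps)
    return A[:-1].T, A[-1]                                  # W: (d, d), b: (d,)
```

Comparing the learned maps across (domain, depth) cells is what then lets the authors separate differences tied to hierarchy from differences tied to domain content.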

Abstract

We investigate how and to what extent hierarchical relations (e.g., Japan ⊂ Eastern Asia ⊂ Asia) are encoded in the internal representations of language models. Building on Linear Relational Concepts, we train linear transformations specific to each hierarchical depth and semantic domain, and characterize representational differences associated with hierarchical relations by comparing these transformations. Going beyond prior work on the representational geometry of hierarchies in LMs, our analysis covers multi-token entities and cross-layer representations. Across multiple domains we learn such transformations and evaluate in-domain generalization to unseen data and cross-domain transfer. Experiments show that, within a domain, hierarchical relations can be linearly recovered from model representations. We then analyze how hierarchical information is encoded in representation space. We find that it is encoded in a relatively low-dimensional subspace and that this subspace tends to be domain-specific. Our main result is that hierarchy representation is highly similar across these domain-specific subspaces. Overall, we find that all models considered in our experiments encode concept hierarchies in the form of highly interpretable linear representations.
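
To make the abstract's subspace claims concrete, here is a hedged sketch of one way to probe them: measure how quickly the singular values of a learned map's deviation from identity decay (a low-dimensional hierarchy subspace), and compare two domains' subspaces via principal angles (cross-domain similarity). Using W − I rather than W, and the rank cutoff k, are illustrative assumptions rather than the paper's stated procedure.

```python
# Hypothetical sketch of the subspace analysis: (i) rapid singular-value decay
# of W - I would indicate a low-dimensional hierarchy subspace, and (ii) small
# principal angles between two domains' subspaces would indicate the maps
# encode hierarchy similarly. Both W - I and the cutoff k are assumptions.
import numpy as np

def top_subspace(W: np.ndarray, k: int) -> np.ndarray:
    """Orthonormal basis (d, k) spanning the top-k left singular directions
    of W - I, i.e., the directions where the map deviates from identity."""
    U, _, _ = np.linalg.svd(W - np.eye(W.shape[0]))
    return U[:, :k]

def principal_angles(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Principal angles (radians) between the subspaces spanned by the
    orthonormal columns of A and B; the singular values of A^T B are
    the cosines of those angles."""
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Example (hypothetical): W_geo and W_bio are maps fit in two different domains.
# angles = principal_angles(top_subspace(W_geo, k=16), top_subspace(W_bio, k=16))
# Angles near zero would mirror the paper's cross-domain similarity finding.
```

Principal angles near zero between, say, a geography map's and a biology map's top singular subspaces would correspond to the paper's main result that the hierarchy representation is highly similar across domain-specific subspaces.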