HiCI: Hierarchical Construction-Integration for Long-Context Attention

arXiv cs.CL / 3/24/2026


Key Points

  • The paper introduces HiCI (Hierarchical Construction–Integration), a hierarchical attention module that explicitly builds segment-level representations, integrates them into a global context, and then conditions segment-level attention on both.
  • Experiments use parameter-efficient adaptation of LLaMA-2 with under 5.5% additional parameters, extending context length from 4K up to 100K tokens (7B) and 64K tokens (13B).
  • Across language modeling, retrieval, and instruction-following benchmarks, HiCI shows consistent gains over strong baselines, including competitive performance with proprietary models on topic retrieval.
  • The approach is described as adding an inductive bias that makes local-to-global information structuring explicit for long-context modeling, with improvements even over GPT-3.5-Turbo-16K on code comprehension.
  • Overall results suggest that explicit hierarchical structuring can be an effective architectural direction for long-context attention beyond raw token-level scalability.
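The construct-integrate-broadcast flow described above can be sketched as a toy PyTorch module. This is an illustrative reconstruction, not the paper's implementation: the class name, the mean-pool construction step, and the choice of standard multi-head attention for both integration and segment-level attention are all assumptions made for clarity.

```python
import torch
import torch.nn as nn

class HiCISketch(nn.Module):
    """Toy sketch of hierarchical construction-integration (illustrative only).

    Splits a long sequence into fixed-size segments, constructs one summary
    vector per segment, integrates the summaries into a global context via
    attention across segments, then conditions token-level attention within
    each segment on the broadcast global context.
    """

    def __init__(self, d_model: int, seg_len: int, n_heads: int = 4):
        super().__init__()
        self.seg_len = seg_len
        # Construction: project each segment's pooled tokens into a summary.
        self.construct = nn.Linear(d_model, d_model)
        # Integration: attention across segment summaries -> global context.
        self.integrate = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Segment-level attention over [segment tokens; global context].
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        assert t % self.seg_len == 0, "pad input to a multiple of seg_len"
        n_seg = t // self.seg_len
        segs = x.view(b, n_seg, self.seg_len, d)
        # 1) Construction: mean-pool + projection gives segment summaries.
        summaries = self.construct(segs.mean(dim=2))          # (b, n_seg, d)
        # 2) Integration: segments attend to one another -> global context,
        #    one integrated vector per segment position.
        global_ctx, _ = self.integrate(summaries, summaries, summaries)
        # 3) Broadcast: each token attends over its own segment plus that
        #    segment's integrated global-context vector.
        out = torch.empty_like(segs)
        for i in range(n_seg):
            kv = torch.cat([segs[:, i], global_ctx[:, i : i + 1]], dim=1)
            out[:, i], _ = self.local_attn(segs[:, i], kv, kv)
        return out.view(b, t, d)
```

In this sketch the segment loop is written naively for readability; a real long-context implementation would batch the per-segment attention and use a far more efficient kernel, but the data flow (construct, integrate, broadcast) matches the description above.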

Abstract

Long-context language modeling is commonly framed as a scalability challenge of token-level attention, yet local-to-global information structuring remains largely implicit in existing approaches. Drawing on cognitive theories of discourse comprehension, we propose HiCI (Hierarchical Construction–Integration), a hierarchical attention module that constructs segment-level representations, integrates them into a shared global context, and broadcasts both to condition segment-level attention. We validate HiCI through parameter-efficient adaptation of LLaMA-2 with only <5.5% additional parameters, extending context from 4K to 100K tokens (7B) and 64K tokens (13B). Across language modeling, retrieval, and instruction-following benchmarks, HiCI yields consistent improvements over strong baselines, including matching proprietary models on topic retrieval and surpassing GPT-3.5-Turbo-16K on code comprehension. These results demonstrate the effectiveness of explicit hierarchical structuring as an inductive bias for long-context modeling.