Structural Sensitivity in Compressed Transformers: Error Propagation, Lyapunov Stability, and Formally Verified Bounds
arXiv cs.LG / 2026-03-24
Key points
- The study finds extreme structural sensitivity in compressed transformers: compressing a single weight matrix in GPT-2 Small can increase perplexity by about 20,000x, with per-matrix sensitivity spanning roughly five orders of magnitude.
- Across five transformer architectures (117M to 8B parameters), the authors identify a consistent hierarchy of compression fragility: early-layer MLP up-projection matrices are catastrophically sensitive, while value projections can compress with minimal performance loss.
- Using Lyapunov stability theory, the paper argues that residual connections contract compression-induced errors in relative terms: the residual stream's hidden-state norm grows faster than the injected error, providing a theoretical mechanism for partial compression tolerance.
- The authors also show that error contraction alone does not predict degradation: architecture-specific redundancy matters too, as illustrated by a hybrid model that degrades far less than expected despite higher measured error amplification.
- The work includes ten machine-checked Lean 4 theorems that formally bound per-matrix error propagation with no unproven steps, plus empirical validation via a per-matrix robustness score (the Compression Fragility Index) and downstream task benchmarks.
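The Lyapunov-style argument about residual streams can be made concrete with a hedged sketch (notation mine, not taken from the paper): if the hidden-state norm grows at a faster rate than the compression-induced error, the error shrinks relative to the signal even though it grows in absolute terms.

```latex
% Sketch under assumed notation: h_\ell is the residual-stream state at
% layer \ell, e_\ell the compression-induced error, and the per-layer
% growth rates satisfy L_\ell < g_\ell.
h_{\ell+1} = h_\ell + f_\ell(h_\ell), \qquad
\|e_{\ell+1}\| \le (1 + L_\ell)\,\|e_\ell\|, \qquad
\|h_{\ell+1}\| \ge (1 + g_\ell)\,\|h_\ell\|.
% Then the relative error contracts geometrically:
\frac{\|e_{\ell+1}\|}{\|h_{\ell+1}\|}
  \le \frac{1 + L_\ell}{1 + g_\ell} \cdot \frac{\|e_\ell\|}{\|h_\ell\|}
  < \frac{\|e_\ell\|}{\|h_\ell\|}.
```

This also suggests why contraction alone cannot be the whole story, as the hybrid-model result above indicates: the bound only controls relative error growth, while the mapping from relative error to task degradation depends on architecture-specific redundancy.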
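The per-matrix sensitivity measurements above can be sketched in a few lines. This is a minimal illustration, not the paper's method: it assumes truncated SVD as the compression operator and uses the relative output error of a single matrix, averaged over random probe inputs, as a stand-in for a fragility score; the paper's actual Compression Fragility Index is likely defined differently.

```python
import numpy as np

def truncate_svd(W, rank):
    """Rank-`rank` approximation of W via truncated SVD (the assumed
    compression operator in this sketch)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def fragility(W, X, rank):
    """Hypothetical per-matrix fragility score: relative output error
    after replacing W with its rank-`rank` approximation, measured
    over a batch of probe inputs X (columns)."""
    Y = W @ X
    Y_hat = truncate_svd(W, rank) @ X
    return np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))   # stand-in for one transformer weight matrix
X = rng.normal(size=(64, 16))   # random probe activations

lossless = fragility(W, X, rank=64)  # full rank: error near machine epsilon
lossy = fragility(W, X, rank=16)     # aggressive compression: large error
```

Ranking every weight matrix in a model by such a score, and then measuring downstream perplexity after compressing each one in isolation, is one way the reported sensitivity hierarchy (fragile up-projections vs. robust value projections) could be probed empirically.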

