Structural Sensitivity in Compressed Transformers: Error Propagation, Lyapunov Stability, and Formally Verified Bounds
arXiv cs.LG / 3/24/2026
Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The study finds extreme structural sensitivity in compressed transformers: compressing a single weight matrix in GPT-2 Small can increase perplexity by about 20,000x, so the sensitivity of individual matrices spans roughly five orders of magnitude.
- Across five transformer architectures (117M to 8B parameters), the authors identify a consistent hierarchy of compression fragility: early-layer MLP up-projection matrices are catastrophically sensitive, while value projections can be compressed with minimal performance loss (a per-matrix sensitivity scan of this kind is sketched after this list).
- Using Lyapunov stability theory, the paper argues that residual connections help contract compression-induced errors: the hidden state grows faster than the injected error, so the relative error shrinks from layer to layer, providing a theoretical mechanism for partial compression tolerance (the contraction condition is sketched after this list).
- The authors show error contraction alone is insufficient: architecture-specific redundancy also matters, illustrated by a hybrid model whose degradation is far smaller than expected despite higher measured error amplification.
- The work includes ten machine-checked Lean 4 theorems that formally bound per-matrix error propagation with no unproven steps, plus empirical validation via a Compression Fragility Index and downstream task benchmarks.
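
The per-matrix fragility findings come from compressing one matrix at a time and measuring the change in perplexity. Below is a minimal sketch of such a scan for GPT-2 Small using Hugging Face transformers, assuming truncated-SVD compression; the rank, evaluation text, and matrix names are illustrative choices, not the paper's exact protocol (note that in GPT-2 the value projection lives inside the fused c_attn matrix).

```python
# Minimal sketch of a per-matrix sensitivity scan; not the paper's exact protocol.
# Assumes GPT-2 Small via Hugging Face transformers; rank, sample text, and the
# choice of SVD truncation as the compression method are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

def perplexity(model, tokenizer, text, device="cpu"):
    # Average token-level cross-entropy on the text, exponentiated.
    enc = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

def truncate_rank(weight, rank):
    # Low-rank compression of a single weight matrix via truncated SVD.
    U, S, Vh = torch.linalg.svd(weight.float(), full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vh[:rank, :]

def scan_matrix(model, tokenizer, name, rank, text):
    # Compress one named matrix, measure the perplexity ratio, then restore it.
    param = dict(model.named_parameters())[name]
    original = param.data.clone()
    base = perplexity(model, tokenizer, text)
    param.data = truncate_rank(original, rank).to(original.dtype)
    compressed = perplexity(model, tokenizer, text)
    param.data = original
    return compressed / base

if __name__ == "__main__":
    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    mdl = GPT2LMHeadModel.from_pretrained("gpt2").eval()
    sample = "Error propagation in compressed transformers depends on which matrix is compressed."
    # Early-layer MLP up-projection vs. attention QKV (the value projection is a slice of c_attn).
    for name in ["transformer.h.0.mlp.c_fc.weight", "transformer.h.0.attn.c_attn.weight"]:
        ratio = scan_matrix(mdl, tok, name, rank=32, text=sample)
        print(f"{name}: perplexity ratio {ratio:.1f}x at rank 32")
```

Ranking matrices by this perplexity ratio yields the kind of fragility hierarchy described above; the paper additionally evaluates multiple architectures, compression ratios, and downstream tasks, which this sketch omits.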
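
One way to read the Lyapunov-style argument in the third point: write a residual layer as $h_{\ell+1} = h_\ell + f_\ell(h_\ell)$, model compression as injecting an error $e_\ell$ into the hidden state, and compare how fast the error and the residual stream grow. The notation ($h_\ell$, $f_\ell$, $L_\ell$, $e_\ell$) and the inequalities below are a plausible formalization of the summarized claim, not the paper's actual theorem statements.

```latex
% A residual layer amplifies the absolute error by at most (1 + L_\ell),
% where L_\ell is a Lipschitz constant of the non-residual branch f_\ell:
\|e_{\ell+1}\|
  = \bigl\| \bigl(h_\ell + e_\ell + f_\ell(h_\ell + e_\ell)\bigr)
            - \bigl(h_\ell + f_\ell(h_\ell)\bigr) \bigr\|
  \le (1 + L_\ell)\,\|e_\ell\|.

% The relative error nevertheless contracts whenever the residual stream
% grows faster than this worst-case amplification:
\frac{\|e_{\ell+1}\|}{\|h_{\ell+1}\|}
  \le \frac{(1 + L_\ell)\,\|e_\ell\|}{\|h_{\ell+1}\|}
  < \frac{\|e_\ell\|}{\|h_\ell\|}
\quad\text{whenever}\quad
\frac{\|h_{\ell+1}\|}{\|h_\ell\|} > 1 + L_\ell.
```

This also makes the fourth point legible: the contraction condition bounds how errors propagate, but it says nothing about how much a given architecture's redundancy absorbs the error that remains, which is why the hybrid model can degrade less than its measured amplification would suggest.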