Gating Enables Curvature: A Geometric Expressivity Gap in Attention
arXiv cs.LG / 4/17/2026
Key Points
- The paper analyzes gated attention using the geometry of representations, modeling attention outputs as mean parameters of Gaussian distributions and studying the resulting Fisher–Rao geometry.
- It proves that ungated (affine) attention is limited to intrinsically flat statistical manifolds, while multiplicative gating can realize non-flat geometries, including positively curved manifolds.
- The authors formalize a “geometric expressivity gap”: gated attention can realize a strictly larger class of representation geometries than ungated attention.
- Empirical results link this geometry to behavior: gated models show higher representation curvature and better performance on tasks needing nonlinear decision boundaries, with no consistent gains for linear-boundary tasks.
- The study also finds a structured regime where curvature increases under repeated composition, producing a systematic depth amplification effect.
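
The flat-versus-curved contrast in the key points can be illustrated with a toy sketch. It is not the paper's construction. It assumes a unit-covariance Gaussian family N(mu, I), for which the Fisher information in the mean coordinates is the identity, so the Fisher–Rao metric pulled back along a map x -> mu(x) is the ordinary induced metric of the surface mu(x) in R^3. The maps `ungated` and `gated`, the two-dimensional input, and the choice to gate only the third output coordinate are hypothetical and chosen so the result can be checked by hand.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical toy maps from a 2-D input (u, v) to a 3-D Gaussian mean.
def ungated(u, v):
    # Affine in (u, v): the image is a plane.
    return (u, v, v)

def gated(u, v):
    # Same values, but the third coordinate is multiplied by a sigmoid gate.
    return (u, v, sigmoid(u) * v)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _comb(terms):
    # Linear combination of 3-vectors: terms is a list of (coeff, vector).
    return tuple(sum(c * vec[i] for c, vec in terms) for i in range(3))

def gaussian_curvature(f, u, v, h=1e-3):
    """Gaussian curvature of the surface f(u, v) in R^3 via central differences.

    Uses K = (L*N - M^2) / (E*G - F^2), the ratio of the determinants of the
    second and first fundamental forms. Under the unit-covariance assumption
    this is also the curvature of the pulled-back Fisher-Rao metric.
    """
    f0 = f(u, v)
    fu = _comb([(1 / (2 * h), f(u + h, v)), (-1 / (2 * h), f(u - h, v))])
    fv = _comb([(1 / (2 * h), f(u, v + h)), (-1 / (2 * h), f(u, v - h))])
    fuu = _comb([(1 / h**2, f(u + h, v)), (-2 / h**2, f0), (1 / h**2, f(u - h, v))])
    fvv = _comb([(1 / h**2, f(u, v + h)), (-2 / h**2, f0), (1 / h**2, f(u, v - h))])
    c = 1 / (4 * h**2)
    fuv = _comb([(c, f(u + h, v + h)), (-c, f(u + h, v - h)),
                 (-c, f(u - h, v + h)), (c, f(u - h, v - h))])

    E, F, G = _dot(fu, fu), _dot(fu, fv), _dot(fv, fv)
    normal = _cross(fu, fv)
    norm = math.sqrt(_dot(normal, normal))
    n = tuple(x / norm for x in normal)
    L, M, N = _dot(fuu, n), _dot(fuv, n), _dot(fvv, n)
    return (L * N - M * M) / (E * G - F * F)

if __name__ == "__main__":
    print("ungated:", gaussian_curvature(ungated, 0.0, 0.0))
    print("gated:  ", gaussian_curvature(gated, 0.0, 0.0))
```

At the origin the affine map gives curvature 0, while the gated map gives roughly -0.04, which matches the closed-form value -s'(0)^2 / (1 + s(0)^2)^2 for a sigmoid s. This toy surface happens to be negatively curved, so it only demonstrates that a multiplicative gate breaks flatness; it does not reproduce the positively curved case or the depth amplification effect that the summary attributes to the paper.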