Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression
arXiv stat.ML / 3/25/2026
Key Points
- The paper analyzes how positional encoding in a single-layer Transformer affects generalization in in-context regression, treating the positional encoding as a fully trainable module (a minimal sketch of this setup follows the list).
- Through clean Rademacher complexity bounds, the authors show that trainable positional encoding systematically widens the gap between training and test performance.
- In the adversarial setting, they derive adversarial Rademacher complexity bounds and find that adversarial attacks magnify the performance gap between models with and without positional encoding (a simple attack sketch also appears below).
- Empirical simulations validate both the clean and the adversarial generalization bounds (a Monte-Carlo estimate of the empirical Rademacher complexity is sketched last).
- Overall, the work offers a unified framework for understanding both the generalization and the robustness of in-context learning with positional encodings.
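To make the setup concrete, here is a minimal PyTorch sketch of the kind of architecture the bullets describe: a single attention layer reading a prompt of (x, y) pairs plus a query token, with an optional fully trainable positional encoding. The names (`OneLayerICLTransformer`, `make_prompt`) and the token packing are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class OneLayerICLTransformer(nn.Module):
    """Single-layer attention model for in-context regression.

    Each token packs one (x, y) example; the final query token carries
    x_query with its y slot zeroed. `use_pos_enc` toggles a fully
    trainable positional encoding, mirroring the PE vs. no-PE comparison
    (the exact parameterization here is an assumption).
    """

    def __init__(self, dim_x: int, seq_len: int, use_pos_enc: bool = True):
        super().__init__()
        d = dim_x + 1                       # token = [x; y]
        self.use_pos_enc = use_pos_enc
        if use_pos_enc:
            # one trainable vector per position, learned jointly with attention
            self.pos = nn.Parameter(torch.zeros(seq_len, d))
        self.attn = nn.MultiheadAttention(d, num_heads=1, batch_first=True)
        self.readout = nn.Linear(d, 1)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, dim_x + 1)
        if self.use_pos_enc:
            tokens = tokens + self.pos      # broadcast over the batch
        out, _ = self.attn(tokens, tokens, tokens)
        return self.readout(out[:, -1])     # predict from the query token

def make_prompt(batch: int, n_ctx: int, dim_x: int):
    """Sample linear-regression prompts: y_i = <w, x_i>, one w per prompt."""
    w = torch.randn(batch, dim_x, 1)
    x = torch.randn(batch, n_ctx + 1, dim_x)
    y = (x @ w).squeeze(-1)                 # (batch, n_ctx + 1)
    tokens = torch.cat([x, y.unsqueeze(-1)], dim=-1)
    tokens[:, -1, -1] = 0.0                 # hide the query label
    return tokens, y[:, -1]                 # prompt and target y_query
```

Training both variants (`use_pos_enc=True` vs. `False`) on fresh prompts and evaluating on held-out prompts reproduces the clean train-test comparison the bullets refer to.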
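The adversarial comparison can be probed with a generic first-order attack on the prompt; the paper's exact threat model may differ, so treat the single ℓ∞ FGSM step below as an assumption. The sketch reuses `OneLayerICLTransformer` and `make_prompt` from above.

```python
import torch
import torch.nn as nn

def fgsm_prompt(model: nn.Module, tokens: torch.Tensor,
                target: torch.Tensor, eps: float) -> torch.Tensor:
    """One L-infinity FGSM step on the whole prompt (threat model is an assumption)."""
    adv = tokens.clone().requires_grad_(True)
    loss = nn.functional.mse_loss(model(adv).squeeze(-1), target)
    grad, = torch.autograd.grad(loss, adv)  # gradient w.r.t. the prompt only
    return (tokens + eps * grad.sign()).detach()

def clean_and_robust_mse(model, n_prompts=1024, n_ctx=10, dim_x=5, eps=0.1):
    """Evaluate one model on clean prompts and on FGSM-perturbed prompts."""
    tokens, target = make_prompt(n_prompts, n_ctx, dim_x)
    adv = fgsm_prompt(model, tokens, target, eps)
    with torch.no_grad():
        clean = nn.functional.mse_loss(model(tokens).squeeze(-1), target).item()
        robust = nn.functional.mse_loss(model(adv).squeeze(-1), target).item()
    return clean, robust
```

Comparing clean and robust losses for the PE and no-PE variants is the kind of simulation the bullets summarize: the claim is that the gap between the two models grows under attack.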
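Finally, the Rademacher quantities in the bounds can be estimated numerically: draw random signs σ ∈ {±1}^n and gradient-ascend a freshly initialized model to approximate sup over the class of the sign-weighted average output. This routine is an illustrative assumption, not the authors' procedure, and the parameter clamp is only a crude stand-in for the weight-norm constraints the theoretical bounds impose.

```python
import torch

def rademacher_estimate(make_model, tokens, n_draws=8, steps=300, lr=1e-2):
    """Monte-Carlo estimate of empirical Rademacher complexity.

    Approximates E_sigma[ sup_f (1/n) sum_i sigma_i f(z_i) ] by gradient
    ascent over the parameters of a fresh model for each sign draw.
    """
    n = tokens.shape[0]
    vals = []
    for _ in range(n_draws):
        sigma = torch.randint(0, 2, (n,)).float() * 2 - 1   # uniform +/-1 signs
        model = make_model()
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(steps):
            corr = (sigma * model(tokens).squeeze(-1)).mean()
            opt.zero_grad()
            (-corr).backward()                              # ascend on corr
            opt.step()
            with torch.no_grad():
                for p in model.parameters():
                    p.clamp_(-1.0, 1.0)                     # keep the class bounded
        with torch.no_grad():
            vals.append((sigma * model(tokens).squeeze(-1)).mean().item())
    return sum(vals) / n_draws
```

Running this with `make_model` closures for the PE and no-PE variants on the same prompts gives a numeric proxy for the complexity gap that the bounds formalize.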