AdaHOP: Fast and Accurate Low-Precision Training via Outlier-Pattern-Aware Rotation
arXiv cs.LG / 4/6/2026
Key Points
- The paper finds that a single fixed Hadamard transform is ineffective for low-precision LLM training, because outlier structures differ across weights, activations, and gradients and therefore require different "smoothing directions" (see the first sketch after this list).
- It classifies outlier patterns into three types (row-wise, column-wise, and none) and shows that each pattern pair in a matrix multiplication calls for a tailored Hadamard direction or outlier-handling strategy to reduce quantization error (second sketch below).
- AdaHOP adaptively chooses, per matrix multiplication, between the Inner Hadamard Transform (IHT) alone and IHT plus selective Outlier Extraction (OE), which routes dominant outliers through a higher-precision path (third sketch below).
- With hardware-aware Triton kernels, the method reportedly matches BF16 training quality at MXFP4 precision while delivering up to 3.6× memory compression and 1.8× kernel speedup versus BF16 full-precision training.
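To see why the rotation step matters, here is a minimal NumPy sketch (not the paper's code) of the core Hadamard-smoothing idea: multiplying by an orthonormal Hadamard matrix spreads a few large entries across all coordinates, shrinking the max-to-RMS ratio that drives quantization error. The Sylvester construction and the injected outlier column are illustrative assumptions.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an orthonormal Hadamard matrix; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)  # orthonormal, so the rotation is exactly invertible

rng = np.random.default_rng(0)
X = rng.normal(size=(128, 128))
X[:, 3] *= 50.0  # inject a column-wise outlier channel, a common activation pattern

H = hadamard(128)
X_rot = X @ H  # right-multiplication mixes the outlier column into all columns

for name, M in [("raw", X), ("rotated", X_rot)]:
    print(f"{name}: max/rms = {np.abs(M).max() / np.sqrt((M**2).mean()):.1f}")
```

Because H is orthonormal, the rotation can be undone after the low-precision matmul; only the dynamic range seen by the quantizer changes.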
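The second sketch shows one plausible way to label a tensor's outlier pattern as row-wise, column-wise, or none, in the spirit of the paper's taxonomy. The score and the 5.0 threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def outlier_pattern(M: np.ndarray, thresh: float = 5.0) -> str:
    """Crude pattern detector: compare the hottest row/column against the tensor's RMS."""
    rms = np.sqrt((M**2).mean()) + 1e-12
    row_score = np.abs(M).mean(axis=1).max() / rms  # is one row dominant?
    col_score = np.abs(M).mean(axis=0).max() / rms  # is one column dominant?
    if max(row_score, col_score) < thresh:
        return "none"
    return "row-wise" if row_score >= col_score else "column-wise"

rng = np.random.default_rng(1)
A = rng.normal(size=(64, 64)); A[7, :] *= 40.0  # row-wise outliers
B = rng.normal(size=(64, 64)); B[:, 5] *= 40.0  # column-wise outliers
C = rng.normal(size=(64, 64))                   # no structured outliers
print(outlier_pattern(A), outlier_pattern(B), outlier_pattern(C))
```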
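The third sketch isolates the Outlier Extraction (OE) half of AdaHOP's adaptive choice: the largest-magnitude entries take a sparse full-precision path while the remainder is quantized. The top-0.5% extraction ratio and the crude symmetric int4 quantizer are stand-ins for the paper's MXFP4 Triton kernels.

```python
import numpy as np

def quantize_int4(M: np.ndarray) -> np.ndarray:
    """Symmetric per-tensor 4-bit quantizer (levels -7..7), returned dequantized."""
    scale = np.abs(M).max() / 7.0 + 1e-12
    return np.clip(np.round(M / scale), -7, 7) * scale

def split_outliers(M: np.ndarray, ratio: float = 0.005):
    """Route the top `ratio` fraction of entries (by magnitude) to a full-precision path."""
    k = max(1, int(M.size * ratio))
    cut = np.partition(np.abs(M).ravel(), -k)[-k]
    mask = np.abs(M) >= cut
    return np.where(mask, 0.0, M), np.where(mask, M, 0.0)  # (bulk, outliers)

rng = np.random.default_rng(2)
X = rng.normal(size=(256, 256)); X[:, 9] *= 30.0  # outlier channel

err_plain = np.abs(quantize_int4(X) - X).mean()
bulk, outliers = split_outliers(X)
err_oe = np.abs((quantize_int4(bulk) + outliers) - X).mean()
print(f"mean abs error: plain {err_plain:.4f} vs with OE {err_oe:.4f}")
```

Removing a handful of dominant entries shrinks the quantization scale for everything else, which is why pairing OE with IHT pays off on the hardest pattern pairs.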