PolarQuant: Optimal Gaussian Weight Quantization via Hadamard Rotation for LLM Compression
arXiv cs.CL / 4/1/2026
Key Points
- PolarQuant is a post-training LLM weight quantization method designed for near-lossless compression by reshaping weight distributions before quantization.
- The approach normalizes weights block-wise to a unit hypersphere, applies a Walsh-Hadamard rotation to make coordinates approximately Gaussian, then quantizes using centroids matched to that Gaussian distribution.
- Ablation results show the Hadamard rotation alone accounts for roughly 98% of the quality gain, improving Qwen3.5-9B perplexity from 6.90 (absmax Q5) to 6.40, only +0.03 above FP16, without any calibration data.
- PolarQuant also serves as a preprocessing step that improves downstream INT4 quantization (torchao INT4), achieving lower perplexity (6.56 vs. 6.68) at comparable throughput (43.1 tok/s, ~6.5 GB VRAM).
- The authors provide public code and models, indicating the method is intended to be directly testable and reusable in compression/quantization pipelines.
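The pipeline described in the key points (block-wise normalization to the unit hypersphere, Walsh-Hadamard rotation, then quantization against Gaussian-matched centroids) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper names (`hadamard`, `gaussian_centroids`, `quantize_block`) are hypothetical, and the codebook here uses equal-probability quantile-bin means of N(0,1) as a stand-in for whatever Gaussian-optimal centroids the paper uses.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Orthonormal Walsh-Hadamard matrix via Sylvester construction (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def gaussian_centroids(bits: int, n_samples: int = 1_000_000, seed: int = 0) -> np.ndarray:
    """Approximate Gaussian-matched codebook: conditional means of 2**bits
    equal-probability bins of N(0, 1). A stand-in for the paper's centroids."""
    rng = np.random.default_rng(seed)
    samples = np.sort(rng.standard_normal(n_samples))
    k = 2 ** bits
    return samples[: (n_samples // k) * k].reshape(k, -1).mean(axis=1)

def quantize_block(w: np.ndarray, bits: int) -> np.ndarray:
    """Normalize a weight block to the unit hypersphere, rotate so coordinates
    are ~Gaussian, quantize each coordinate to the nearest centroid, invert."""
    n = w.size
    scale = np.linalg.norm(w)            # block norm: projects w onto the unit sphere
    u = w / scale
    H = hadamard(n)
    r = H @ u                            # rotated coords ~ N(0, 1/n) for generic w
    c = gaussian_centroids(bits) / np.sqrt(n)  # rescale N(0,1) codebook to coord variance
    idx = np.abs(r[:, None] - c[None, :]).argmin(axis=1)  # nearest-centroid indices
    return scale * (H.T @ c[idx])        # dequantize: inverse rotation, restore norm
```

At 5 bits per rotated coordinate, the reconstruction error of a random Gaussian block is small (a few percent in relative L2), which is consistent with the near-lossless behavior the summary reports; actual bit allocation, block size, and centroid construction in PolarQuant may differ.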




