BATQuant: Outlier-resilient MXFP4 Quantization via Learnable Block-wise Optimization
arXiv cs.CL / 3/18/2026
💬 Opinion · Models & Research
Key Points
- BATQuant introduces a Block-wise Affine Transformation that confines rotations to MXFP4 block granularity, preventing cross-block outlier propagation and preserving local quantization behavior (see the quantization sketch after this list).
- It relaxes orthogonality constraints and uses a Global and Private Kronecker (GPK) decomposition to reduce parameter storage and runtime overhead (see the Kronecker sketch after this list).
- Block-wise Learnable Clipping is incorporated to suppress residual outliers and shape activation distributions more effectively.
- Extensive experiments on both multimodal and text-only LLMs show state-of-the-art results under aggressive W4A4KV16 quantization, recovering up to 96.43% of full-precision performance on multimodal benchmarks.
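To make the block-wise idea concrete, below is a minimal PyTorch sketch of MXFP4-style fake quantization with a learnable per-block affine transform and learnable clipping. It is an illustration under assumptions, not the paper's implementation: the names `BlockAffineMXFP4`, `block_size`, and `clip_logit` are hypothetical, the transform is simply initialized to identity, and the shared E8M0 block scale is approximated by power-of-two rounding.

```python
import torch
import torch.nn as nn

# Signed FP4 (E2M1) magnitude grid used by MXFP4 elements.
FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: torch.Tensor) -> torch.Tensor:
    """Round each element to the nearest representable FP4 (E2M1) value."""
    sign = torch.sign(x)
    mag = x.abs().unsqueeze(-1)                    # (..., 1)
    idx = (mag - FP4_GRID).abs().argmin(dim=-1)    # index of nearest grid point
    return sign * FP4_GRID[idx]

class BlockAffineMXFP4(nn.Module):
    """Hypothetical sketch: a learnable affine transform and clipping applied
    inside each MXFP4 block, followed by fake quantization with a shared
    power-of-two block scale (approximating the E8M0 scale)."""

    def __init__(self, block_size: int = 32):
        super().__init__()
        self.block_size = block_size
        # Transform confined to one block (identity init), so outliers
        # cannot propagate across block boundaries.
        self.transform = nn.Parameter(torch.eye(block_size))
        # Learnable clip ratio in (0, 1); sigmoid(4.0) ~= 0.98 at init.
        self.clip_logit = nn.Parameter(torch.tensor(4.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumes x.numel() is a multiple of block_size.
        blocks = x.reshape(-1, self.block_size)
        blocks = blocks @ self.transform                     # block-wise affine transform
        clip = torch.sigmoid(self.clip_logit)
        amax = blocks.abs().amax(dim=-1, keepdim=True) * clip
        # Shared power-of-two scale per block, mapping the clipped max
        # to 6.0 (the largest FP4 magnitude).
        scale = torch.exp2(torch.ceil(torch.log2(amax / 6.0 + 1e-12)))
        q = quantize_fp4(torch.clamp(blocks / scale, -6.0, 6.0)) * scale
        return q.reshape(x.shape)

# Usage: fake-quantize a (seq, hidden) activation tensor.
quant = BlockAffineMXFP4(block_size=32)
act = torch.randn(8, 4096)
print(quant(act).shape)   # torch.Size([8, 4096])
```

In a real pipeline the transform and clipping parameters would be trained with a straight-through estimator, since the rounding step in `quantize_fp4` has zero gradient almost everywhere.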
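The storage argument behind the Kronecker decomposition can likewise be sketched: representing a d x d transform as kron(G, P), with a shared "global" factor G and a per-layer "private" factor P, stores g^2 + p^2 values instead of d^2. The sizes and names below (d = 4096, g = p = 64) are assumptions for illustration; the paper's exact GPK construction may differ.

```python
import torch

d, g = 4096, 64                # hidden size and global factor size (assumed)
p = d // g                     # private factor size, so g * p = d

G = torch.randn(g, g)          # "global" factor, shared across layers
P = torch.randn(p, p)          # "private" factor, one per layer

# Full d x d transform, materialized here only to check equivalence.
T = torch.kron(G, P)

# Storage: g*g + p*p = 8,192 parameters vs. d*d = 16,777,216 (~2048x fewer).
print(G.numel() + P.numel(), T.numel())

# kron(G, P) can be applied without materializing T:
# kron(G, P) @ x == (G @ X @ P.T).reshape(-1), where X = x.reshape(g, p).
x = torch.randn(d)
y_full = T @ x
y_kron = (G @ x.reshape(g, p) @ P.T).reshape(-1)
print(torch.allclose(y_full, y_kron, atol=1e-2))
```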
Related Articles
Data Augmentation Using GANs
Dev.to
Speculative Policy Orchestration: A Latency-Resilient Framework for Cloud-Robotic Manipulation
arXiv cs.RO
Automatic Debiased Machine Learning for Smooth Functionals of Nonparametric M-Estimands
arXiv stat.ML
Preference-Guided Debiasing for No-Reference Enhancement Image Quality Assessment
arXiv cs.CV
Model Selection and Parameter Estimation of Multi-dimensional Gaussian Mixture Model
arXiv stat.ML