CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs
arXiv cs.LG / 4/30/2026
Key Points
- The paper introduces CoQuant, a PTQ method for mixed-precision LLMs that jointly considers both weight and activation quantization noise rather than relying only on activation statistics.
- CoQuant uses a theoretical formulation of expected output error to derive a closed-form, weighted PCA solution for selecting an optimal high-precision subspace.
- Experiments on Llama-3.2 and Qwen2.5 demonstrate consistent improvements over strong PTQ baselines, measured via WikiText perplexity and zero-shot common-sense reasoning accuracy.
- The authors release their implementation code, supporting adoption and further validation of the joint subspace-modeling approach in low-bit LLM quantization.
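The weighted-PCA subspace selection described above can be illustrated with a toy NumPy sketch. Everything here is an assumption for illustration, not the paper's actual formulation: the uniform round-to-nearest quantizer, the particular weighting (activation second moment rescaled by per-input-channel weight-quantization noise energy), and all function names are hypothetical stand-ins for CoQuant's closed-form solution.

```python
import numpy as np

def quantize(t, bits=4):
    # Uniform symmetric round-to-nearest quantizer (illustrative
    # stand-in for the paper's low-bit scheme).
    scale = np.abs(t).max() / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(t / scale) * scale

def pick_subspace(W, X, k):
    # Hypothetical weighted PCA: weight the activation second moment
    # E[x x^T] by each input channel's weight-quantization noise
    # energy, then keep the top-k eigenvectors as the basis of the
    # high-precision subspace.
    C = X.T @ X / X.shape[0]               # (d, d) activation stats
    E = W - quantize(W)                    # (out, d) weight noise
    w = np.sqrt(np.sum(E * E, axis=0))     # per-channel noise energy
    M = (w[:, None] * C) * w[None, :]      # symmetric weighted matrix
    eigvals, eigvecs = np.linalg.eigh(M)   # ascending eigenvalues
    return eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # (d, k) basis

def mixed_precision_weights(W, P, bits=4):
    # Keep the component of W lying in span(P) at high precision and
    # quantize only the residual.
    W_hi = W @ P @ P.T
    return W_hi + quantize(W - W_hi, bits)

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))   # toy layer weights
X = rng.standard_normal((256, 128))  # toy calibration activations
P = pick_subspace(W, X, k=16)
W_q = mixed_precision_weights(W, P)
# Compare output error against quantizing W directly.
err_joint = np.linalg.norm(X @ (W - W_q).T)
err_plain = np.linalg.norm(X @ (W - quantize(W)).T)
print(f"joint={err_joint:.2f}  plain={err_plain:.2f}")
```

The key design point the sketch mirrors is that the subspace is chosen from a matrix combining both activation statistics and weight-quantization noise, rather than from activation statistics alone.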