> Happy to announce TQ3_4S. https://huggingface.co/YTan2000/Qwen3.5-27B-TQ3_4S Please note: on median PPL, Q3_K_S has slight edge.
Turbo Quant on weights: 2x speed
Reddit r/LocalLLaMA / 4/2/2026
Key Points
- A new quantized model variant, TQ3_4S, is announced as part of “Turbo Quant,” claiming about 2x faster inference while maintaining the same model size compared with TQ3_1S.
- The author reports that TQ3_4S delivers better quality than TQ3_1S, positioning it as an improvement for local LLM quantized deployments.
- The article links to the Hugging Face model page for “Qwen3.5-27B-TQ3_4S,” making the artifact readily available for testing.
- Despite the claimed improvements, the author notes that on median PPL, the reference model Q3_K_S still has a slight edge, and further tuning is planned for future releases.
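The quality comparison above hinges on median perplexity (PPL), which is typically computed per evaluation chunk from per-token log-probabilities and then aggregated. As a minimal sketch of that metric (the numbers below are illustrative, not measured results for these models):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

def median(values):
    """Median of a list of per-chunk perplexities."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

# Hypothetical per-chunk PPLs for two quant variants of the same model:
ppl_q3_k_s = [7.88, 7.91, 8.05]
ppl_tq3_4s = [7.90, 7.95, 8.10]
print(median(ppl_q3_k_s), median(ppl_tq3_4s))  # → 7.91 7.95
```

Using the median rather than the mean makes the comparison robust to a few outlier chunks, which is why a variant can win on speed and average quality yet still trail slightly on median PPL, as reported here.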