Been using this for a few days. It is BY FAR the best uncensored model I have found for Qwen 3.6 35B. With IQ4_XS, Q8 KV cache, and 262K context, it fits in 24GB of VRAM and does not fail on multi-turn tool calls. I honestly feel like it is smarter than the original model (call me crazy). The model also has a very low KLD, so it should in theory be similar to the original model on harmless prompts. llmfan's 3.5 35B model does actually benchmark higher than the original in the UGI NatInt section, so I have a solid hunch this 3.6 35B will also benchmark higher than the original 3.6 model. Y'all should give it a try.
Qwen3.6 35B A3B Heretic (KLD 0.0015!) Incredible model. Best 35B I have found!
Reddit r/LocalLLaMA / 4/26/2026
💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research
Key Points
- A Reddit user reports that the Qwen 3.6 35B “A3B Heretic” uncensored model is the best uncensored Qwen 3.6 35B they have found, based on several days of testing.
- The tester claims it runs well on limited hardware, fitting in 24GB VRAM with IQ4XS, Q8 KV cache, and a 262K context window, while handling multi-turn tool calls without failing.
- They note the model has a very low KLD (0.0015), suggesting it may behave similarly to the original model on harmless prompts.
- The user also expects improved benchmark performance, citing that an llmfan 3.5 35B version benchmarks higher than the original in the UGI NatInt section, and hypothesizing similar gains for this 3.6 35B variant.
- They encourage others to try the model, linking to the Hugging Face release by llmfan46.
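For readers unfamiliar with the KLD figure quoted above, here is a minimal sketch of one common way such a number can be computed: the KL divergence between the original model's and the modified model's next-token distributions, averaged over token positions. The post does not say how the 0.0015 value was measured, so the direction (original against variant), the averaging, and every name below (`softmax`, `kl_divergence`, `mean_kld`, the toy logits) are assumptions for illustration only.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mean_kld(base_logits, variant_logits):
    """Average per-position KL(base || variant) over a sequence of logit vectors."""
    per_position = [
        kl_divergence(softmax(b), softmax(v))
        for b, v in zip(base_logits, variant_logits)
    ]
    return sum(per_position) / len(per_position)

# Toy data (hypothetical): two token positions, a three-token vocabulary.
base = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
variant = [[2.0, 1.05, 0.1], [0.5, 2.45, 0.3]]

print(mean_kld(base, base))     # identical distributions give zero divergence
print(mean_kld(base, variant))  # a small logit shift gives a small positive value
```

A value near zero means the two models assign almost the same probabilities to the next token on the prompts tested, which is why a low KLD is read as "behaves like the original". It says nothing about prompts outside the evaluation set.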