Hello everyone. So, some people asked me to do the merge for Qwen 3.5-35 A3B model. Because it has only 3 active billion parameters and can run on old GPU (RTX 3060 12GB)
Introducing: https://huggingface.co/LuffyTheFox/Qwen3.5-35B-A3B-Uncensored-Claude-Opus-4.6-Affine
This model has been made via merging:
- The most popular model by HauhauCS on HuggingFace: https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive
- And Qwen 3.5 35B A3B Claude Opus 4.6 distilled model by Jackrong: https://huggingface.co/Jackrong/Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled
- After merging I ran a special script that, added the "thinking skills" from Jackrong model to HauhauCS model. Cleaned up any weirdness using a math method called KL divergence. Did all of this in Google Colab Free Tier without unpacking the model - it stayed in the compressed IQ4_XS format.
Also I fixed:
- The very first layer (blk.0) - this handles raw input, so it often gets messy
- A few late layers (blk.35, blk.39) - these handle final output and often show problems after compression
- Attention and expert parts - these are the most sensitive parts of the model
Results:
17-18 tokens per second on my RTX 3060 12 GB without offloading. With skills in programming, writing, and human like short, natural and simple communication, without censorship.
For best model perfomance please use following settings in LM Studio 0.4.7 (build 4):
- Use this System Prompt: https://pastebin.com/pU25DVnB
- If you want to disable thinking use this chat template in LM Studio: https://pastebin.com/uk9ZkxCR
- Temperature: 0.7
- Top K Sampling: 20
- Repeat Penalty: (disabled) or 1.0
- Presence Penalty: 1.5
- Top P Sampling: 0.8
- Min P Sampling: 0.0
- Seed: 3407
Here model programming skills in action: https://pastebin.com/44VtLGxf
Via prompt:
"Write an Arkanoid game using HTML5 and Javascript. The game should be controlled with a mouse and include generated sounds and effects. The game should be in the style of the film Tron: Legacy."
I hope you like it ^_^. Please upvote if you like the model, so more people will see it.
Frankly saying this is best local AI I ever used in my practice. And I am very impressed with the results.
[link] [comments]




