This is a merge requested by some people on Reddit and HuggingFace who don't have powerful GPUs and want a big context window in a smart, uncensored local AI.
Model available here: https://huggingface.co/LuffyTheFox/Qwen3.5-9B-Claude-4.6-Opus-Uncensored-v2-GGUF
For best model performance, please use the following settings in LM Studio 0.4.7 (build 4):
- Use this System Prompt: https://pastebin.com/pU25DVnB
- Temperature: 0.7
- Top K Sampling: 20
- Repeat Penalty: (disabled) or 1.0
- Presence Penalty: 1.5
- Top P Sampling: 0.8
- Min P Sampling: 0.0
- Seed: 3407
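If you prefer llama.cpp's llama-server over LM Studio, the same sampling settings map onto its command-line flags. A minimal sketch (the GGUF filename and port are hypothetical; point the path at whichever quant you actually downloaded):

```shell
# Launch llama-server with sampling settings matching the LM Studio ones above.
# The model path below is a placeholder, not the exact file name on the Hub.
llama-server \
  -m ./Qwen3.5-9B-Claude-4.6-Opus-Uncensored-v2-Q4_K_M.gguf \
  --temp 0.7 \
  --top-k 20 \
  --top-p 0.8 \
  --min-p 0.0 \
  --repeat-penalty 1.0 \
  --presence-penalty 1.5 \
  --seed 3407 \
  --port 8080
```

This serves an OpenAI-compatible API at http://localhost:8080, so any chat client that can talk to LM Studio's local server should work here too.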
I finally found a way to merge this amazing model made by Jackrong: https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF
With this uncensored model made by HauhauCS: https://huggingface.co/HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive
while preserving the fine-tuned weights and accuracy of the Qwen 3.5 9B architecture, by keeping the tensors in Float32 precision during the merging process.
Now we have the smallest, fastest, and smartest uncensored model trained on this dataset: https://huggingface.co/datasets/Roman1111111/claude-opus-4.6-10000x
On my RTX 3060, I got 42 tokens per second in LM Studio. With llama-server it can run even faster.
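If you want to reproduce a tokens-per-second number outside LM Studio, llama.cpp also ships llama-bench. A sketch (the GGUF filename is again a placeholder):

```shell
# Benchmark prompt processing (-p) and generation (-n) throughput,
# offloading all layers to the GPU (-ngl 99).
llama-bench \
  -m ./Qwen3.5-9B-Claude-4.6-Opus-Uncensored-v2-Q4_K_M.gguf \
  -p 512 \
  -n 128 \
  -ngl 99
```

It prints a small table with tokens/s for each phase, which makes it easy to compare quants or GPUs.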
Enjoy, and share your results ^_^. Don't forget to upvote / repost so more people will test it.
