Qwen3.5-9B-Claude-4.6-Opus-Uncensored-v2-Q4_K_M-GGUF

Reddit r/LocalLLaMA / 3/22/2026

💬 Opinion / Tools & Practical Usage

Key Points

  • The post describes a community-driven effort to merge multiple Qwen3.5-9B variants into an uncensored, local AI model and provides links to the resulting GGUF releases.
  • It lists a recommended set of LM Studio 0.4.7 (build 4) settings for best performance, including a System Prompt, 0.7 temperature, Top K 20, Repeat Penalty: disabled or 1.0, Presence Penalty 1.5, Top P 0.8, Min P 0.0, and Seed 3407.
  • It mentions contributions from Qwen3.5-9B variants by Jackrong and HauhauCS and a dataset release, illustrating a collaborative, cross-source approach.
  • It reports a throughput of about 42 tokens per second on an RTX 3060 and notes that llama-server may be even faster, inviting others to test and share results.

This merge was requested by several people on Reddit and Hugging Face who don't have powerful GPUs and want a big context window in an uncensored, smart local AI.

Model available here: https://huggingface.co/LuffyTheFox/Qwen3.5-9B-Claude-4.6-Opus-Uncensored-v2-GGUF

For best model performance, please use the following settings in LM Studio 0.4.7 (build 4):

  1. Use this System Prompt: https://pastebin.com/pU25DVnB
  2. Temperature: 0.7
  3. Top K Sampling: 20
  4. Repeat Penalty: (disabled) or 1.0
  5. Presence Penalty: 1.5
  6. Top P Sampling: 0.8
  7. Min P Sampling: 0.0
  8. Seed: 3407
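For those who prefer llama-server, the same sampler settings map onto llama.cpp's command-line flags. This is a sketch, not from the post: the GGUF filename below is an assumption (substitute whichever quant you downloaded), and the system prompt from the pastebin link would go in the `system` message of your chat request rather than on the command line.

```shell
# Launch llama-server with sampler settings equivalent to the LM Studio list above.
# The model filename is an assumption -- use the actual file you downloaded.
./llama-server \
  --model Qwen3.5-9B-Claude-4.6-Opus-Uncensored-v2-Q4_K_M.gguf \
  --temp 0.7 \
  --top-k 20 \
  --top-p 0.8 \
  --min-p 0.0 \
  --repeat-penalty 1.0 \
  --presence-penalty 1.5 \
  --seed 3407
```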

Finally found a way to merge this amazing model made by Jackrong: https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF

With this uncensored model made by HauhauCS: https://huggingface.co/HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive

The merge was done at Float32 precision for the tensor weights, preserving the training and accuracy of the Qwen 3.5 9B architecture throughout the merging process.
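As a rough illustration of the idea, here is a minimal sketch of a linear weight merge at float32 precision. The post does not state the exact merge method, so simple elementwise averaging is assumed here; the tensors and the `merge_tensors` helper are stand-ins, not the actual tooling used.

```python
import numpy as np

def merge_tensors(a: np.ndarray, b: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend two same-shaped weight tensors at float32 precision.

    alpha=0.5 gives a plain average; the real merge method is unknown,
    this is illustration only.
    """
    assert a.shape == b.shape, "merging requires identical architectures"
    # Upcast to float32 before combining so no precision is lost in the blend.
    return alpha * a.astype(np.float32) + (1.0 - alpha) * b.astype(np.float32)

# Toy example with stand-in 2x2 "weight" tensors (not real model weights):
w_reasoning = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float16)
w_uncensored = np.array([[3.0, 2.0], [1.0, 0.0]], dtype=np.float16)
merged = merge_tensors(w_reasoning, w_uncensored)
print(merged)  # elementwise average of the two tensors, in float32
```

In practice a merge like this would iterate over every tensor pair in the two checkpoints, which is only possible because both models share the same Qwen 3.5 9B architecture.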

Now we have the smallest, fastest, and smartest uncensored model trained on this dataset: https://huggingface.co/datasets/Roman1111111/claude-opus-4.6-10000x

On my RTX 3060 I got 42 tokens per second in LM Studio. With llama-server it can run even faster.

Enjoy, and share your results ^_^. Don't forget to upvote / repost so more people will test it.

submitted by /u/EvilEnginer