Pushed an abliterated Qwen3.6-35B-A3B to HF. Worth noting because MoE abliteration is genuinely different from dense — the refusal signal lives in the expert path, not attention, so standard Q/K/V LoRA doesn’t cut it.
Approach (Abliterix framework):
- LoRA rank-1 on O-proj + MLP down-proj (Q/K/V disabled on purpose)
- Expert-Granular Abliteration: project refusal direction across all 256 expert
down_projslices per layer - MoE router suppression: identified top-10 “safety experts”, router bias -2.10
- Orthogonalized steering vectors + Gaussian decay across layers
- Strength search in [0.5, 6.0] to avoid degenerate output
Eval: 7/100 refusals, KL 0.0189 from base. Baseline is 100/100. Judge is Gemini 3 Flash — degenerate/garbled output counts as refusal, no keyword matching, 150-token generations.
One thing worth saying since this comes up a lot: a bunch of abliterated model cards claim 0–3/100 refusals, and most are using 30–50 token generations + keyword detection. That undercounts delayed/soft refusals and lets garbled output pass as “compliant.” 7/100 is what a stricter LLM-judge eval actually gives you. Take the flashy numbers with salt.
huggingface/wangzhang/Qwen3.6-35B-A3B-abliterated
Research only. Safety guardrails removed — use responsibly.
[link] [comments]



![[2026] OpenTelemetry for LLM Observability — Self-Hosted Setup](/_next/image?url=https%3A%2F%2Fmedia2.dev.to%2Fdynamic%2Fimage%2Fwidth%3D1200%2Cheight%3D627%2Cfit%3Dcover%2Cgravity%3Dauto%2Cformat%3Dauto%2Fhttps%253A%252F%252Fdev-to-uploads.s3.amazonaws.com%252Fuploads%252Farticles%252Flu4b6ttuhur71z5gemm0.png&w=3840&q=75)