Finally Abliterated Sarvam 30B and 105B!

Reddit r/artificial / 4/9/2026

💬 OpinionSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The author reports “abliterating” Sarvam-30B and Sarvam-105B, describing them as India’s first multilingual MoE reasoning models, and claims results that reveal their refusal behavior patterns.
They find that these reasoning models use two separate refusal mechanisms—one in the <think> block and another in the final response—which can conflict with each other.
The post argues that an English-computed direction can remove or suppress refusal behavior in multiple other supported languages (e.g., Malayalam, Hindi, Kannada), suggesting refusals may be pre-linguistic.
The write-up links to a full Medium article as well as Hugging Face releases for both the 30B and 105B “uncensored/abliterated” variants.

I abliterated Sarvam-30B and 105B - India's first multilingual MoE reasoning models - and found something interesting along the way!

Reasoning models have 2 refusal circuits, not one. The <think> block and the final answer can disagree: the model reasons toward compliance in its CoT and then refuses anyway in the response.

Killer finding: one English-computed direction removed refusal in most of the other supported languages (Malayalam, Hindi, Kannada among few). Refusal is pre-linguistic.

Full writeup: https://medium.com/@aloshdenny/uncensoring-sarvamai-abliterating-refusal-mechanisms-in-indias-first-moe-reasoning-model-b6d334f85f42

30B model: https://huggingface.co/aoxo/sarvam-30b-uncensored

105B model: https://huggingface.co/aoxo/sarvam-105b-uncensored

submitted by /u/Available-Deer1723
[link] [comments]