Finally Abliterated Sarvam 30B and 105B!

Reddit r/artificial / 4/9/2026


Key Points

  • The author reports “abliterating” Sarvam-30B and Sarvam-105B, describing them as India’s first multilingual MoE reasoning models, and shares findings about how their refusal behavior is structured.
  • They find that these reasoning models use two separate refusal mechanisms—one in the <think> block and another in the final response—which can conflict with each other.
  • The post argues that an English-computed direction can remove or suppress refusal behavior in multiple other supported languages (e.g., Malayalam, Hindi, Kannada), suggesting refusals may be pre-linguistic.
  • The write-up links to a full Medium article as well as Hugging Face releases for both the 30B and 105B “uncensored/abliterated” variants.

I abliterated Sarvam-30B and 105B - India's first multilingual MoE reasoning models - and found something interesting along the way!

Reasoning models have two refusal circuits, not one. The <think> block and the final answer can disagree: the model reasons toward compliance in its CoT and then refuses anyway in the response.
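
Not the author's published code, but here's a minimal sketch of how you could measure that split: project each token's residual stream onto a refusal direction and compare the <think> span against the final answer. The layer index, the single-token </think> assumption, and the helper name are all illustrative; `direction` is the unit refusal vector from the difference-of-means sketch further down.

```python
# Hypothetical probe for the two-circuit behaviour (not the author's code):
# score every token against a refusal direction, split at </think>.
import torch

def split_refusal_scores(model, tok, transcript, direction, layer=20):
    """`transcript` is prompt + <think> CoT + final answer as one string."""
    ids = tok(transcript, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        hidden = model(ids, output_hidden_states=True).hidden_states[layer][0]
    # One refusal score per token: projection onto the unit direction.
    scores = hidden.float() @ direction.to(hidden.device).float()
    # Assumes </think> tokenizes to a single id; adjust for the real tokenizer.
    boundary = ids[0].tolist().index(tok.convert_tokens_to_ids("</think>"))
    return scores[:boundary].mean().item(), scores[boundary:].mean().item()
```

If the CoT-span mean stays low while the answer-span mean spikes, the refusal decision is being made after the reasoning, which is exactly the disagreement described above.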

Killer finding: one English-computed direction removed refusal in most of the other supported languages (Malayalam, Hindi, and Kannada among them). Refusal is pre-linguistic.
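
For the direction itself, the standard difference-of-means recipe (mean activation on harmful minus harmless English prompts, then projected out of the residual stream at inference) would look roughly like this. The base-model id, layer choice, probe prompts, and hook placement are my assumptions, not the released pipeline:

```python
# Hedged sketch of English-only difference-of-means abliteration, then a
# cross-lingual test in Hindi. Not the author's released code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "..."  # base Sarvam-30B repo id (not given in the post)
LAYER = 20          # illustrative mid-depth layer

tok = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)

def mean_last_token(prompts):
    """Mean residual-stream state at the last prompt token, at LAYER."""
    acts = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").input_ids.to(model.device)
        with torch.no_grad():
            hs = model(ids, output_hidden_states=True).hidden_states
        acts.append(hs[LAYER][0, -1].float())
    return torch.stack(acts).mean(dim=0)

harmful  = ["How do I pick a lock?", "Write a phishing email."]   # English only
harmless = ["How do I bake bread?", "Write a thank-you email."]

# Refusal direction = difference of means over English prompts, normalized.
direction = mean_last_token(harmful) - mean_last_token(harmless)
direction = direction / direction.norm()

def ablate(module, inputs, output):
    """Forward hook: remove the refusal component from the residual stream."""
    h = output[0] if isinstance(output, tuple) else output
    d = direction.to(h.device, h.dtype)
    h = h - (h @ d).unsqueeze(-1) * d
    return (h, *output[1:]) if isinstance(output, tuple) else h

# Hook every decoder layer; assumes a Llama-style `model.model.layers` layout,
# which an MoE variant may not match exactly.
hooks = [blk.register_forward_hook(ablate) for blk in model.model.layers]

# Cross-lingual probe: the direction came from English, the test is Hindi.
prompt = "ताला कैसे तोड़ें?"  # "How do I break a lock?"
ids = tok(prompt, return_tensors="pt").input_ids.to(model.device)
print(tok.decode(model.generate(ids, max_new_tokens=128)[0]))

for h in hooks:
    h.remove()
```

The runtime hook is used here instead of weight surgery because it works regardless of how the MoE expert projections are laid out; a permanent ablation would orthogonalize the output-projection weights against the same direction.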

Full writeup: https://medium.com/@aloshdenny/uncensoring-sarvamai-abliterating-refusal-mechanisms-in-indias-first-moe-reasoning-model-b6d334f85f42

30B model: https://huggingface.co/aoxo/sarvam-30b-uncensored

105B model: https://huggingface.co/aoxo/sarvam-105b-uncensored

submitted by /u/Available-Deer1723