Nemotron-3-Super-120b Uncensored

Reddit r/LocalLLaMA / 3/14/2026



Key Points

The post claims Nemotron-3-Super-120b uses a mix of LatentMoE and Mamba attention, and says the author's previous release was flawed: it did not refuse and conversed fine, but its code output was “garbage”.
It states that native MLX does not yet support LatentMoE, so users need a custom .py or MLX Studio, and that the modification has to be done at the model's quantization level rather than applied at FP16 and quantized down.
It reports HarmBench 97% and HumanEval 94% scores, and says a custom Python script and chat template are included, with MLX Studio gaining native support later.
It links to a HuggingFace repository and apologizes to users who downloaded the earlier version.

My last post was a lie - Nemotron-3-Super-120b was unlike anything so far. My haste led me to believe that my last attempt was actually ablated - and while it didn’t refuse and seemed to converse fine, its code was garbage. This was because I hadn’t taken into consideration its mix of LatentMoE and Mamba attention. I have spent the past 24 hours remaking this model, taking many things into account.

Native MLX doesn’t support LatentMoE at the moment - you will have to make your own .py or use MLX Studio.

I had to cheat with this model. I always say I don’t do any custom chat templates or fine tuning or cheap crap like that, only real refusal vector removal, but this time, for the first time, I had no other choice. One side effect of what I did is that the model often fails to produce closing think tags properly.
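For anyone scripting against the model, a small post-processing step can paper over the missing closing think tags. This is only a minimal sketch and not part of the files in the repo: the tag names `<think>` and `</think>` are an assumption (the post does not say which tags the chat template uses), and `close_think_tags` and `strip_reasoning` are hypothetical helper names.

```python
import re

# Assumed tag names; check the chat template shipped with the model.
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def close_think_tags(text: str) -> str:
    """Append one closing tag for every think block that was left open."""
    missing = text.count(OPEN_TAG) - text.count(CLOSE_TAG)
    if missing > 0:
        text += CLOSE_TAG * missing
    return text


def strip_reasoning(text: str) -> str:
    """Remove think blocks and return only the visible answer."""
    repaired = close_think_tags(text)
    pattern = re.escape(OPEN_TAG) + r".*?" + re.escape(CLOSE_TAG)
    return re.sub(pattern, "", repaired, flags=re.DOTALL).strip()


if __name__ == "__main__":
    print(close_think_tags("<think>checking the loop bounds"))
    print(strip_reasoning("<think>plan</think>Here is the code."))
```

One limitation to keep in mind: when the closing tag is missing, there is no way to tell where the reasoning ended, so everything after the opening tag is treated as reasoning and `strip_reasoning` returns an empty string for that output.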

Due to its unique attention, there is no “applying at fp16 and quantizing down”. All of this has to be done at its quantization level. The q6 and q8 are coming by tomorrow at the latest.
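For readers who have not seen refusal vector removal before, the usual textbook formulation is directional ablation: given a unit direction r in the output space of a weight matrix W, replace W with W - r (rᵀ W) so the layer can no longer write along r. The sketch below shows that generic formulation on plain float matrices. It is not the procedure used for this model, which the post says had to be done at the quantization level, and `normalize` and `ablate_direction` are hypothetical names chosen for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in vector]


def ablate_direction(weight, direction):
    """Return W - r (r^T W) for a row-major matrix W (d_out x d_in).

    `direction` has length d_out; after the edit, the output of the
    layer has no component along it.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # proj[j] = sum_i r[i] * W[i][j], i.e. the row vector r^T W
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [weight[i][j] - r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]
    print(ablate_direction(W, [1.0, 0.0]))
```

In the fp16 workflow the post refers to, an edit like this would be applied to full-precision weights and the result quantized afterwards; the post's point is that this order did not work here.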

I have gone out of my way to also do this:

HarmBench: 97%

HumanEval: 94%

Please feel free to try it out yourselves. I really apologize to the roughly 80 people who ended up wasting their time downloading the previous model.

I’VE INCLUDED THE CUSTOM .PY AND THE CHAT TEMPLATE IN THE FILES SO YOU GUYS CAN RUN IT WITH MLX. MLX Studio will have native support for this by later tonight.

https://huggingface.co/dealignai/Nemotron-3-Super-120B-A12B-4bit-MLX-CRACK-Uncensored

submitted by /u/HealthyCommunicat