AI Navigate

M5 Max 128GB with three 120B models

Reddit r/LocalLLaMA / 3/19/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • The post compares three 120B-scale language models (Nemotron-3 Super, GPT-OSS 120B, and Qwen3.5 122B) on quality and speed.
  • Nemotron-3 Super is slightly higher in quality than GPT-OSS 120B, but GPT-OSS 120B is about twice as fast.
  • GPT-OSS 120B achieves roughly 77 t/s, while Nemotron-3 Super and Qwen3.5 122B run around 35 t/s.
  • Overall quality ranking is Nemotron-3 Super > GPT-OSS 120B > Qwen3.5 122B, implying trade-offs between speed and fidelity for practical use.
  • Quantizations used: Nemotron-3 Super (Q4_K_M), GPT-OSS 120B (MXFP4), Qwen3.5 122B (Q4_K_M).
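
To put the throughput gap in concrete terms, a quick back-of-the-envelope calculation turns the poster's tokens-per-second figures into wall-clock time for a response. The response length is a hypothetical assumption; the t/s numbers are the poster's own measurements on the M5 Max 128GB, not independently verified.

```python
# Convert reported generation speeds (tokens/sec) into time for a fixed response.
# Speeds are the poster's figures; the 1000-token length is an assumption.
speeds = {
    "Nemotron-3 Super (Q4_K_M)": 35.0,
    "GPT-OSS 120B (MXFP4)": 77.0,
    "Qwen3.5 122B (Q4_K_M)": 35.0,
}

n_tokens = 1000  # hypothetical response length
for model, tps in speeds.items():
    # seconds = tokens / (tokens per second)
    print(f"{model}: {n_tokens / tps:.1f} s for {n_tokens} tokens")
```

At these rates, a 1000-token answer takes about 13 s on GPT-OSS 120B versus roughly 28.6 s on the other two, which is the practical shape of the speed-vs-quality trade-off the post describes.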

Overall:

  • Nemotron-3 Super > GPT-OSS 120B > Qwen3.5 122B
  • Quality-wise: Nemotron-3 Super is slightly better than GPT-OSS 120B, but GPT-OSS 120B is twice as fast.
  • Speed-wise: GPT-OSS 120B is roughly twice as fast as the other two, at ~77 t/s vs ~35 t/s.
submitted by /u/albertgao