2x Asus Ascent GX10 - MiniMax M2.7 AWQ - cloud providers are dead to me

Reddit r/LocalLLaMA / 4/15/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The Reddit poster reports on expanding to two Asus Ascent GX10 units in pursuit of Opus-class agentic coding running locally.
  • They tried various models (e.g., Qwen 3.5 122B/397B, MiniMax M2.5 AWQ) and found many lacking in reliability or fit, but MiniMax M2.5/2.7 proved an effective "agentic workhorse."
  • Despite MiniMax M2.7 AWQ's new license, the poster says it delivers on planning, understanding issues, developing features, and fixing bugs when paired with verification (tests, playwright-cli, etc.).
  • The reality of local operation is that you can't expect the thoroughness of GPT-5.4/Opus 4.6, but it has reached a "works and gets the job done" level, so the poster no longer depends on cloud providers.
  • On the hardware side, they note a thermal issue: laying the two units flat on a desk works better than stacking them.

Hello,

I've been on a quest to get something "close enough" to Opus 4.5 running locally for agentic coding, as a SWE with 15 years of experience.

I tried with one spark (yeah, I'm calling my Asus Ascent GX10s "sparks" - they're the same) with models like Qwen 3.5 122B-A10B, Qwen3-Coder-Next, M2.5-REAP, ... Nothing was scratching the itch - too much frustration. 128GB is simply not enough (for me) right now.

So I bought a second one (I paid 2800€ for the first, 2500€ for the second, plus a 60€ cable - 5360€ total - excluding VAT, since it's a business expense and I get the VAT back).

First I tried Qwen 3.5 397B-A17B, thinking it would be "it". But it's not. It's not bad, it's just not up to the task of being a reliable agentic coworker. I found it a bit too eager to say "it's done!".

Then I tried MiniMax M2.5 AWQ. 130GB for the Q4 version. Lots of room for KV-cache. It's slower than Qwen 3.5 397B-A17B and doesn't have vision.
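For a sense of what "lots of room for KV-cache" means here, a back-of-envelope budget under stated assumptions (the weight size is from the post; the overhead figure and the per-token KV parameters are placeholder guesses, not MiniMax's actual config):

```python
# Back-of-envelope memory budget for the 2x GX10 setup.
# Only weights_gb (130 GB, Q4 AWQ) comes from the post; the rest are assumptions.
total_gb = 2 * 128          # two GX10 units, 128 GB unified memory each
weights_gb = 130            # MiniMax M2.5 AWQ Q4 weights, as stated
overhead_gb = 16            # OS + inference runtime headroom (assumed)

kv_budget_gb = total_gb - weights_gb - overhead_gb
print(f"KV-cache budget: ~{kv_budget_gb} GB")

# Per-token KV size = layers * kv_heads * head_dim * 2 (K and V) * bytes/elem.
# These model-config values are illustrative placeholders.
layers, kv_heads, head_dim, bytes_per = 60, 8, 128, 2   # fp16 cache (assumed)
kv_per_token_mb = layers * kv_heads * head_dim * 2 * bytes_per / 1024**2
print(f"~{kv_per_token_mb:.2f} MB/token -> "
      f"~{kv_budget_gb * 1024 / kv_per_token_mb:,.0f} tokens of context")
```

The point of the exercise: once the quantized weights fit with room to spare, the leftover unified memory translates directly into usable context length.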

But oh boy is it a good agentic workhorse.

Then came M2.7 with its new license (clearly aimed at shady inference providers, which I agree with - not aimed at us), and while it's not night and day compared with M2.5, it's the best model I've used.

I've set it up with my own harness (an OpenCode-like interface that I've customized for my use case), and as long as I give it a way to verify its work (either through tests or through the playwright-cli), it delivers.

It's amazing at planning, understanding issues, developing new features, fixing bugs... All the things you'd expect.

Sure it's not perfect, but it IS close enough and fast enough. It does frustrate me from time to time, just like proprietary SOTA models do as well.

That does require you to readjust your expectations a bit, though: you can't expect the same thoroughness as GPT-5.4 or the sheriff attitude of Opus 4.6. It's different, it's local, but it WORKS.

So I'm calling it, cloud providers are dead to me. 2x Spark is a great setup and with M2.7 I've got a solid agent working for me.

(They actually have quite bad thermals - stacking them is not optimal, so they now lie flat on the desk.)

PS: I have to pay my respects to the MiniMax team. They understand how to pack a great SWE into 229B parameters - while GLM-5.1 sits at 754B (40B active) and Kimi K2.5 at 1T (32B active), these guys understand compute. It's a win to have such a smart agent in such a "small" footprint. They don't do it for us; they do it for themselves, to provide great inference without as much compute as OpenAI/Anthropic/ZAI/Moonshot.

---

submitted by /u/t4a8945