Why MoE below A10B feels like I'm gambling

Reddit r/LocalLLaMA / 4/22/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The author says recent Mixture-of-Experts (MoE) models can be fast, but they often suffer from lower coherence unless they have a sufficiently high active parameter count (around 10B+ active per token).
  • In their experience, MoE code models like qwen3-coder-next and the qwen3.x 35B variants did not match the stability they saw with qwen3.5-27B.
  • They report that a smaller MoE (e.g., A3B) may require more “hand-holding” and multi-turn steering, partly because it tries to use tools from the coding harness that are not relevant to the task.
  • The author observes higher variability in next actions for the 35B-A3B MoE compared with a 27B dense model, making it harder to integrate into an agentic workflow.
  • Overall, they like MoE in principle but struggle to find a reliable use case when MoE sizes fall below A10B, feeling the results are closer to “gambling.”
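
The "higher variability in next actions" point can be made concrete: run the same task several times and compare the spread of the model's first action. A minimal sketch below uses Shannon entropy over first-action choices; the action logs are made-up illustrations, not real model outputs, and the tool names are hypothetical.

```python
# Quantify run-to-run variability of an agent's first action by
# sampling N runs of the same task and computing the entropy of the
# first-action distribution. Low entropy = predictable; high = erratic.
from collections import Counter
import math

def action_entropy(actions):
    """Shannon entropy (bits) of the observed action distribution."""
    counts = Counter(actions)
    total = len(actions)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical first actions over 8 identical runs of the same bug-fix task:
dense_27b = ["read_file"] * 7 + ["run_tests"]            # mostly consistent
moe_a3b   = ["read_file", "web_search", "edit_file",
             "read_file", "browse_url", "run_tests",
             "web_search", "edit_file"]                  # all over the place

print(round(action_entropy(dense_27b), 2))  # → 0.54 (predictable)
print(round(action_entropy(moe_a3b), 2))    # → 2.25 (closer to "gambling")
```

A dense model that almost always starts by reading the file scores near zero; a model that scatters across five different tools scores much higher, which is one way to put a number on the "gambling" feeling.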

We've seen lots of MoEs coming out recently. While these do phenomenal work at speed, you pay the price in coherence... unless the MoE has at least 10B active parameters per token.
I code with these models often and have tried many different ones; the most recent I've found are:
qwen3-coder-next, qwen3.5-35b, qwen3.6-35b
and none of them come close to the level of stability I witnessed in qwen3.5-27b, not even qwen3.6-35b-A3b??

While the A3B MoE can solve the problem, it often needs hand-holding and multi-turn steering. The A3B frequently tries to use tools available in the coding harness that don't apply to the problem it's trying to fix, so I often have to manually disable some tools to keep it focused, while the 27B would intuitively and successfully ignore the irrelevant tools, etc. This is just one example. But the variability in what the model will choose to do next is hugely varied with the 35b-A3b compared to the 27b dense. I would like to use the MoE, but I'm struggling to find a use case for where I would put it in my agentic workflow.
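
The "manually disable some tools" workaround can be done in code rather than by hand: filter the tool list before it reaches the model, so the smaller MoE can't wander into irrelevant tools. The sketch below assumes the common OpenAI-style function-calling tool schema; the tool names themselves are made up for illustration.

```python
# Restrict the tool set handed to a small MoE agent: keep only the
# tools relevant to the current task so the model can't drift into
# irrelevant ones. Tool entries follow the OpenAI-style "tools" schema.

ALL_TOOLS = [
    {"type": "function", "function": {"name": "read_file", "description": "Read a file from the repo"}},
    {"type": "function", "function": {"name": "edit_file", "description": "Apply a patch to a file"}},
    {"type": "function", "function": {"name": "run_tests", "description": "Run the test suite"}},
    {"type": "function", "function": {"name": "web_search", "description": "Search the web"}},
    {"type": "function", "function": {"name": "browse_url", "description": "Fetch a web page"}},
]

def restrict_tools(tools, allowed):
    """Keep only the tools whose names are in `allowed`."""
    return [t for t in tools if t["function"]["name"] in allowed]

# For a pure bug-fix task, drop the web tools before calling the model:
coding_tools = restrict_tools(ALL_TOOLS, {"read_file", "edit_file", "run_tests"})
print([t["function"]["name"] for t in coding_tools])
# → ['read_file', 'edit_file', 'run_tests']
```

Passing the filtered list in place of the full one on each request effectively does the "disable tools" step automatically, per task, instead of toggling them in the harness UI.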

Edit: English is hard, but you get what I'm saying? At least I'll leave the typos as proof this isn't a bot account. LOL

submitted by /u/Express_Quail_1493