
Qwen 3.5: do I go dense or go bigger with MoE?

Reddit r/LocalLLaMA / 3/18/2026

💬 Opinion · Developer Stack & Infrastructure · Models & Research

Key Points

  • The discussion weighs two scaling paths: moving up to ~120B MoE models by adding VRAM, or speeding up a smaller dense model by upgrading memory bandwidth.
  • The author currently runs Qwen3.5 35B-A3B and 27B variants on a dual AMD 7900 XT setup with roughly 40 GB of VRAM, but finds performance slower than desired for day-to-day coding tasks.
  • Upgrade options include a memory-over-bandwidth path (dual AMD 9700 AI Pro: 64 GB VRAM at 640 GB/s) to fit very large MoE models, or a bandwidth-over-memory path (a single RTX 5090 at ~1800 GB/s) to speed up the 27B model.
  • They are seeking practical advice on which path provides better real-world gains for their workload, weighing larger MoE models against a faster, more compact dense model.

I have a workstation with dual AMD 7900 XTs, so 40 GB of VRAM at 800 GB/s. It runs the likes of Qwen3.5 35B-A3B, a 3-bit version of Qwen-Coder-Next, and Qwen3.5 27B, slowly.

I love 27B; it's almost good enough to replace a subscription for day-to-day coding for me (the things I code are valuable to me but not extremely complex). The speed isn't amazing though… I'm of two minds here: I could either go bigger and reach for the 122B Qwen (and the NVIDIA and Mistral models…), or I could try to speed up the 27B. My upgrade paths:

Memory over bandwidth: dual AMD 9700 AI Pro, 64 GB VRAM and 640 GB/s bandwidth. Great for 3-bit versions of those ~120B MoE models

Bandwidth over memory: a single RTX 5090 with 1800 GB/s bandwidth, which would mean a fast Qwen3.5 27B
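For a rough sense of how the two paths compare, decode speed on a memory-bound model is roughly memory bandwidth divided by the bytes of weights read per token (active parameters × bytes per weight). Below is a minimal back-of-envelope sketch in Python; the 4-bit quant for the dense 27B and the ~10B active parameters for the ~120B MoE are illustrative assumptions, not published specs, and real throughput lands well below these ceilings (especially when split across two cards):

```python
# Back-of-envelope decode-speed ceiling for memory-bound inference:
# tokens/s ≈ bandwidth / (active params × bytes per weight).
# All model figures below are illustrative assumptions, not benchmarks.

def tokens_per_sec(bandwidth_gb_s: float, active_params_b: float,
                   bits_per_weight: float) -> float:
    """Theoretical upper bound on decode tokens/sec."""
    gb_read_per_token = active_params_b * (bits_per_weight / 8)
    return bandwidth_gb_s / gb_read_per_token

# Memory path: dual 9700 AI Pro (640 GB/s), ~120B MoE at 3-bit,
# assuming ~10B active parameters per token (a guess, not a spec).
print(f"~120B MoE @ 3-bit, 640 GB/s:  ~{tokens_per_sec(640, 10, 3):.0f} tok/s")

# Bandwidth path: single RTX 5090 (1800 GB/s), dense 27B at 4-bit;
# a dense model reads all 27B weights every token.
print(f"27B dense @ 4-bit, 1800 GB/s: ~{tokens_per_sec(1800, 27, 4):.0f} tok/s")

# Current rig: dual 7900 XT (800 GB/s), same dense 27B at 4-bit.
print(f"27B dense @ 4-bit, 800 GB/s:  ~{tokens_per_sec(800, 27, 4):.0f} tok/s")
```

The point of the sketch: a big MoE can keep pace on lower-bandwidth cards because only the active experts are read per token, so the answer hinges as much on the 122B model's active-parameter count as on raw bandwidth.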

Any advice?

submitted by /u/Alarming-Ad8154