Qwen 3.6 35b a3b Q4 vs qwen 3.6 27b q6, on m5 pro 64gb

Reddit r/LocalLLaMA / 4/26/2026

💬 OpinionDeveloper Stack & InfrastructureSignals & Early TrendsTools & Practical Usage

Key Points

  • A user compared Qwen 3.6 35B A3B (4-bit) and Qwen 3.6 27B Q6/UD (6-bit) on a MacBook Pro M5 Pro 64GB using LM Studio with the MLX runtime and a 128K context window.
  • Speed benchmarks showed the 35B A3B model running about 70–72 tokens/sec (~11–16s for 800–1200 tokens), while the 27B variant ran about 9 tokens/sec (~32–70s), making the 35B roughly 8x faster in these tests.
  • On a four-task coding benchmark (auth hook, conflict resolution, delete account flow, and bug identification), the 35B A3B scored 9.8/10 overall versus 8.75/10 for the 27B model.
  • The tester concludes that on 64GB Apple Silicon for coding tasks, the 35B A3B delivers both higher quality and substantially better performance than the smaller dense 27B model, despite expectations about dense-model reasoning.
  • The results were generated with prompts curated in Claude, and the author notes they are not an expert, so configurations may not be optimal and feedback on improvements is requested.

Tried to test the two versions of models in my own m5 pro 64, curated the results on claude, not an expert so settings/config might not be the best. do share what results or improvements that can be attempted. test prompts were generated in claude for testing purposes.

Qwen3.6 35B A3B vs 27B UD — M5 Pro 64GB benchmark

Hardware: MacBook Pro M5 Pro 18-core · 64GB unified memory · LM Studio · MLX runtime · thinking OFF (/no_think) · 128K context

Specs

35B A3B MLX 4bit 27B UD MLX 6bit
Model size ~21.7GB ~30.5GB
Architecture MoE — 3B active/token Dense — 27B active/token
RAM at 128K ctx ~27GB ~38GB

Speed

Test 35B A3B 27B UD
800 token test ~72 tok/s · 11s ~9 tok/s · 32s
1200 token test ~70 tok/s · 16s ~9 tok/s · 70s
Advantage 8x faster baseline

Intelligence — 4-task coding benchmark

Task 35B A3B 27B UD
Auth hook (useRequireAuth) 9.5/10 — typed, mounted cleanup 8/10 — used any, no cleanup
Conflict resolution (500ms rules) 10/10 10/10
Delete account (ordered ops) 10/10 10/10
Bug identification (syncBatch) 10/10 — found 3 bugs + improvements 7/10 — found 1 bug
Overall 9.8/10 8.75/10

Test prompt: 4 coding tasks · max_tokens 1200 · temp 0.6 · /no_think system prompt

Verdict: 35B A3B wins on both speed and quality for coding tasks on 64GB Apple Silicon. 27B is slower (8x) and didn't demonstrate the reasoning depth advantage expected from a dense model on these tasks.

wanted to have some number/references when i was looking for mac to get, hopefully this helps someone out there.

submitted by /u/skyyyy007
[link] [comments]