Qwen 3.6 35B A3B Q4 tips

Reddit r/LocalLLaMA / 4/24/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A Reddit user shares their local setup using opencode CLI with LM Studio to run Qwen 3.6 35B (A3B, Q4) on a MacBook Pro (64GB RAM), reporting roughly 55–70 tokens per second and about 35GB RAM usage.
  • They observe that, with Codex reviewing Qwen's output, Qwen reaches around 90% completion quality but sometimes overlooks one or two details.
  • The post asks for practical tips to improve code quality further, including whether the user might be better off switching to Qwen 3.6 27B instead.
  • Overall, the discussion is framed as troubleshooting and optimization of local LLM-assisted coding workflows rather than announcing any new model or tool release.

Currently using the opencode CLI with LM Studio and Qwen 3.6 35B A3B (Q4), running on a MacBook Pro (64GB) at 55–70 tps; RAM usage is about 35GB.
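As a rough sanity check on the ~35GB figure, here is a back-of-envelope sketch. The 4.5 bits/weight figure is my own assumption for a Q4 quantization once scales and zero-points are included, not something stated in the post:

```python
# Back-of-envelope weight footprint for a quantized model.
# Assumption (not from the post): Q4 costs ~4.5 bits/weight
# after quantization scales/zero-points are counted.

def q4_weight_gb(params_billions: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight memory in GB for a quantized model."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

weights = q4_weight_gb(35)
print(f"Q4 weights: ~{weights:.1f} GB")  # ~19.7 GB for the weights alone
```

Under these assumptions the weights alone come to roughly 20GB, so the reported ~35GB total plausibly includes the KV cache, context buffers, and runtime overhead on top of the quantized weights.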

With this setup, and with Codex reviewing the work produced by Qwen, Qwen achieves about 90% of the desired completion quality but tends to overlook one or two things.
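A generate-then-review loop like the one described can be sketched against LM Studio's OpenAI-compatible server (it listens on port 1234 by default). The model name, prompts, and helper names below are placeholders of mine, not details from the post, and the reviewer here is the same local model rather than Codex:

```python
import json
import urllib.request

# LM Studio exposes an OpenAI-compatible chat completions endpoint.
BASE_URL = "http://localhost:1234/v1/chat/completions"

def build_request(model: str, system: str, user: str) -> dict:
    """Build an OpenAI-style chat payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": 0.2,
    }

def chat(payload: dict) -> str:
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Hypothetical two-pass flow: draft, then review the draft.
    task = "Write a function that parses ISO-8601 dates."
    draft = chat(build_request(
        "qwen-3.6-35b-a3b", "You are a coding assistant.", task))
    review = chat(build_request(
        "qwen-3.6-35b-a3b",
        "Review the following code for overlooked edge cases.", draft))
    print(review)
```

The second pass is where the "one or two overlooked things" would ideally surface; a stricter review system prompt (e.g. asking for a checklist of edge cases) is one cheap lever to try before switching models.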

Does anyone have tips on how to further improve the code quality? Am I doing something wrong, or should I try the new Qwen 3.6 27B instead?

submitted by /u/skyyyy007