Hey everyone,
I’m using a Mac Mini M4 (16GB RAM, 256GB) for local coding LLMs.
Right now I’m running Qwen 3.5 9B, and honestly it’s very good for its size. It handles small coding tasks, quick fixes, and code explanations well.
But for medium-complexity tasks like editing bigger files, making multi-step logic changes, or doing slightly involved debugging, its performance drops off noticeably.
The main limitation is that I can’t run models larger than 9B smoothly on this setup.
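For anyone wondering why ~9B is the practical ceiling on 16GB: a common rule of thumb is that a quantized GGUF file takes roughly params × bits-per-weight ÷ 8 bytes, plus extra for the KV cache and the OS. A minimal sketch of that arithmetic (the 4.5 bits/weight figure is a ballpark assumption for a Q4_K_M-style quant, not an exact spec):

```python
def quantized_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Rough model file size estimate: params (billions) * bits / 8, in GB."""
    return params_b * bits_per_weight / 8

# Assumed ~4.5 bits/weight for a Q4_K_M-style quant (ballpark, not exact).
# A 9B model comes out around 5 GB, which leaves headroom for the OS,
# context/KV cache, and other apps on a 16 GB machine; a 14B model at the
# same quant would already be ~8 GB before any cache.
print(round(quantized_size_gb(9, 4.5), 1))   # ~5.1 GB
print(round(quantized_size_gb(14, 4.5), 1))  # ~7.9 GB
```

So on paper a 14B quant might load, but once you add a usable context window it gets tight fast, which matches what I’m seeing.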
So I wanted suggestions from people using similar hardware:
- Which model gives the best coding performance under 9B?
- Is there any model better than Qwen 3.5 9B for coding in this size range?
- Any good quantized model recommendations for llama.cpp / Ollama / Cline on this kind of hardware?
Would love to hear real-world suggestions.