Best coding LLM for Mac Mini M4 16GB? Currently using Qwen 3.5 9B

Reddit r/LocalLLaMA / 3/31/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • The post asks for recommendations of the best coding-focused local LLMs that can run smoothly on a Mac mini M4 with 16GB RAM and 256GB storage.
  • The author is currently using Qwen 3.5 9B, which works well for small coding tasks, quick fixes, and code explanations, but struggles with bigger files, multi-step logic changes, and more complex debugging.
  • A key constraint is that models larger than 9B do not run smoothly on their current hardware configuration.
  • The community is invited to suggest better coding models under 9B, including quantized options that work well with llama.cpp, Ollama, and Cline.
  • The intent is to gather real-world, hardware-constrained guidance rather than purely theoretical model comparisons.

Hey everyone,

I’m using a Mac Mini M4 (16GB RAM, 256GB) for local coding LLMs.

Right now I’m using Qwen 3.5 9B, and honestly it’s super good for its size. It works really well for small coding tasks, quick fixes, and code explanations.

But when it comes to medium-level tasks like handling bigger files, multi-step logic changes, or slightly complex debugging, performance drops off noticeably.

The main limitation is that I can’t run models larger than ~9B smoothly on this setup: a 9B model at Q4 is already roughly 5–6GB of weights before the KV cache, out of 16GB of unified memory shared with macOS.

So I wanted suggestions from people using similar hardware:

  • Which model gives the best coding performance under 9B?
  • Is there any model better than Qwen 3.5 9B for coding in this size range?
  • Any good quantized model recommendations for llama.cpp / Ollama / Cline? (Rough sketch of my current setup below for reference.)
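
For context, here’s roughly how I’m loading quantized GGUF models right now, via llama-cpp-python. This is just a minimal sketch: the model file, quant level, and settings are placeholders I’d expect you to swap out, not a recommendation.

```python
# Minimal llama-cpp-python sketch for a 16GB M4 Mac mini.
# Assumptions: llama-cpp-python installed with Metal support, and a
# Q4_K_M GGUF already downloaded -- the path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen2.5-coder-7b-instruct-q4_k_m.gguf",  # placeholder file
    n_ctx=8192,       # modest context window to leave headroom in 16GB unified memory
    n_gpu_layers=-1,  # offload all layers to the GPU via Metal on Apple Silicon
)

resp = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
    max_tokens=256,
)
print(resp["choices"][0]["message"]["content"])
```

I keep n_ctx modest because the KV cache grows with context length and competes with the model weights for the same 16GB of unified memory, so on this hardware the quant level and context size matter almost as much as the model choice.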

Would love to hear real-world suggestions.

submitted by /u/host3000