So I recently bought a Mac (M2 Max) with local LLM use in mind. I did my research, and everywhere everyone was saying to go for the larger RAM option or I'd regret it later... so I did.
Time to choose a model:
"Okay, nice model: Qwen3.5 35b a3b running an 8-bit quant, speedy even at full context size. -> Performance-wise it's mediocre, though, especially for more sophisticated agentic use."
"Hmm, let me look for better options. I have 64 GB, so maybe there's a smarter model out there. Qwen3.5 27b MLX running at a 4-bit quant (also full context size) gives me just the performance I need, since it's a dense model. -> The catch is that, surprise surprise, it's slow, so the agent takes up to 10 minutes just to create a folder structure."
So the dream would be something like a 60-70B model with 7-9B active parameters, but there is none.
Essentially, these mid-size models sit in an awkward middle ground: too big to run comfortably on consumer hardware, but not powerful enough to compete with the "frontier" giants.
It seems like there really is a gap between the mediocre models (27/35B) and the "good" ones (>100B) because of that.
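For anyone curious why 64 GB lands right in that gap, here's a rough back-of-the-envelope sketch. The formula (weights ≈ params × bits / 8) is the standard approximation; the ~20% overhead factor for KV cache and runtime is a loose assumption of mine, not a measured number:

```python
# Rough RAM estimate for a quantized model:
# weights ≈ params (billions) * bits / 8 -> GB, plus overhead.
def est_ram_gb(params_b: float, quant_bits: int, overhead: float = 1.2) -> float:
    """Very rough RAM footprint in GB (overhead factor is an assumption)."""
    return params_b * quant_bits / 8 * overhead

for name, params_b, bits in [
    ("35B @ 8-bit", 35, 8),
    ("27B @ 4-bit", 27, 4),
    ("70B @ 4-bit", 70, 4),
    ("120B @ 4-bit", 120, 4),
]:
    print(f"{name}: ~{est_ram_gb(params_b, bits):.0f} GB")
```

By this estimate the ~30B class fits easily and even a dense 70B at 4-bit squeezes in (but runs slowly, as above), while the >100B models blow past 64 GB before you even account for macOS reserving part of unified memory for the system.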
And my RAM size (and performance) fits exactly into this gap, yippee 👍
But who knows what the future might hold, especially with Google's research on turbo quant.
What do you guys think, or even recommend?




