Lowkey disappointed with 128GB MacBook Pro

Reddit r/LocalLLaMA / 4/5/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A user asks how others are using an M5 Max 128GB MacBook Pro for local coding LLMs, noting that the 14-inch form factor is unlikely to be the issue.
  • They report that Cursor’s “auto” model beats every Qwen and GLM model they have downloaded locally, suggesting the gap lies in model choice or configuration rather than raw hardware.
  • They are hoping for community setup guidance because local generation starts at roughly 50 tok/s and then slows dramatically (a minimal diagnostic sketch follows this list).
  • They describe themselves as new to local LLM deployment and tuning, and ask for a beginner-friendly response.
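
For context on the slowdown: on Apple Silicon, a first-fast-then-slow pattern often points to model weights or the KV cache being paged out under memory pressure, or simply to the context growing between requests. The sketch below is a minimal starting point for measuring sustained throughput, not the poster’s actual setup; it assumes llama-cpp-python built with Metal support, and the model path and prompt are placeholders.

```python
# Minimal local-throughput sketch for Apple Silicon, assuming llama-cpp-python
# built with Metal support (pip install llama-cpp-python). The model path and
# prompt are placeholders, not the poster's actual files.
import time

from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen2.5-coder-32b-instruct-q4_k_m.gguf",  # placeholder
    n_ctx=8192,       # cap the context; unbounded context growth slows decoding
    n_gpu_layers=-1,  # offload all layers to the Metal GPU
    use_mlock=True,   # lock weights in RAM so macOS cannot page them out
    verbose=False,
)

prompt = "Write a Python function that parses an ISO 8601 timestamp."
start = time.perf_counter()
out = llm(prompt, max_tokens=256)
elapsed = time.perf_counter() - start

n_new = out["usage"]["completion_tokens"]
print(f"{n_new} tokens in {elapsed:.1f}s -> {n_new / elapsed:.1f} tok/s")
```

If throughput still degrades across repeated runs while Activity Monitor shows rising memory pressure, a smaller quantization of the same model is usually the next thing to try.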

How are you guys using your M5 Max 128GB Pros? I have a 14-inch, and I doubt the size is the issue, but I can’t seem to find any coding models that make sense locally. The “auto” model on Cursor outperforms any of the Qwen and GLM models I’ve downloaded. I haven’t tried the new Gemma yet, but mainly I’m just hoping someone could share their setup, because I’m getting around 50 tok/s at first and then it gets unbelievably slow. I’m super new to this, so please go easy on me 🙏

submitted by /u/F1Drivatar