Lowkey disappointed with 128GB MacBook Pro

Reddit r/LocalLLaMA / 4/5/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A user asks how others are using an M5 Max 128GB MacBook Pro for local coding LLMs, noting that the 14-inch form factor is unlikely to be the issue.
  • They report that Cursor’s “auto” model beats every Qwen and GLM model they have downloaded locally, suggesting the gap lies in model choice or configuration rather than raw hardware.
  • They are hoping for community setup guidance because local generation starts at roughly 50 tok/s and then slows dramatically (a minimal diagnostic sketch follows this list).
  • They describe themselves as new to local LLM deployment and tuning, and ask for a beginner-friendly response.
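
For context on the slowdown: on Apple Silicon, a first-fast-then-slow pattern often points to model weights or the KV cache being paged out under memory pressure, or simply to the context growing between requests. The sketch below is a minimal starting point for measuring sustained throughput, not the poster’s actual setup; it assumes llama-cpp-python built with Metal support, and the model path and prompt are placeholders.

```python
# Minimal local-throughput sketch for Apple Silicon, assuming llama-cpp-python
# built with Metal support (pip install llama-cpp-python). The model path and
# prompt are placeholders, not the poster's actual files.
import time

from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen2.5-coder-32b-instruct-q4_k_m.gguf",  # placeholder
    n_ctx=8192,       # cap the context; unbounded context growth slows decoding
    n_gpu_layers=-1,  # offload all layers to the Metal GPU
    use_mlock=True,   # lock weights in RAM so macOS cannot page them out
    verbose=False,
)

prompt = "Write a Python function that parses an ISO 8601 timestamp."
start = time.perf_counter()
out = llm(prompt, max_tokens=256)
elapsed = time.perf_counter() - start

n_new = out["usage"]["completion_tokens"]
print(f"{n_new} tokens in {elapsed:.1f}s -> {n_new / elapsed:.1f} tok/s")
```

If throughput still degrades across repeated runs while Activity Monitor shows rising memory pressure, a smaller quantization of the same model is usually the next thing to try.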

How are you guys using your M5 Max 128GB Pros? I have a 14-inch, and I doubt the size is the issue, but I can’t seem to find any coding models that make sense locally. The “auto” model on Cursor outperforms any of the Qwen and GLM models I’ve downloaded. I haven’t tried the new Gemma yet, but mainly I’m just hoping someone could share their setup, because I’m getting around 50 tok/s at first and then it gets unbelievably slow. I’m super new to this, so please go easy on me 🙏

submitted by /u/F1Drivatar