"$25k in hardware. Tell me what you want me to load on them and I'll help test. Currently running GLM 5.1 Q4 on each (troubleshooting why exo isn't loading the Q8 version). Patiently awaiting Kimi 2.6 for when the community optimizes it for MLX/mmap."
2x 512GB RAM M3 Ultra Mac Studios
Reddit r/LocalLLaMA / 4/21/2026
💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage
Key Points
- A Reddit user describes investing about $25k in two M3 Ultra Mac Studios with 512GB of RAM each and offers to test whatever workloads the community requests.
- They report running GLM 5.1 at Q4 on each machine and are troubleshooting why the Exo backend isn't loading the Q8 version.
- The user is awaiting Kimi 2.6 and expects the community to optimize it for MLX/mmap, indicating ongoing experimentation with local LLM inference.
- The post emphasizes practical, hands-on testing of local LLM models and quantization behavior on high-memory Apple Silicon systems.
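The interplay between quantization level and the 512GB-per-machine ceiling can be made concrete with a back-of-envelope calculation. The sketch below is a hypothetical helper (not from the post); the 600B parameter count and the 4.5 bits-per-weight figure for Q4 (to account for per-block quantization metadata) are illustrative assumptions, and KV cache and runtime overhead are ignored.

```python
def quant_weight_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of model weights alone, in GB.

    1e9 params * (bits/8) bytes per param = n_params_billion * bits/8 GB.
    Ignores KV cache, activations, and quantization block metadata,
    all of which add real overhead in practice.
    """
    return n_params_billion * bits_per_weight / 8

# Hypothetical 600B-parameter model (illustrative size, not GLM 5.1's):
print(quant_weight_gb(600, 8.0))  # Q8: 600.0 GB -> exceeds one 512GB machine
print(quant_weight_gb(600, 4.5))  # Q4 incl. overhead: 337.5 GB -> fits on one
```

Under these assumptions, a Q8 checkpoint of a model in this class only loads if Exo can shard it across both Studios, while the Q4 variant fits comfortably on a single machine, which matches the user's reported situation.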
Related Articles
- Black Hat USA (AI Business)
- Black Hat Asia (AI Business)
- Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption (Dev.to)
- 10 ChatGPT Prompts Every Genetic Counselor Should Be Using in 2025 (Dev.to)
- The Memory Wall Can't Be Killed — 3 Papers Proving Every Architecture Hits It (Dev.to)