AI MAX 395+ w/ 128 GB or dual 3090s?

Reddit r/LocalLLaMA / 4/13/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The post weighs an AI MAX 395+ setup with 128 GB of unified memory against a dual NVIDIA RTX 3090 build, focusing on practical inference speed rather than just maximum model size.
  • The author argues that when larger-context or bigger-model capabilities are needed, cloud-hosted LLMs are often a more efficient path than paying for local hardware meant for infrequent peak needs.
  • Dual 3090s are presented as offering a better balance of speed and model capacity for real-time tasks, particularly agentic workflows.
  • The thread asks other users for opinions, implying there is no clear consensus and decisions likely depend on workload patterns and latency requirements.

I like the idea of the 395+ with 128 GB of VRAM, but the inference speed with bigger models just makes it seem like it's not worth it. I feel like if you ever need the capabilities of a bigger model, you can just use a cloud LLM to do so.

Whereas with dual 3090s, you get a decent-sized model with lots of speed, which is far better for use cases such as agentic workflows.
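
For rough intuition on the speed gap, here is a minimal back-of-envelope sketch, assuming token generation is memory-bandwidth-bound (tokens/s is capped at roughly bandwidth divided by the bytes of weights read per token). The bandwidth figures (~256 GB/s for the 395+'s LPDDR5X, ~936 GB/s per 3090) and quantized model sizes are ballpark assumptions, not benchmarks, and it ignores KV cache, prompt processing, and parallelism overheads:

```python
# Napkin estimate of decode throughput, assuming generation is memory-bound:
# tokens/s <= (per-device bandwidth x devices) / bytes of weights per token.
# All numbers below are rough assumptions, not measurements.

def decode_toks_per_sec(bw_gbps: float, devices: int, model_gb: float) -> float:
    """Upper-bound tokens/s for memory-bandwidth-bound decoding."""
    return bw_gbps * devices / model_gb

setups = {
    "AI MAX 395+ (LPDDR5X, ~256 GB/s)": (256, 1),   # unified memory
    "dual RTX 3090 (~936 GB/s each)":   (936, 2),   # ideal tensor parallelism
}
models = {
    "70B @ Q4 (~40 GB)": 40,
    "32B @ Q4 (~19 GB)": 19,
}

for setup, (bw, n) in setups.items():
    for model, gb in models.items():
        print(f"{setup:34s} {model:18s} ~{decode_toks_per_sec(bw, n, gb):6.1f} tok/s")
```

On these assumptions a 70B Q4 model tops out around 6 tok/s on the 395+ versus ~40+ tok/s on the 3090 pair, which is the crux of the tradeoff: the 395+ can hold models the 3090s can't, but the models both can hold run much faster on the 3090s.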

What do you guys think?

submitted by /u/Engineering_Acq