Currently using 6x RTX 3080 - Moving to Strix Halo or Nvidia GB10?

Reddit r/LocalLLaMA / 3/13/2026



Key Points

The author operates a 6x RTX 3080 20GB GPU server and is seeking ways to reduce power consumption for around-the-clock use.
They are considering Strix Halo or Nvidia GB10 DGX Spark clones as replacements, noting bandwidth and compute power trade-offs.
GB10 advantages include potential FP4 performance and preserving the CUDA environment, but expansion is limited to a single M.2 SSD (likely needing a second GB10 for more capacity).
Strix Halo (Ryzen AI Max+ 395) systems cost about half as much and may allow adding a second GPU via a PCIe slot or an additional M.2 slot, but there are concerns about the Vulkan/ROCm ecosystem and multi-GPU complexity.
The post also mentions the Apple M5 Max in the MacBook Pro as a potential alternative with favorable prompt-processing (PP) values, and asks for experiences and hints from others.

I am from a country where electricity is expensive. I really like my 6x RTX 3080 20GB GPU server, but its power consumption is quite intense, especially when it runs 24/7 or even 14 hours a day, 7 days a week.
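
To put the 24x7 versus 14x7 difference into numbers, here is a minimal sketch of the yearly electricity cost. The post gives neither the average power draw nor the price per kWh, so `avg_watts` and `price_per_kwh` in the example are purely hypothetical placeholders to be replaced with measured values.

```python
def yearly_energy_cost(avg_watts: float, hours_per_day: float,
                       days_per_week: float, price_per_kwh: float) -> float:
    """Return the yearly electricity cost for a machine with a given average draw.

    All inputs are assumptions supplied by the caller; nothing here is
    measured from the server described in the post.
    """
    hours_per_year = hours_per_day * days_per_week * 52  # 52 weeks per year
    kwh_per_year = avg_watts / 1000 * hours_per_year     # W -> kW, times hours
    return kwh_per_year * price_per_kwh


if __name__ == "__main__":
    # Hypothetical example values: 1000 W average draw, 0.40 per kWh.
    for label, hours in (("24x7", 24), ("14x7", 14)):
        cost = yearly_energy_cost(1000, hours, 7, 0.40)
        print(f"{label}: {cost:.2f} per year")
```

The same function can be called with the expected average draw of a Strix Halo or GB10 box to estimate how long the savings would take to pay for the new hardware.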

I have long been thinking about buying a Strix Halo (yeah, their prices have gone up), or even a DGX Spark or one of its cheaper clones. It's clear to me that I would be giving up compute power, and the memory bandwidth is indeed lower as well.
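
Why lower memory bandwidth matters can be sketched with a common rule of thumb: for a dense model, each generated token has to read all weights once, so token generation speed is bounded by bandwidth divided by model size. This is a simplification (it ignores KV cache reads, MoE sparsity, and multi-GPU splitting), and the numbers in the example are hypothetical, not measurements of any of the machines discussed here.

```python
def max_decode_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on token generation speed for a dense model.

    Assumes every weight byte is read exactly once per generated token,
    which is a simplification and only gives a ceiling, not a prediction.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Hypothetical values: 256 GB/s memory bandwidth, 32 GB of quantized weights.
    print(max_decode_tokens_per_second(256.0, 32.0))
```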

Since I am using more and more agents that can run around the clock, very fast token generation is not that important to me. Prompt processing, however, matters more and more, because the context keeps growing with agentic use cases.
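
A rough way to see why prompt processing starts to dominate is to split the time of one request into a prefill part and a decode part. The sketch below assumes constant prompt-processing and token-generation speeds and no prompt caching; the token counts and speeds in the example are made up for illustration.

```python
def request_seconds(prompt_tokens: int, output_tokens: int,
                    pp_tokens_per_s: float, tg_tokens_per_s: float) -> tuple[float, float]:
    """Return (prefill_seconds, decode_seconds) for a single request.

    Simplified model: constant speeds, no prompt cache reuse between calls.
    """
    prefill = prompt_tokens / pp_tokens_per_s  # time to process the prompt
    decode = output_tokens / tg_tokens_per_s   # time to generate the answer
    return prefill, decode


if __name__ == "__main__":
    # Hypothetical agent step: 100k tokens of context, 500 generated tokens,
    # 500 t/s prompt processing, 20 t/s token generation.
    prefill, decode = request_seconds(100_000, 500, 500.0, 20.0)
    print(f"prefill: {prefill:.0f} s, decode: {decode:.0f} s")
```

With these made-up numbers the prefill takes several times longer than the decode, which is the situation described above: with long agentic contexts, a machine with slower token generation but faster prompt processing can still finish each step sooner.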

My thoughts:

GB10 (Nvidia DGX Spark or Clones)

- Performance may be good when using FP4, while still keeping fair quality
- Keeps the CUDA environment
- Expansion is limited by the single, short M.2 SSD slot, except for buying a second GB10

Strix Halo / Ryzen AI Max+ 395

- Nearly 50% cheaper than GB10 clones
- Possibly a hacky way to add a second GPU, as many models offer a PCIe slot (Minisforum, Framework) or a second x4 M.2 slot (Bosgame M5), to increase capacity and speed when tuning the split modes
- I am wary of the Vulkan/ROCm ecosystem, and of multi-GPU setups if they become necessary

Bonus thoughts: What will Apple release in the summer? The M5 Max in the MacBook Pro (see Alex Ziskind's videos) showed that even the non-Ultra Macs offer quite nice prompt-processing (PP) values compared to Strix Halo and GB10.

What are your thoughts on this, and what hints and experiences could you share with me?

submitted by /u/runsleeprepeat