AI Navigate

What do you think about the possibility of this setup?

Reddit r/LocalLLaMA / 3/21/2026

💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage

Key Points

  • The post proposes a cost-effective setup to locally run decent LLMs using 8 Nvidia V100 GPUs (16 GB each) in a 4028GR-TXRT or SYS-4028GR-TRT chassis, with custom water cooling and running at about 75% power for efficiency.
  • It tallies 128 GB of total VRAM across the GPUs and suggests not loading model weights into system RAM to avoid poor performance.
  • The author claims this configuration would be cheaper than alternatives like an RTX 5090 setup while potentially offering better performance on paper.
  • They ask the community for feedback on whether this is a waste of money and time and whether anyone has successfully tried a similar setup.

I want to locally run decent LLMs. The most cost-effective setup I've thought of is 8x V100 (16 GB) in a 4028GR-TXRT for the 8-way NVLink if I can find a barebones one, or a SYS-4028GR-TRT for 900 USD, with a custom watercooling loop using blocks from AliExpress (they're around 35 USD each), and running the V100s at 75% power or lower for higher efficiency.
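For reference, capping the cards to ~75% power would look something like the sketch below. It assumes 300 W SXM2 V100s (so roughly 225 W per card); PCIe variants default to 250 W, and the TDP figure is an assumption rather than a number from the post.

```python
# Sketch: cap each V100 at ~75% of TDP for better perf-per-watt.
# Assumes 300 W SXM2 cards; verify with `nvidia-smi -q -d POWER`.
import subprocess

NUM_GPUS = 8
TDP_WATTS = 300          # assumed SXM2 default; PCIe V100s default to 250 W
TARGET_FRACTION = 0.75   # the ~75% power target from the post

def set_power_caps() -> None:
    cap = int(TDP_WATTS * TARGET_FRACTION)  # 225 W per card
    for gpu in range(NUM_GPUS):
        # nvidia-smi -pl sets the board power limit in watts (needs root)
        subprocess.run(["nvidia-smi", "-i", str(gpu), "-pl", str(cap)], check=True)
    print(f"Capped {NUM_GPUS} GPUs at {cap} W each (~{NUM_GPUS * cap} W total)")

if __name__ == "__main__":
    set_power_caps()
```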

The V100s cost 99 USD each including their heatsinks. This setup has 128 GB of VRAM, and I'm planning on not putting any of the model's weights in system RAM, so performance won't be abysmal.
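As a rough sanity check of what fits entirely in that 128 GB without spilling into system RAM, here's a back-of-the-envelope sketch. The parameter counts and bytes-per-weight values are illustrative assumptions, not models named in the post.

```python
# Back-of-the-envelope check: do the weights fit entirely in VRAM,
# so nothing has to spill into system RAM? Figures are illustrative.
TOTAL_VRAM_GB = 8 * 16        # 8x V100 16 GB = 128 GB
HEADROOM = 0.15               # assumed reserve for KV cache / activations

def fits_in_vram(params_billion: float, bytes_per_weight: float) -> bool:
    weights_gb = params_billion * bytes_per_weight   # 1e9 params * bytes = GB
    budget_gb = TOTAL_VRAM_GB * (1 - HEADROOM)
    verdict = "fits" if weights_gb <= budget_gb else "does not fit"
    print(f"{params_billion:5.0f}B @ {bytes_per_weight} B/weight -> "
          f"{weights_gb:6.1f} GB weights vs {budget_gb:.0f} GB budget: {verdict}")
    return weights_gb <= budget_gb

# Hypothetical examples, not specific models from the post:
fits_in_vram(70, 2.0)   # 70B at FP16   -> 140.0 GB, does not fit
fits_in_vram(70, 1.0)   # 70B at INT8   ->  70.0 GB, fits
fits_in_vram(70, 0.55)  # 70B at ~4-bit ->  38.5 GB, fits
```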

It comes out cheaper than an RTX 5090 while having better performance (on paper).
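Tallying the prices quoted above, the build lands just under 2,000 USD. The RTX 5090 figure below is an assumed street price for comparison, not a number from the post.

```python
# Tally of the proposed build, using the prices quoted in the post.
gpus        = 8 * 99   # eight V100 16 GB cards with heatsinks
waterblocks = 8 * 35   # AliExpress water blocks
chassis     = 900      # SYS-4028GR-TRT barebones
build_total = gpus + waterblocks + chassis   # 1972 USD

rtx_5090 = 2000        # assumption: rough RTX 5090 street price, not from the post

print(f"V100 build: ~{build_total} USD for 128 GB of VRAM")
print(f"RTX 5090:   ~{rtx_5090} USD for 32 GB of VRAM")
```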

Has anyone tried this setup and can tell me whether it's a waste of money and time? It's cheaper than a Ryzen AI Max+ 395 (Strix Halo) box with 128 GB of LPDDR5X, or whatever it's called.

submitted by /u/lethalratpoison