Feedback on my 256gb VRAM local setup and cluster plans. Lawyer keeping it local.

Reddit r/LocalLLaMA / 3/21/2026

Key Points

The author is building Node 1 of a local AI cluster: a Gigabyte Threadripper motherboard, 256 GB of DDR4 RAM, and eight 32 GB Nvidia V100 GPUs (256 GB of VRAM), powered by two PSUs on two separate circuits totaling 2,800 watts. They may add two or four more V100s before moving on to Node 2.
They are running Windows because the machine doubles as their office workstation, and they plan to have a 240 V circuit installed. The SXM V100s sit on NVLink pass-through boards that connect back to the motherboard at x16 through PEX switches and SlimSAS cables.
The goal is to build local RAG databases over the last decade of saved work, automate routine tasks, and run large reasoning models, first testing them with RAG and then doing QLoRA training for legal applications.
They seek practical feedback on managing power cables and on enclosure design (glass and metal, multi-level airflow), and want to hear from anyone who has run large models locally (GLM, DeepSeek, MiniMax 2.5) or done QLoRA training for legal tasks.
Next steps include tidying heatsinks and cables, moving on to Node 2, and expanding hardware while considering heat dissipation and airflow.

I’m a lawyer who got Claude Code-pilled about 90 days ago, then thought about what I wanted to do with AI tools, and concluded that the safest way for me to experiment was to build my own local cluster. I did an earlier post about what I was working on, and the feedback was helpful.

Wondering if anyone has feedback or suggestions for me in terms of what I should do next.

Anyway, node 1 is basically done at this point. Gigabyte Threadripper board, 256 GB of DDR4, and eight 32 GB Nvidia V100s. I have two PSUs on two different regular circuits in my office, 2800 watts total (haven’t asked the landlord for permission to install a 240-volt circuit yet). I am running … Windows … because I still use the computer for my regular old office work. But I guess my next steps for just this node are probably to get a 240-volt outlet installed, and maybe add 2 or 4 more V100s, and then call it a day for node 1.
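For a rough sanity check on those power numbers, here is a back-of-envelope sketch. The per-card wattages (about 300 W per SXM2 V100, about 250 W per PCIe V100), the 120 V / 15 A circuit rating, and the 80% continuous-load derate are assumptions, not figures from the post; the card counts and the 2800 W PSU total are from the post.

```python
# Back-of-envelope power and VRAM budget for Node 1.
# ASSUMPTIONS (not from the post): nominal board power of about 300 W per
# V100 SXM2 and about 250 W per V100 PCIe; 120 V / 15 A office circuits;
# an 80% derate for continuous load.
SXM_WATTS = 300
PCIE_WATTS = 250
PSU_TOTAL_WATTS = 2800  # from the post: two PSUs, 2800 W total


def gpu_power_watts(n_sxm: int, n_pcie: int) -> int:
    """Nominal GPU draw at full load, ignoring CPU, fans, and PSU losses."""
    return n_sxm * SXM_WATTS + n_pcie * PCIE_WATTS


def circuit_continuous_watts(volts: float, amps: float, derate: float = 0.8) -> float:
    """Continuous load a circuit can carry under the assumed derate."""
    return volts * amps * derate


# Card counts from the post: a 4-card and a 2-card SXM board, plus 2 PCIe cards.
total_vram_gb = (6 + 2) * 32
gpu_watts = gpu_power_watts(n_sxm=6, n_pcie=2)
psu_headroom = PSU_TOTAL_WATTS - gpu_watts
two_circuits = 2 * circuit_continuous_watts(120, 15)

print(f"VRAM: {total_vram_gb} GB")
print(f"GPU draw: {gpu_watts} W, PSU headroom for everything else: {psu_headroom} W")
print(f"Two assumed 15 A circuits, continuous: {two_circuits:.0f} W")
```

Under these assumed figures, four more SXM cards would push nominal GPU draw to about 3500 W, past both the PSU total and the two-circuit estimate, which fits with sorting out the 240-volt circuit before adding cards.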

Took one photo of one of the 4-card pass-through boards. Each of these NVLinks 128 GB of SXM V100s, and they get fed back into the motherboard at x16 using two PEX switches and 4 SlimSAS cables.

The only part that’s remotely presentable is the 4-card board I have finished. There’s also a 2-card board on footers and 2 PCIe V100s. I have 2 more 2-card SXM boards and a 4-card SXM board in waiting, plus 3 SXM V100s and heatsinks (slowly buying more).

Goal is to do local RAG databases on the last 10 years of my saved work, to automate everything I can so that all the routine stuff is automatic and the semi-routine stuff is 85% there. Trying to get the best, biggest reasoning models to run, then to test them with RAG, then to QLoRA train.
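To make the RAG step concrete, here is a toy sketch of the retrieve-then-prompt loop using only the Python standard library. It scores documents with bag-of-words cosine similarity, which is a stand-in for a real embedding model and vector store; the file names, document text, and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda name: cosine(q, Counter(tokenize(docs[name]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, docs: dict[str, str], k: int = 2) -> str:
    """Assemble retrieved documents plus the question into one prompt string."""
    names = retrieve(query, docs, k)
    context = "\n\n".join(f"[{name}]\n{docs[name]}" for name in names)
    return f"Use only the context below.\n\n{context}\n\nQuestion: {query}"


# Hypothetical stand-ins for saved work product.
docs = {
    "lease_dispute_memo.txt": "Memo on commercial lease termination and notice requirements",
    "nda_template.txt": "Mutual nondisclosure agreement template with confidentiality terms",
    "deposition_outline.txt": "Outline of deposition questions for the landlord witness",
}

top = retrieve("lease termination notice", docs, k=1)
prompt = build_prompt("lease termination notice", docs, k=1)
```

The prompt string would then go to whichever local model is running. Swapping the word-count scoring for embeddings changes `retrieve` but not the overall shape.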

Wondering if anyone has suggestions on how to manage all the insane power cables this requires. I put this 4-card board in an ATX tower case, and have one more case for the second board, but I have the rest of the stuff (motherboard, 2 PCIe cards, 2-card SXM board) open bench/open air like a mining rig. Would love some kind of good-looking glass and metal 3-level airflow box or something.

Also wondering if anyone has really used big models like GLM or full DeepSeek or MiniMax 2.5 locally for anything like this. And if anyone has done QLoRA training for legal stuff.
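Whether a big model fits in 256 GB of VRAM comes down mostly to parameter count times bits per weight. A rough sketch follows, counting weights only (no KV cache, activations, or quantization overhead). It assumes "full DeepSeek" means a model in the DeepSeek-V3/R1 class at 671B total parameters; the 200B model and the 90% usable-VRAM headroom are hypothetical.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float = 256, headroom: float = 0.9) -> bool:
    """True if the weights fit in the assumed usable fraction of VRAM."""
    return weight_gb(params_billion, bits_per_weight) <= vram_gb * headroom


# ASSUMPTION: 671B total parameters for a DeepSeek-V3/R1-class model.
# The 200B entry is a hypothetical mid-sized model for comparison.
for name, params in [("671B model", 671), ("hypothetical 200B model", 200)]:
    for bits in (16, 8, 4):
        size = weight_gb(params, bits)
        verdict = "fits" if fits(params, bits) else "does not fit"
        print(f"{name} at {bits}-bit: {size:.1f} GB, {verdict} in 256 GB")
```

By this estimate a 671B model needs roughly 335 GB even at 4 bits per weight, so it would need more cards, a second node, or offloading some weights to system RAM.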

In terms of what’s next, I will start on Node 2 after I get some of the stray heatsinks and riser cables out of my office and thermal paste off of my suit. I have a romed2 board and processor, and a variety of loose sticks of DDR4 server RAM that will probably only add up to like 192 GB. I have 3 RTX 3090s. Plan is I guess to add a fourth and NVLink them.

My remaining inventory is a Supermicro X10DRG board and processor, 6 P40s, 6 P100s, four 16 GB V100 SXMs, another even older X10 board and processor, more loose sticks of server RAM, and then a couple more board and processor combos (X299A with 64 GB DDR4, and my 2019 gaming PC).

Original plan (and maybe still the plan) was to just have so much VRAM I could slowly run the biggest model ever over a distributed cluster, and use that to tell me the secret motives and strategy of parties on the other side of cases. And then maybe use it to tell me why I can never be satisfied and always want more. Worried Opus 4.6 will be better at all that.

I wrote this actual post without any AI help, because I still have soul inside.

Will repost it in a week with Claude rewriting it to see how brainwashed you all are.

Anyway, ask me questions, give me advice, explain to me in detail why I’m stupid. But be real about it, you anime freaks.

submitted by /u/TumbleweedNew6515