"Built this beautiful monstrosity to satisfy my mental illness. Running gptoss 120b at 90 t/s, qwen 3.5 35b a3b at 80 t/s. This node is the host for my RPC mesh with the two 64 GB Orin dev kits."
Newest GPU server in the lab! 72 GB Ampere VRAM!
Reddit r/LocalLLaMA / 3/19/2026
📰 News · Developer Stack & Infrastructure · Models & Research
Key Points
- A new GPU server with 72 GB Ampere VRAM was built in the lab to support large AI models.
- It is reportedly running gptoss 120b at 90 t/s and qwen 3.5 35b a3b at 80 t/s.
- The node serves as the host for an RPC mesh with two 64 GB Orin development kits.
- The post was submitted by /u/braydon125 on Reddit's LocalLLaMA and links to a video.
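The post does not name the software behind the RPC mesh, but in the local-LLM community this setup is commonly built with llama.cpp's RPC backend, where remote machines run `rpc-server` and the host offloads layers to them. A minimal sketch under that assumption (hostnames, IP addresses, ports, and the model filename are all illustrative placeholders, not details from the post):

```shell
# Hypothetical llama.cpp RPC mesh sketch -- assumes llama.cpp built
# with -DGGML_RPC=ON; all addresses and filenames are placeholders.

# On each Orin dev kit (worker): expose its GPU over RPC
./rpc-server --host 0.0.0.0 --port 50052

# On the host node: run inference, spreading layers across the mesh
./llama-cli -m model.gguf \
    --rpc 192.168.1.10:50052,192.168.1.11:50052 \
    -ngl 99 -p "Hello"
```

In this arrangement the host coordinates generation while the `--rpc` flag lists the worker endpoints, letting the combined VRAM of the host and the two Orin kits hold a model larger than any single device could.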
Related Articles

Astral to Join OpenAI
Dev.to

I Built a MITM Proxy to See What Claude Code Actually Sends to Anthropic
Dev.to

Your AI coding agent is installing vulnerable packages. I built the fix.
Dev.to

PearlOS. We gave swarm intelligence a local desktop environment and code control to self-evolve. Has been pretty incredible to see so far. Open source and free if you want your own.
Reddit r/LocalLLaMA

The Inference Market Is Consolidating. Agent Payments Are Still Nobody's Problem.
Dev.to