NVIDIA admits to only a 2x performance boost from Rubin at max throughput, which is what 99% of companies are running in production anyway. No more sandbagging by comparing chips with 80GB of VRAM to chips with 288GB. They're forced to compare apples to apples. Despite Rubin having almost 3x the memory bandwidth and apparently 5x the FP4 perf, that results in only 2x the output throughput, at 1000W TDP for the B200 vs 2300W for the R200. So you're using 2.3x the power per GPU to get 2x the performance. Not really efficient, is it?
NVIDIA admits to only 2x performance boost at max throughput with new generation of Rubin GPUs
Reddit r/LocalLLaMA / 3/17/2026
📰 News / Industry & Market Moves
Key Points
- NVIDIA reportedly acknowledges that Rubin GPUs deliver only about 2x the output throughput at max throughput, despite almost 3x the memory bandwidth and a claimed 5x FP4 performance.
- The post argues that this is an apples-to-apples comparison, unlike earlier ones that pitted chips with 80GB of VRAM against chips with 288GB.
- With a 1000W TDP for the B200 versus 2300W for the R200, Rubin would use about 2.3x the power per GPU to achieve roughly 2x the performance, or about 0.87x the throughput per watt, raising efficiency questions.
- The post claims that nearly all production deployments run at max throughput, so the 2x figure is the one that applies in practice and could influence purchasing decisions.
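
The efficiency argument above is simple arithmetic, and a minimal sketch makes it explicit. The function name `perf_per_watt_ratio` is hypothetical, the inputs are the figures quoted in the post (2x throughput, 1000W vs 2300W TDP), and TDP is used here as a stand-in for actual power draw, which is an assumption.

```python
def perf_per_watt_ratio(perf_gain: float, old_tdp_w: float, new_tdp_w: float) -> float:
    """Relative throughput per watt of the new GPU versus the old one.

    A result below 1.0 means the new part does less work per watt.
    Assumes TDP approximates real power draw, which may not hold in practice.
    """
    power_ratio = new_tdp_w / old_tdp_w  # 2300 / 1000 = 2.3 for the post's figures
    return perf_gain / power_ratio


# Figures as quoted in the post, not independently verified.
ratio = perf_per_watt_ratio(perf_gain=2.0, old_tdp_w=1000.0, new_tdp_w=2300.0)
print(f"R200 vs B200 throughput per watt: {ratio:.2f}x")
```

On these numbers the ratio comes out just under 0.87, meaning roughly 13% less throughput per watt on a TDP basis. Measured power under load could shift that result in either direction.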




