AI Navigate

Nvidia B100 is essentially H100 w/ HBM3E + Key Perf metrics of B200/B300

Reddit r/LocalLLaMA / 3/17/2026

💬 Opinion · Ideas & Deep Analysis · Industry & Market Moves

Key Points

  • The author estimates B200 FP16 Tensor Core performance as 1191.2 TFLOPS based on 18,944 cores and a 1965 MHz boost clock.
  • Using Nvidia technical briefs, the article infers that B100 is essentially an H100 with HBM3e VRAM and FP4 support, while B200 is a larger Hopper H100 with HBM3e and FP4 support.
  • The B300 retains most of B200’s performance but trims FP64 and TC INT8 while boosting TC FP4 by 50%, resulting in 14.29 PFLOPS for TC FP4 versus 9.53 PFLOPS on B200, trading scientific/finance workloads for AI-optimized FP4.
  • The piece concludes that Blackwell appears to be a bigger Hopper/Ada family with TC FP4 support, supported by Nvidia docs and the cited Reddit analyses.

Since Nvidia is very vague about the actual spec of the Blackwell pro cards, after some detective work, I am able to deduce the actual theoretical tensor core (TC) performance for the Nvidia B100/B200/B300 chips. I suppose it would be useful for the billionaires here. ;)

From the numbers in this reddit page from a person who has access to B200:

https://www.reddit.com/r/nvidia/comments/1khwaw5/battle_of_the_giants_nvidia_blackwell_b200_takes/

We can tell that the B200 has 18,944 cores and a boost clock of 1965 MHz. This gives a dense FP16 Tensor Core performance of 1191.2 TFLOPS.
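The quoted figure is consistent with the usual cores × clock × ops-per-clock formula. The factor of 32 dense FP16 ops per core per clock below is an assumption chosen so the numbers line up, not something stated in the post:

```python
# Back-of-the-envelope check of the quoted dense FP16 Tensor Core figure.
# The 32 ops/core/clock factor is an assumption inferred to match the result.
cores = 18944
boost_clock_hz = 1965e6           # 1965 MHz boost clock
fp16_ops_per_core_per_clock = 32  # assumed dense FP16 throughput factor

tflops = cores * boost_clock_hz * fp16_ops_per_core_per_clock / 1e12
print(f"{tflops:.1f} TFLOPS")  # ~1191.2 TFLOPS, matching the post
```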

From these three official Nvidia docs and the numbers I just got:

https://cdn.prod.website-files.com/61dda201f29b7efc52c5fbaf/6602ea9d0ce8cb73fb6de87f_nvidia-blackwell-architecture-technical-brief.pdf
https://resources.nvidia.com/en-us-blackwell-architecture
https://resources.nvidia.com/en-us-blackwell-architecture/blackwell-ultra-datasheet

We can deduce that essentially, B100 is an H100 with HBM3e VRAM and FP4 support.

B200 is a bigger Hopper H100 with HBM3e and FP4 support.

B300 has exactly the same performance as B200 except for FP64, TC FP4 and TC INT8. B300 is sort of a mix of the B200 and the GB202 used in the 5090. It cuts FP64 and TC INT8 performance to 5090 levels to make room for TC FP4, which receives a 50% boost. This translates to dense TC FP4 of 14.29 PFLOPS vs 9.53 PFLOPS on the B200.
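The B300 FP4 number follows directly from a 50% uplift over the B200 figure quoted above; a quick sanity check (the small difference from 14.29 is just rounding in the post):

```python
# Verify the claimed B300 TC FP4 figure as a 50% uplift over B200.
b200_tc_fp4_pflops = 9.53  # dense TC FP4, from the post
b300_tc_fp4_pflops = b200_tc_fp4_pflops * 1.5  # +50% boost -> ~14.295 PFLOPS

print(f"B300 TC FP4 dense: {b300_tc_fp4_pflops:.2f} PFLOPS")
```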

B300 is a B200 with a 50% boost in FP4, which makes it more suitable for AI workloads, but the cut in FP64 makes it less suitable for scientific/finance workloads.

This fits my understanding that Blackwell is just a bigger Hopper/Ada with TC FP4 support.

submitted by /u/Ok_Warning2146