NVIDIA B100
Last reviewed
May 31, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v2 · 2,589 words
The NVIDIA B100 is a data center graphics processing unit (GPU) based on the Blackwell architecture, announced by NVIDIA CEO Jensen Huang at the GTC 2024 keynote on March 18, 2024.[1] Positioned as the lower-power Blackwell variant that could drop into existing H100 HGX server platforms, the B100 was specified at 700 W thermal design power (TDP) with 192 GB of HBM3e memory and approximately 14 petaFLOPS of sparse FP4 tensor performance.[2][3] Despite its initial billing as the "first to ship" Blackwell SKU,[4] the B100 was substantially deprioritized over the course of 2024 amid manufacturing issues with TSMC's CoWoS-L packaging. KeyBanc Capital Markets analyst John Vinh reported in late August 2024 that NVIDIA had "effectively canceled" the B100 in favor of the B200 (for hyperscalers) and a then-forthcoming B200A variant (for enterprise customers).[5][6] By the time Blackwell-based servers began shipping to large customers in volume in December 2024, NVIDIA's allocations were concentrated almost entirely on the GB200 NVL72 rack-scale system rather than HGX B100 boards.[7][8]
NVIDIA's Blackwell architecture, named after American statistician David Blackwell, is the successor to the Hopper architecture used in the H100 and H200 data center accelerators.[9] Blackwell was first confirmed during an NVIDIA investor presentation in October 2023, and its data center accelerators, publicly named B100 and B200, were formally introduced at GTC 2024 on March 18, 2024.[9][1] The architecture marked the first time NVIDIA used a multi-die (chiplet) design in a flagship data center GPU: each Blackwell package consists of two reticle-limited dies fabricated on TSMC's custom 4NP process and connected by a 10 TB/s NV-High Bandwidth Interface (NV-HBI), yielding a total transistor count of 208 billion (104 billion per die).[10][9]
Three primary data center products were unveiled at GTC 2024: the air-cooled B100, the higher-power B200, and the GB200 Grace Blackwell Superchip, which combines two B200 GPUs with one Grace CPU on a single board.[1][11] All three share the same Blackwell silicon and use TSMC's CoWoS-L 2.5D packaging, which embeds passive silicon bridges within an organic redistribution layer (RDL) interposer.[9][12]
The B100 was specified as a Blackwell GPU configured for the same 700 W per-GPU power envelope as the H100 and H200 it was intended to replace, allowing HGX B100 baseboards to be drop-in compatible with existing HGX H100 server infrastructure.[3][2] Per most published spec sheets:[2][13][14][15]
| Attribute | NVIDIA B100 (SXM) |
|---|---|
| Architecture | Blackwell (dual GB100 dies via NV-HBI) |
| Process node | TSMC 4NP (custom variant of N4P) |
| Transistors | 208 billion (104 B per die) |
| Memory | 192 GB HBM3e (8 stacks x 24 GB) |
| Memory bandwidth | 8 TB/s |
| Memory bus | 8192-bit (8 x 1024-bit HBM3e interfaces) |
| NVLink (5th gen) | 1.8 TB/s bidirectional per GPU |
| Chip-to-chip (NV-HBI) | 10 TB/s between dies |
| Tensor FP4 (dense / sparse) | 7 / 14 PFLOPS |
| Tensor FP8/FP6 (dense / sparse) | 3.5 / 7 PFLOPS |
| Tensor INT8 (dense / sparse) | 3.5 / 7 POPS |
| Tensor FP16/BF16 (dense / sparse) | 1.75–1.8 / 3.5 PFLOPS |
| FP64 (dense) | ~30 TFLOPS |
| TDP | 700 W (air-cooled) |
| Socket | SXM6 |
| PCIe interface | PCIe Gen 6 |
| Form factor | 8-GPU HGX B100 baseboard |
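The tensor rows in the table follow a regular pattern: each sparse figure is twice the dense one, and dense throughput doubles with each step down in precision from FP16/BF16 to FP8/FP6 to FP4. The short Python sketch below only restates that arithmetic. The names `DENSE_PFLOPS`, `sparse_pflops`, and `speedup_over_fp16` are illustrative, and the FP16/BF16 entry assumes the lower 1.75 PFLOPS end of the quoted range.

```python
# Dense tensor throughput per B100 GPU in PFLOPS, copied from the table above.
# Assumption: FP16/BF16 uses the lower 1.75 PFLOPS end of the 1.75-1.8 range.
DENSE_PFLOPS = {"FP16/BF16": 1.75, "FP8/FP6": 3.5, "FP4": 7.0}

# Assumption read off the table: sparse figures are 2x the dense figures.
SPARSITY_FACTOR = 2


def sparse_pflops(dense: float) -> float:
    """Return the sparse throughput implied by a dense figure."""
    return dense * SPARSITY_FACTOR


def speedup_over_fp16(fmt: str) -> float:
    """Return the dense throughput of `fmt` relative to dense FP16/BF16."""
    return DENSE_PFLOPS[fmt] / DENSE_PFLOPS["FP16/BF16"]


for fmt, dense in DENSE_PFLOPS.items():
    print(
        f"{fmt}: dense {dense} PFLOPS, sparse {sparse_pflops(dense)} PFLOPS, "
        f"{speedup_over_fp16(fmt):.0f}x dense FP16/BF16"
    )
```

These are peak spec-sheet figures, so the ratios describe the published numbers rather than measured application performance.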
The B100's distinguishing trait within the Blackwell lineup was its thermal envelope: at 700 W, it drew roughly the same per-GPU power as the H100 while offering 2.4x the HBM capacity (192 GB vs. 80 GB), 2.5x the memory bandwidth (8 TB/s vs. 3.2 TB/s), and substantially higher AI throughput.[2][14] Air-cooled HGX B100 systems were quoted as delivering 14 PFLOPS of sparse FP4 per GPU at the same 700 W envelope as Hopper.[11]
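The 2.4x and 2.5x multipliers follow directly from the quoted capacities and bandwidths. A minimal Python sketch of that arithmetic is given below; the dictionaries and the `ratio` helper are hypothetical names, and the H100 values are the ones stated in this article, not independently verified figures.

```python
# Per-GPU figures as quoted in the paragraph above.
# Assumption: the H100 values are taken from this article as-is.
B100 = {"memory_gb": 192, "bandwidth_tb_s": 8.0, "tdp_w": 700}
H100 = {"memory_gb": 80, "bandwidth_tb_s": 3.2, "tdp_w": 700}


def ratio(key: str) -> float:
    """Return the B100 value divided by the H100 value for one attribute."""
    return B100[key] / H100[key]


for key in ("memory_gb", "bandwidth_tb_s", "tdp_w"):
    print(f"{key}: B100 is {ratio(key):.2f}x the H100")
```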
By comparison, the B200 used the same silicon but raised TDP to 1,000 W, increasing dense FP4 throughput to 9 PFLOPS (18 PFLOPS sparse) and dense FP16/BF16 to 2.25 PFLOPS.[16][17] The liquid-cooled GB200, which pairs two B200 GPUs with a Grace CPU, ran each Blackwell GPU at up to ~1,200 W and delivered up to 20 PFLOPS of sparse FP4 per GPU.[11]
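One way to compare the three configurations is nominal FP4 throughput per watt of TDP. The Python sketch below uses the dense figures quoted in this article (7, 9, and 10 PFLOPS, the last being half of the GB200's 20 PFLOPS figure on the assumption that it is a sparse number). The names `GPUS` and `tflops_per_watt` are illustrative, and TDP is a design ceiling rather than measured power draw, so the result is only a rough indicator.

```python
# (dense FP4 PFLOPS, TDP in watts) per GPU, as quoted in this article.
# Assumption: GB200's dense figure is half of its quoted 20 PFLOPS figure,
# and its TDP is taken at the "up to ~1,200 W" ceiling.
GPUS = {
    "B100": (7.0, 700),
    "B200": (9.0, 1000),
    "GB200 (per GPU)": (10.0, 1200),
}


def tflops_per_watt(pflops: float, watts: float) -> float:
    """Convert PFLOPS to TFLOPS and divide by the power envelope."""
    return pflops * 1000 / watts


efficiency = {name: tflops_per_watt(*spec) for name, spec in GPUS.items()}

# Rank from most to least nominal throughput per watt.
ranked = sorted(efficiency, key=efficiency.get, reverse=True)

for name in ranked:
    print(f"{name}: {efficiency[name]:.2f} dense FP4 TFLOPS per watt")
```

On these nominal figures the 700 W part comes out slightly ahead per watt, while the higher-power parts trade efficiency for absolute throughput per GPU.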
Unlike the monolithic H100, both the B100 and B200 are physically composed of two GB100 dies sitting side-by-side on a CoWoS-L interposer.[10][9] The two dies are joined by NV-HBI, an internal 10 TB/s interconnect that NVIDIA describes as based on the NVLink 7 protocol and that allows the software stack to see the package as a single logical GPU.[9] This was the first time NVIDIA shipped a multi-die flagship data center GPU; AMD had previously taken a similar approach with the MI250X and MI300 accelerators.
Each die in a Blackwell package connects to four HBM3e stacks, each providing 24 GB of capacity and 1 TB/s of bandwidth over a 1024-bit interface, which yields package totals of 192 GB of HBM3e and 8 TB/s of bandwidth.[16] Memory capacity per GPU was a major leap over Hopper, where the H100 carried 80 GB of HBM3 and the H200 carried 141 GB of HBM3e.[14]
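The package totals can be rebuilt from the per-stack figures. The sketch below is a minimal Python illustration; the constant names and the `package_totals` helper are hypothetical, and the inputs are the per-stack values quoted above.

```python
# Per-stack and per-die figures as quoted in the paragraph above.
DIES_PER_PACKAGE = 2
STACKS_PER_DIE = 4
GB_PER_STACK = 24
TB_S_PER_STACK = 1
BUS_BITS_PER_STACK = 1024


def package_totals() -> dict:
    """Sum the per-stack figures over every HBM3e stack in the package."""
    stacks = DIES_PER_PACKAGE * STACKS_PER_DIE
    return {
        "stacks": stacks,
        "memory_gb": stacks * GB_PER_STACK,
        "bandwidth_tb_s": stacks * TB_S_PER_STACK,
        "bus_width_bits": stacks * BUS_BITS_PER_STACK,
    }


totals = package_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```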
The B100 uses NVIDIA's 5th-generation NVLink, providing 1.8 TB/s of bidirectional bandwidth per GPU, double Hopper's NVLink 4 bandwidth, for high-speed GPU-to-GPU communication within an HGX system.[16][14] An 8-GPU HGX B100 baseboard could therefore aggregate 14.4 TB/s of NVLink bandwidth, matched by an NVSwitch fabric on the baseboard.
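The 14.4 TB/s aggregate is the per-GPU figure multiplied across the baseboard. A small Python sketch of that calculation follows; the names are illustrative, and the Hopper value is simply derived from the "double" relationship stated above.

```python
# Figures quoted in the paragraph above.
NVLINK5_TB_S_PER_GPU = 1.8  # bidirectional, per GPU
GPUS_PER_BASEBOARD = 8

# Assumption: Hopper's NVLink 4 figure is derived as half of NVLink 5,
# following the "double Hopper's" statement in the text.
NVLINK4_TB_S_PER_GPU = NVLINK5_TB_S_PER_GPU / 2


def aggregate_nvlink_tb_s(per_gpu: float, gpus: int) -> float:
    """Return the summed bidirectional NVLink bandwidth of a baseboard."""
    return per_gpu * gpus


hgx_b100_total = aggregate_nvlink_tb_s(NVLINK5_TB_S_PER_GPU, GPUS_PER_BASEBOARD)
print(f"HGX B100 aggregate NVLink bandwidth: {hgx_b100_total:.1f} TB/s")
print(f"Implied NVLink 4 per-GPU bandwidth: {NVLINK4_TB_S_PER_GPU:.1f} TB/s")
```

The aggregate is a sum of per-GPU link bandwidth, not a measure of what any single GPU pair can exchange.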
Two server-level form factors were specified for the B100:
Cloud providers including Amazon Web Services, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, and Lambda Labs all announced plans on March 18, 2024 to host Blackwell-based instances, with AWS specifically naming "EC2 instances featuring the new B100 GPUs deployed in EC2 UltraClusters" in its joint announcement with NVIDIA.[19] However, several of those announcements, Lambda Labs in particular, concretely listed only B200 and GB200 instances in their actual on-demand and reserved offerings, with B100 mentioned only descriptively rather than as a planned SKU.[20]
NVIDIA's initial March 2024 messaging cast the air-cooled 700 W B100 as the first Blackwell variant to ship.[4] SemiAnalysis, writing in early April 2024, described the B100 as targeted at ramping first, with the B200 "very soon after" and the liquid-cooled GB200 NVL72 rack-scale system later in the year.[4] TrendForce on August 7, 2024 still expected B100 and B200 shipments to "commence after 3Q24" for cloud-service-provider customers, with broader OEM availability in 2025.[21]
In early August 2024, multiple outlets reported a Blackwell design flaw that required NVIDIA to perform a mask respin on the package, pushing volume shipments by at least a quarter.[22][23] SemiAnalysis attributed the root cause to NVIDIA's first high-volume use of TSMC's CoWoS-L packaging: embedding multiple silicon bridges in the RDL interposer caused a coefficient-of-thermal-expansion (CTE) mismatch between the GPU dies, LSI bridges, RDL interposer, and motherboard substrate, leading to substrate warpage and broken connections.[7][24] The top global routing metal layers and the Blackwell die's bump-out had to be redesigned.[7][24]
In October 2024, CEO Jensen Huang publicly accepted responsibility for the flaw: "We had a design flaw in Blackwell. It was functional, but the design flaw caused the yield to be low. It was 100% NVIDIA's fault."[25] He credited TSMC with helping NVIDIA recover yields and resume manufacturing.[25]
On August 22, 2024, KeyBanc Capital Markets analyst John Vinh issued a client note stating that "given the Blackwell delay, we believe NVIDIA will prioritize the ramp of B200 for hyperscalers and has effectively canceled B100, which will be replaced with a lower cost/performance GPU (B200A) targeted at enterprise customers."[5][6] Vinh's note was widely reported by financial and trade press over the following week.[6]
Concurrent supply-chain reporting from SemiAnalysis was even more explicit at the product-line level: "HGX form-factors with the B100 and B200 are effectively now being cancelled outside of some initial lower volumes," with NVIDIA instead focusing what limited CoWoS-L capacity it had "almost entirely on GB200 NVL 36x2 and NVL72 rack-scale systems."[7] The B200A, based on a single-die "B102" SKU using CoWoS-S packaging with 144 GB of HBM3e and the same 700 W envelope as the B100, was positioned as the de-facto replacement for both HGX B100 and HGX B200 in 8-GPU server form factors, with availability targeted at the second half of 2025.[7][6]
By late 2024, NVIDIA's recovered Blackwell capacity flowed predominantly into GB200 NVL36 and NVL72 rack-scale systems. Tom's Hardware reported that the first Blackwell server deliveries began in the first week of December 2024, with Microsoft receiving one of the largest initial allocations, followed by Oracle, AWS, and Meta.[8] Foxconn and Quanta were the primary system integrators for GB200 racks.[8] Industry estimates compiled in late 2024 projected 150,000–200,000 Blackwell GB200 servers shipped in Q4 2024 alone and 500,000–550,000 units in Q1 2025, with essentially none of that volume being HGX B100 SKUs.[26]
By the time Blackwell Ultra (B300) and the GB300 NVL72 became NVIDIA's leading data-center platform in 2025, Supermicro confirmed it was shipping HGX B300 systems and GB300 NVL72 racks in volume, while the HGX B100 and HGX B200 8-GPU products had been quietly de-emphasized.[27]
As of early 2026, public cloud-GPU rental aggregators listed only a single provider claiming on-demand access to B100 instances, and no major public on-demand pricing for B100 was advertised by the largest cloud service providers.[28] In practice, the B100 became a niche enterprise SKU that some specialist providers continued to list but that never anchored a major hyperscaler deployment.
| GPU | TDP | Memory | FP4 dense | FP4 sparse | Form factor | Volume trajectory |
|---|---|---|---|---|---|---|
| B100 | 700 W | 192 GB HBM3e | 7 PFLOPS | 14 PFLOPS | HGX B100 (SXM6) | Effectively de-prioritized after Aug 2024[5][7] |
| B200 | 1,000 W | 192 GB HBM3e | 9 PFLOPS | 18 PFLOPS | HGX B200 (SXM6) | Limited HGX volume; volume concentrated in GB200[7][8] |
| GB200 (per GPU) | up to 1,200 W | 192 GB HBM3e | 10 PFLOPS | 20 PFLOPS | GB200 NVL36 / NVL72 rack | Primary Blackwell shipping product Q4 2024–2025[8][11] |
| B200A (planned) | 700 W / 1,000 W variants | 144 GB HBM3e | N/A | N/A | HGX 8-GPU using CoWoS-S | Replacement for HGX B100/B200 in 2H 2025[7][6] |
Sources for the B100 / B200 / GB200 comparison are NVIDIA's GTC 2024 materials and follow-on independent reporting; the GB300 (Blackwell Ultra) succeeded them at the top of the lineup in 2025.[27][29]
Initial reception of the Blackwell announcement on March 18, 2024 was overwhelmingly positive for the architecture as a whole: press coverage emphasized 5x H100 AI performance per GPU, 192 GB of HBM3e, and a new FP4 data type for trillion-parameter inference.[1][11] The Register noted that with Blackwell, "Nvidia is turning up the AI heat," characterizing the lineup, including the 700 W B100, as a clear generational leap.[11]
Within roughly five months, however, the narrative around the specific B100 SKU had shifted from "the first Blackwell to ship" to "effectively canceled." Trade press, financial analysts, and supply-chain researchers converged on the assessment that the combination of CoWoS-L yield issues, finite TSMC packaging capacity, and overwhelming hyperscaler demand for rack-scale GB200 had collapsed the original three-tier product strategy (B100 / B200 HGX / GB200 rack) down to a one-tier rollout focused on GB200 rack-scale systems.[7][5][6] The B200A and the subsequent Blackwell Ultra refresh (B300/GB300) absorbed the demand the B100 had originally been designed to serve.[27][7]
Independent benchmarking sites and cloud-GPU comparison guides published throughout 2025 and into 2026 routinely observed that "the industry as a whole skipped over B100s to go straight to their more powerful siblings" and that "B100s are difficult to find" in commercial deployments.[14]