NVIDIA DGX SuperPOD
Last reviewed
Jun 3, 2026
Sources
14 citations
Review status
Source-backed
Revision
v1 · 2,293 words
The NVIDIA DGX SuperPOD is a reference-architecture artificial intelligence supercomputer designed and sold by NVIDIA. It is a turnkey, factory-validated cluster built from the company's DGX systems, connected by high-speed InfiniBand networking, paired with shared high-performance storage, and managed by NVIDIA's data-center software stack. Rather than asking customers to design a large GPU cluster from scratch, NVIDIA delivers the DGX SuperPOD as a pre-engineered, repeatable blueprint that can be deployed in weeks instead of months. NVIDIA markets it as a complete "AI factory" or "turnkey AI data center solution," combining compute, networking, storage, and software that are tuned to work together at scale.
A defining characteristic of the DGX SuperPOD is that it is built around a modular building block called the scalable unit (SU). Each SU is a fixed group of DGX systems plus the networking and management infrastructure needed to operate them, so that customers can size a deployment predictably by adding units. This design lets organizations grow from a few racks to clusters with tens of thousands of GPUs while keeping the same validated topology, cabling scheme, and software. NVIDIA also uses the DGX SuperPOD architecture to build its own internal supercomputers, most notably Selene (based on DGX A100 systems) and Eos (based on DGX H100 systems), both of which have appeared on the TOP500 list of the world's fastest supercomputers.
A DGX SuperPOD is not a single product with one fixed specification. It is a reference architecture: a documented, supported design that specifies which DGX systems to use, how to wire them together, what storage and management nodes to include, and which software to run. Each generation is published as a formal reference-architecture document by NVIDIA. The intent is to remove the engineering risk and integration effort of standing up a large training cluster, so an enterprise or research lab receives a system that has already been validated to deliver predictable performance at scale.
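The core ingredients of every DGX SuperPOD generation are consistent: DGX compute systems, a high-speed InfiniBand fabric, shared high-performance storage, and NVIDIA's data-center management software.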
Because the whole stack is specified and tested together, NVIDIA positions the DGX SuperPOD as a supported alternative to building a bespoke cluster, with the company providing white-glove installation, support, and lifecycle services.
The scalable unit is the organizing principle of the DGX SuperPOD. An SU bundles a fixed number of DGX nodes together with the leaf-layer InfiniBand switches that connect them, so the cluster can be assembled and grown in well-defined increments. Multiple SUs are then joined through additional spine and, for the largest systems, core switching layers, forming a multi-tier fat-tree fabric.
The size of a scalable unit has changed with each DGX generation:
| Generation | DGX system | GPUs per node | Nodes per scalable unit | GPUs per scalable unit |
|---|---|---|---|---|
| DGX A100 SuperPOD | DGX A100 | 8 x A100 | 20 | 160 |
| DGX H100 / H200 SuperPOD | DGX H100 / H200 | 8 x H100 or H200 | 32 | 256 |
| DGX B200 SuperPOD | DGX B200 | 8 x B200 | 32 | 256 |
| DGX GB200 SuperPOD | DGX GB200 (NVL72 rack) | 72 x Blackwell per rack | 8 systems | 576 |
| DGX B300 SuperPOD | DGX B300 | 8 x Blackwell Ultra | 64 | 512 |
For example, in the original DGX A100 SuperPOD, four people could rack a single 20-node scalable unit in about an hour, producing a roughly 2-petaflops building block; a standard SuperPOD of seven such units totaled 140 DGX A100 systems and 1,120 A100 GPUs. The Hopper-generation DGX H100 SuperPOD moved to 32 nodes per SU, giving 256 H100 GPUs per unit, and added an external NVLink Switch System that extends NVLink connectivity to GPUs in different nodes. In the Grace Blackwell DGX GB200 SuperPOD, the unit is defined as eight liquid-cooled DGX GB200 rack systems, connecting 576 Blackwell GPUs in a single NVLink domain.
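The sizing arithmetic behind the scalable unit can be sketched in a few lines of Python. This is only an illustration built from the figures in the table above; the `ScalableUnit` class, the `SCALABLE_UNITS` dictionary, and its keys are hypothetical names chosen for this sketch, not part of any NVIDIA tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalableUnit:
    """One SuperPOD scalable unit (SU), using figures from the table above."""

    generation: str
    systems_per_su: int   # DGX nodes (or rack systems for GB200) in one SU
    gpus_per_system: int  # GPUs in each node or rack system

    @property
    def gpus_per_su(self) -> int:
        return self.systems_per_su * self.gpus_per_system

    def total_systems(self, num_sus: int) -> int:
        """DGX systems in a deployment of `num_sus` scalable units."""
        if num_sus < 1:
            raise ValueError("a deployment needs at least one scalable unit")
        return num_sus * self.systems_per_su

    def total_gpus(self, num_sus: int) -> int:
        """GPUs in a deployment of `num_sus` scalable units."""
        return self.total_systems(num_sus) * self.gpus_per_system


# Hypothetical lookup keyed by short generation names (illustrative only).
SCALABLE_UNITS = {
    "A100": ScalableUnit("DGX A100 SuperPOD", 20, 8),
    "H100": ScalableUnit("DGX H100 / H200 SuperPOD", 32, 8),
    "B200": ScalableUnit("DGX B200 SuperPOD", 32, 8),
    "GB200": ScalableUnit("DGX GB200 SuperPOD", 8, 72),
    "B300": ScalableUnit("DGX B300 SuperPOD", 64, 8),
}

if __name__ == "__main__":
    a100 = SCALABLE_UNITS["A100"]
    # Standard seven-SU DGX A100 SuperPOD described in the text.
    print(a100.total_systems(7), a100.total_gpus(7))
```

With these inputs, a seven-unit DGX A100 deployment comes to 140 systems and 1,120 GPUs, matching the standard configuration described above.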
The DGX SuperPOD has tracked NVIDIA's GPU roadmap closely, with each new architecture producing a corresponding SuperPOD reference design.
NVIDIA first introduced the DGX SuperPOD in June 2019. The original system was built from 96 DGX-2H servers containing 1,536 Volta-based V100 GPUs, linked with Mellanox EDR (100 Gb/s) InfiniBand. It delivered roughly 9.4 petaflops on the High Performance Linpack (HPL) benchmark against an 11.2-petaflops theoretical peak, consumed about one megawatt of power, and debuted at number 22 on the June 2019 TOP500 list. NVIDIA reported that it was assembled in only a few weeks and built primarily to support the company's autonomous-vehicle development program, while also running graphics, speech, healthcare, and HPC workloads.
Announced in 2020 alongside the A100 GPU and the Ampere architecture, the DGX A100 SuperPOD formalized the scalable-unit model that the product is known for today. Each DGX A100 node carries eight A100 GPUs and two AMD EPYC 7742 64-core CPUs. A scalable unit was 20 nodes, and a standard SuperPOD comprised seven SUs for 140 nodes and 1,120 GPUs. The compute fabric used Mellanox Quantum HDR 200 Gb/s InfiniBand switches in a rail-optimized fat tree, with separate InfiniBand links for compute and storage, and storage delivered by a high-throughput parallel file system. The A100 reference architecture established the validated cabling, switching, and management patterns that later generations refined.
With the Hopper generation, NVIDIA grew the scalable unit to 32 DGX H100 nodes, yielding 256 H100 GPUs per SU. Each DGX H100 node pairs eight H100 GPUs with two Intel Xeon Platinum 8480C CPUs and ConnectX-7 network adapters running 400 Gb/s NDR InfiniBand on NVIDIA's Quantum-2 switches. The H100 SuperPOD added the NVLink Switch System, extending the in-node NVLink fabric across nodes within a scalable unit. The reference architecture scales from a baseline of four SUs (128 nodes, 1,024 GPUs) up to designs of dozens of units and many thousands of GPUs. A closely related variant uses the memory-enhanced H200 GPU with the same 32-node SU.
The Blackwell-based DGX B200 SuperPOD kept the air-cooled eight-GPU node format and the 32-node scalable unit (256 GPUs per SU), upgrading the GPUs to B200 and refreshing the networking. It serves customers who want a Blackwell-class training and inference cluster in the same physical and operational pattern as the Hopper systems.
At GTC in March 2024, NVIDIA announced the Grace Blackwell DGX SuperPOD built from DGX GB200 systems, a liquid-cooled, rack-scale design for trillion-parameter generative AI. Each DGX GB200 is a rack-scale system in which 36 Grace CPUs and 72 Blackwell GPUs (organized as 36 GB200 Superchips, see GB200 NVL72) are connected as one unit by fifth-generation NVLink. A scalable unit is eight DGX GB200 systems, connecting 576 Blackwell GPUs in a shared NVLink domain, with NVIDIA Quantum InfiniBand joining units into larger systems. NVIDIA stated that a Grace Blackwell DGX SuperPOD provides 11.5 exaflops of AI compute at FP4 precision with 240 terabytes of fast memory in its base configuration, scaling further with additional racks.
At GTC in March 2025, NVIDIA followed with the Blackwell Ultra DGX SuperPOD using DGX GB300 and DGX B300 systems. Each DGX GB300 rack system pairs 36 Grace CPUs with 72 Blackwell Ultra GPUs and includes 38 terabytes of fast memory; the systems connect via NVLink, NVIDIA Quantum-X800 800 Gb/s InfiniBand, and NVIDIA Spectrum-X Ethernet, with 72 ConnectX-8 SuperNICs per system. These Blackwell Ultra deployments are managed by NVIDIA Mission Control software.
NVIDIA validates each DGX SuperPOD generation by building large internal systems of its own. Two of them, Selene and Eos, became prominent TOP500 entries.
Selene was NVIDIA's DGX A100 SuperPOD. NVIDIA assembled the first version extremely quickly in mid-2020, and it debuted at number 7 on the June 2020 TOP500 list at about 27.6 petaflops on Linpack. NVIDIA then expanded Selene and re-ran the benchmark: in its full configuration it comprised 560 DGX A100 nodes, with the TOP500-submitted run using 1,080 AMD EPYC CPUs and 4,320 A100 GPUs to reach 63.46 petaflops (against a 79.22-petaflops peak), which placed it at number 5 on the November 2020 list. Selene was also one of the most energy-efficient large systems of its era, ranking highly on the Green500. NVIDIA used Selene for internal research and for record-setting submissions to the MLPerf training and inference benchmarks, and it served as the in-house proof point for the scalable-unit design that NVIDIA sold to customers.
Eos is NVIDIA's DGX H100 SuperPOD, unveiled in 2022 and brought online in 2023. The system that NVIDIA submitted to the TOP500 in November 2023 was built from 576 DGX H100 systems, for a total of 4,608 H100 GPUs joined by Quantum-2 NDR (400 Gb/s) InfiniBand, and it achieved 121.4 petaflops on Linpack (against a 188.65-petaflops peak), debuting at number 9. NVIDIA described this configuration as delivering up to 18 exaflops of FP8 AI performance. NVIDIA also operated a larger, separately configured Eos-class system with 10,752 H100 GPUs that it used for headline MLPerf training runs. Because two differently sized machines shared the Eos name, there was some public confusion about its exact specifications; the TOP500 entry and the largest MLPerf submission referred to different physical clusters.
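The component counts quoted for Selene and Eos can be cross-checked against the per-node figures given earlier (eight GPUs and two CPUs per DGX A100 or DGX H100 node). The sketch below is an illustration only: `nodes_from_components` is a hypothetical helper, and the derived node counts assume that every participating node contributed all of its GPUs and CPUs and that the larger Eos-class system also used eight-GPU DGX H100 nodes, neither of which the text states outright.

```python
def nodes_from_components(total: int, per_node: int) -> int:
    """Return the node count implied by a component total.

    Raises ValueError if the total is not a whole multiple of the
    per-node count, which would signal inconsistent figures.
    """
    nodes, remainder = divmod(total, per_node)
    if remainder:
        raise ValueError(f"{total} is not a multiple of {per_node}")
    return nodes


# Selene's TOP500 run: 4,320 A100 GPUs and 1,080 EPYC CPUs (from the text).
selene_nodes_by_gpu = nodes_from_components(4_320, 8)
selene_nodes_by_cpu = nodes_from_components(1_080, 2)

# Eos: the TOP500 system and the larger Eos-class MLPerf system.
eos_top500_nodes = nodes_from_components(4_608, 8)
eos_mlperf_nodes = nodes_from_components(10_752, 8)  # assumes 8-GPU nodes

if __name__ == "__main__":
    print(selene_nodes_by_gpu, selene_nodes_by_cpu)
    print(eos_top500_nodes, eos_mlperf_nodes)
```

Under those assumptions, the GPU and CPU counts for Selene's benchmark run both imply 540 nodes, slightly fewer than the 560 nodes of the full configuration, and the 4,608-GPU Eos figure matches the stated 576 DGX H100 systems.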
| System | SuperPOD generation | DGX nodes | GPUs | Linpack (Rmax) | Highest TOP500 rank |
|---|---|---|---|---|---|
| 2019 DGX SuperPOD | DGX-2H (Volta V100) | 96 | 1,536 | 9.44 PFlop/s | No. 22 (Jun 2019) |
| Selene | DGX A100 | up to 560 | 4,320 (in benchmark run) | 63.46 PFlop/s | No. 5 (Nov 2020) |
| Eos | DGX H100 | 576 | 4,608 | 121.4 PFlop/s | No. 9 (Nov 2023) |
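One way to compare the three systems is HPL efficiency, the ratio of measured Linpack performance (Rmax) to theoretical peak (Rpeak). The sketch below uses the Rmax values from the table and the peak figures quoted in the text; the function name `hpl_efficiency` and the `SYSTEMS` dictionary are hypothetical names for this illustration.

```python
def hpl_efficiency(rmax_pflops: float, rpeak_pflops: float) -> float:
    """Fraction of theoretical peak achieved on the Linpack benchmark."""
    if rpeak_pflops <= 0:
        raise ValueError("peak performance must be positive")
    return rmax_pflops / rpeak_pflops


# (Rmax, Rpeak) in petaflops, taken from the table and the text above.
SYSTEMS = {
    "2019 DGX SuperPOD": (9.44, 11.2),
    "Selene": (63.46, 79.22),
    "Eos": (121.4, 188.65),
}

if __name__ == "__main__":
    for name, (rmax, rpeak) in SYSTEMS.items():
        print(f"{name}: {hpl_efficiency(rmax, rpeak):.1%}")
```

On these figures the ratio is roughly 84% for the 2019 system, 80% for Selene, and 64% for Eos. The ratio reflects only the submitted benchmark run, not AI training performance.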
The DGX SuperPOD is aimed at organizations that need a large, supported AI training cluster but do not want to design, integrate, and tune one themselves. Because the architecture is validated end to end and delivered with NVIDIA installation and support services, it offers a faster and lower-risk path to a TOP500-class system than assembling commodity components. Customers have included national research centers, universities, government agencies, automotive and pharmaceutical companies, and cloud and sovereign-AI operators.
Notable deployments built on the DGX SuperPOD architecture include NVIDIA's own Cambridge-1, a DGX A100 SuperPOD in the United Kingdom dedicated to healthcare and life-sciences research, and large national and commercial systems in the Middle East, Europe, and Asia. The reference architecture has also been offered through partners and integrated into NVIDIA's broader portfolio, including managed access to SuperPOD-class infrastructure via NVIDIA's cloud offerings.
The DGX SuperPOD has been influential as one of the first widely available, productized blueprints for building large-scale AI supercomputers. By codifying the scalable-unit model, a tested InfiniBand fat-tree fabric, integrated storage, and a unified management stack, NVIDIA turned the previously bespoke task of building an AI training cluster into a repeatable, supported product. Its internal incarnations, Selene and Eos, demonstrated that AI-optimized clusters of GPUs could rank among the most powerful supercomputers in the world while remaining highly energy efficient, and they provided NVIDIA with the platform for its competitive MLPerf results. As generative AI drove demand for ever-larger training systems, the DGX SuperPOD reference architecture became a template that enterprises, governments, and cloud providers used to stand up "AI factories" at scale, and it continues to evolve in lockstep with NVIDIA's GPU roadmap through the Hopper, Blackwell, and Blackwell Ultra generations.