NVIDIA DGX
Last reviewed
Jun 10, 2026
Sources
19 citations
Review status
Source-backed
Revision
v1 · 1,795 words
NVIDIA DGX is a line of integrated artificial-intelligence supercomputers built by NVIDIA. Each DGX system packages NVIDIA's data-center GPUs, CPUs, high-speed interconnects, storage, and a preconfigured software stack into a single appliance designed to run deep-learning training and inference workloads out of the box. First introduced in 2016, the DGX brand has grown from a single 8-GPU server into a portfolio that spans rack-scale machines, scale-out reference architectures, a cloud service, and deskside personal systems. [1]
The defining idea behind DGX is the "AI supercomputer in a box": rather than asking customers to assemble GPUs, networking, and software themselves, NVIDIA sells a turnkey system that is validated and supported as a unit. DGX systems are built around NVIDIA's highest-end GPUs of each generation and use the company's proprietary NVLink interconnect, and later the NVLink Switch fabric, to let the GPUs share memory and exchange data at bandwidths far higher than standard PCIe. They ship with NVIDIA's enterprise AI software, including the optimized CUDA libraries and container stack, and are sold to enterprises, research labs, and cloud providers. [1][2]
DGX should be distinguished from NVIDIA's related HGX platform. HGX is a GPU baseboard that NVIDIA sells to server makers, who build their own systems around it; DGX is NVIDIA's own fully assembled, NVIDIA-branded product that uses the same underlying GPU boards. Many DGX generations share their baseboard design with the corresponding HGX module. [2]
The original DGX-1 was announced in April 2016 and built around eight Pascal-generation NVIDIA P100 GPUs, delivering about 170 teraflops of half-precision (FP16) performance for roughly 129,000 US dollars. In August 2016, NVIDIA chief executive Jensen Huang hand-delivered the first production unit to OpenAI at its San Francisco office, signing the chassis and presenting it to the OpenAI team. The donation has since become an emblem of the early deep-learning era. In 2017, at NVIDIA's GPU Technology Conference, the DGX-1 was refreshed with eight Volta-generation NVIDIA V100 GPUs, raising performance to roughly 960 FP16 teraflops at a list price near 149,000 US dollars; buyers of the Pascal version were offered a free upgrade to V100 boards. [1][3][4]
NVIDIA followed in March 2018 with the DGX-2, which it billed as "the world's largest GPU." It combined sixteen 32 GB V100 GPUs across two baseboards, joined by the first-generation NVSwitch fabric so that all sixteen GPUs could communicate as a single memory pool. The DGX-2 reached about 2 petaflops of deep-learning performance and listed at 399,000 US dollars. [1][5]
The third generation, the DGX A100, launched on 14 May 2020 with eight Ampere-architecture NVIDIA A100 GPUs, 320 GB of total GPU memory, and around 5 petaflops of AI performance, at a starting price of about 199,000 US dollars. The fourth generation, the DGX H100, was announced in March 2022 and used eight Hopper-architecture NVIDIA H100 GPUs with 640 GB of HBM3 memory, delivering 32 petaflops at the new FP8 precision, roughly six times the prior generation. It also introduced ConnectX-7 400 Gb/s InfiniBand networking and a pair of BlueField-3 data-processing units. [1][6][7]
With the Blackwell architecture, NVIDIA shifted its top-end DGX toward rack-scale designs. The air-cooled DGX B200, shown in 2024, pairs eight NVIDIA B200 GPUs with an x86 host to provide up to 72 petaflops of training and 144 petaflops of inference performance. The flagship became the GB200 NVL72, a liquid-cooled rack that links 36 Grace CPUs and 72 Blackwell GPUs through a single NVLink Switch domain. It exposes about 13.5 TB of unified GPU memory and up to roughly 1.4 FP4 exaflops, with the NVLink Switch System providing 130 TB/s of GPU-to-GPU bandwidth inside the rack. NVIDIA markets the DGX GB200 as a building block in which each unit is a full NVL72 rack, and customers connect many such racks to scale out. In 2025 the line advanced to the DGX GB300, based on the Grace Blackwell Ultra (GB300) superchip. [2][8][9]
| System | Year | GPUs | AI performance | Notable details |
|---|---|---|---|---|
| DGX-1 (Pascal) | 2016 | 8x P100 | ~170 TF (FP16) | First unit donated to OpenAI; ~129,000 USD |
| DGX-1 (Volta) | 2017 | 8x V100 | ~960 TF (FP16) | V100 refresh; ~149,000 USD |
| DGX-2 | 2018 | 16x V100 | ~2 PF | First NVSwitch; 399,000 USD |
| DGX A100 | 2020 | 8x A100 | ~5 PF | 320 GB GPU memory; ~199,000 USD |
| DGX H100 | 2022 | 8x H100 | 32 PF (FP8) | 640 GB HBM3; ConnectX-7, BlueField-3 |
| DGX B200 | 2024 | 8x B200 | 72 PF train / 144 PF infer | Air-cooled Blackwell node |
| DGX GB200 NVL72 | 2024 to 2025 | 72x B200 + 36 Grace | ~1.4 EF (FP4) | Liquid-cooled rack; 13.5 TB unified memory |
| DGX GB300 | 2025 | Blackwell Ultra | (rack-scale) | Grace Blackwell Ultra superchip |
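The rack-level figures quoted for the DGX GB200 NVL72 can be broken down per GPU with simple division. The sketch below is illustrative only: it assumes memory, bandwidth, and FP4 throughput are spread evenly across the 72 GPUs and uses decimal units (1 TB = 1,000 GB), and the variable names are made up for this example rather than taken from any NVIDIA tool.

```python
# Rack-level figures as quoted in the article (approximate values).
NVL72_GPUS = 72
NVL72_GRACE_CPUS = 36
UNIFIED_GPU_MEMORY_TB = 13.5
NVLINK_BANDWIDTH_TB_S = 130
FP4_EXAFLOPS = 1.4

# Assumption: every resource is divided evenly across the GPUs.
gpus_per_grace_cpu = NVL72_GPUS // NVL72_GRACE_CPUS
memory_per_gpu_gb = UNIFIED_GPU_MEMORY_TB * 1000 / NVL72_GPUS
nvlink_per_gpu_tb_s = NVLINK_BANDWIDTH_TB_S / NVL72_GPUS
fp4_per_gpu_pf = FP4_EXAFLOPS * 1000 / NVL72_GPUS

print(f"GPUs per Grace CPU:   {gpus_per_grace_cpu}")
print(f"Memory per GPU:       {memory_per_gpu_gb:.1f} GB")
print(f"NVLink per GPU:       {nvlink_per_gpu_tb_s:.2f} TB/s")
print(f"FP4 per GPU:          {fp4_per_gpu_pf:.1f} PF")
```

Because the inputs are rounded marketing figures, the per-GPU results are rough estimates rather than specifications.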
The DGX SuperPOD is NVIDIA's scale-out reference architecture for connecting many DGX systems into a single supercomputer. Rather than a single product, it is a validated blueprint covering compute nodes, InfiniBand and Ethernet networking, management nodes, storage, power, and cooling, sold as a turnkey solution. The design is modular: nodes are grouped into "scalable units," for example 32 DGX H100 nodes per scalable unit, and a SuperPOD can range from a handful of nodes to several thousand. [10][11]
NVIDIA's internal Eos supercomputer is the reference example of a DGX SuperPOD. Revealed in detail in 2024, Eos is built from 576 DGX H100 systems, totaling 4,608 H100 GPUs interconnected with Quantum-2 400 Gb/s InfiniBand, and delivers about 18.4 exaflops of FP8 AI performance. Measured on the double-precision LINPACK benchmark, Eos recorded an Rmax of roughly 121 petaflops, which placed it among the top ten systems on the TOP500 list at the time. NVIDIA uses Eos to develop and benchmark its own models and software. [10][12]
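The Eos figures follow from the node count. As a rough check, the sketch below recomputes the GPU total and the implied per-GPU FP8 throughput. It also counts scalable units under the assumption that Eos uses the 32-node unit given as an example above, which the text does not state explicitly. The function and constant names are hypothetical.

```python
GPUS_PER_DGX_H100 = 8          # eight H100 GPUs per DGX H100 node
NODES_PER_SCALABLE_UNIT = 32   # example unit size; assumed to apply to Eos
EOS_NODES = 576
EOS_FP8_EXAFLOPS = 18.4

def superpod_summary(nodes, gpus_per_node=GPUS_PER_DGX_H100,
                     nodes_per_unit=NODES_PER_SCALABLE_UNIT):
    """Return (total GPUs, full scalable units, leftover nodes)."""
    full_units, leftover_nodes = divmod(nodes, nodes_per_unit)
    return nodes * gpus_per_node, full_units, leftover_nodes

total_gpus, scalable_units, leftover_nodes = superpod_summary(EOS_NODES)

# Implied per-GPU FP8 throughput in petaflops (1 exaflop = 1,000 petaflops).
fp8_per_gpu_pf = EOS_FP8_EXAFLOPS * 1000 / total_gpus

print(f"Total GPUs:      {total_gpus}")
print(f"Scalable units:  {scalable_units} (+{leftover_nodes} leftover nodes)")
print(f"FP8 per GPU:     {fp8_per_gpu_pf:.2f} PF")
```

The GPU total matches the 4,608 quoted in the text, and under the 32-node assumption the 576 nodes divide into whole scalable units with none left over.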
DGX Cloud is a service that rents DGX infrastructure rather than selling the hardware outright. Jensen Huang announced it at GTC in March 2023, positioning it so enterprises could reach an AI supercomputer "from a browser." Crucially, NVIDIA does not run its own data centers for the service; instead it places DGX-configured capacity inside partner clouds and prices the service itself. Oracle Cloud Infrastructure was the first host, with Microsoft Azure and Google Cloud following. At launch, instances were priced at about 36,999 US dollars per month for an eight-GPU node, a premium relative to comparable on-demand GPU instances from the major clouds. [13][14]
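To compare the launch price with hourly on-demand GPU rates, the monthly figure can be converted to an approximate cost per GPU-hour. This sketch assumes continuous use and an average month of 730 hours (8,760 hours per year divided by 12). Both are assumptions of the example, not terms of the service.

```python
MONTHLY_PRICE_USD = 36_999     # launch price per eight-GPU node, as quoted
GPUS_PER_NODE = 8
HOURS_PER_MONTH = 365 * 24 / 12  # assumed average month: 730 hours

# Assumption: the node runs around the clock for the whole month.
price_per_gpu_month = MONTHLY_PRICE_USD / GPUS_PER_NODE
price_per_gpu_hour = price_per_gpu_month / HOURS_PER_MONTH

print(f"Per GPU per month: ${price_per_gpu_month:,.2f}")
print(f"Per GPU-hour:      ${price_per_gpu_hour:.2f}")
```

Lower utilization would raise the effective hourly cost, since the monthly price is fixed regardless of how many hours the node is busy.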
NVIDIA later adjusted its cloud strategy. By 2025 the company had stepped back from positioning DGX Cloud as a direct rival to Amazon Web Services and Microsoft Azure, redirecting much of the capacity toward its own internal research and model development. In May 2025 it introduced DGX Cloud Lepton, a marketplace that aggregates GPU capacity from multiple cloud and infrastructure partners and connects developers to it, with NVIDIA acting as an intermediary rather than an owner-operator. [14][15]
Alongside the data-center line, NVIDIA has long offered smaller deskside DGX systems. The original DGX Station, introduced in 2017, was a workstation-form-factor machine with four V100 GPUs, and a DGX Station A100 followed with four A100 GPUs. In the Blackwell generation NVIDIA revived and expanded this category. [1]
At CES in January 2025 the company previewed a compact personal AI computer under the codename Project DIGITS. In March 2025, at its spring GTC event, NVIDIA renamed it DGX Spark and detailed the design. DGX Spark is built around the GB10 Grace Blackwell superchip, co-developed with MediaTek, pairing a 20-core Arm CPU with a Blackwell GPU and 128 GB of coherent CPU-GPU LPDDR5X memory, and includes ConnectX-7 200 Gb/s networking. NVIDIA rates it at up to about 1 petaflop of AI performance at low precision and positions it as a desktop machine for prototyping and running models locally. DGX Spark began shipping in October 2025 with a starting price of 3,999 US dollars, sold both directly and through partners. [16][17]
NVIDIA also announced a more powerful deskside system, the DGX Station, built on the GB300 Grace Blackwell Ultra superchip. It combines a 72-core Grace CPU with a Blackwell Ultra GPU over a 900 GB/s NVLink-C2C link, offers 784 GB of coherent memory (288 GB of HBM3e plus 496 GB of LPDDR5X), and is rated at up to roughly 20 petaflops of AI performance, with ConnectX-8 networking for scaling. NVIDIA presented it as a deskside machine capable of working with models up to about a trillion parameters, available to order through partners including Asus, Dell, Gigabyte, HP, MSI, and Supermicro and shipping during 2025. [18][19]
DGX systems have played an outsized role in the modern AI boom relative to their unit volume. The hand-delivered DGX-1 helped seed early research at OpenAI, and the line established the template (integrated GPUs, interconnect, and software) that NVIDIA, its server partners, and cloud providers later replicated at enormous scale. The SuperPOD reference architecture turned that template into a repeatable way to stand up frontier-scale training clusters, and DGX hardware doubles as the showcase and validation platform for each new NVIDIA GPU architecture. By extending the brand downward to DGX Spark and the Blackwell-era DGX Station, NVIDIA has sought to put the same software environment used in its largest data-center systems onto individual developers' desks. [1][8][16]