NVIDIA HGX

Updated Jul 28, 2026

NVIDIA HGX is a family of accelerated-server platform designs built around tightly connected data-center GPUs. It is not one immutable board specification. HGX-1 was introduced as an eight-GPU expansion chassis, HGX-2 could join two eight-GPU baseboards into a 16-GPU system, and later generations have included four-GPU and eight-GPU baseboards. NVIDIA's current product page describes HGX B200, B300, and Rubin as single baseboards carrying eight SXM GPU modules.^[1] Across the family, NVLink carries supported GPU-to-GPU traffic, while most eight-GPU designs since HGX-2 use NVSwitch chips to create a switched local fabric.

HGX specifies the accelerator complex and an integration boundary rather than every part of a finished server. NVIDIA supplies platform designs or assemblies, and a system builder integrates the host processors, system memory, storage, enclosure, power, cooling, firmware, management, and other components. That boundary changes by generation. Most x86 HGX systems leave the CPU and memory to the original equipment manufacturer, HGX B300 adds per-GPU ConnectX-8 network interfaces to the baseboard, and HGX Vera Rubin specifies a Vera CPU and its memory.^[1]^[12]

Scope and terminology

The exact physical unit associated with HGX must be taken from the generation-specific documentation. Neither "a complete server" nor "a single printed circuit board" describes the entire family accurately across its history. The term "HGX server" usually refers to a finished system that contains an HGX accelerator platform, not to the baseboard alone.

NVIDIA uses "SXM" as the module designation for the high-power GPU packages mounted on modern HGX baseboards. Public NVIDIA material reviewed for this article does not define SXM as an expansion of an acronym, so this article does not assign one. SXM modules and PCI Express add-in cards can use the same GPU architecture, but they are different system forms. In HGX, the host reaches GPUs and fabric-management devices through PCI Express, while NVLink carries supported peer traffic inside the GPU complex.^[5]^[7]

NVLink and NVSwitch generation numbers describe related but different components. NVIDIA's current product table labels Hopper, Blackwell, and Rubin links as fourth-, fifth-, and sixth-generation NVLink. The Fabric Manager guide separately calls the NVSwitch chips used with H100 and B200/B300 third- and fourth-generation switches.^[1]^[7] A phrase such as "fifth-generation NVLink" therefore does not establish that the associated switch ASIC is a fifth-generation NVSwitch.
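
The offset between link and switch generations can be kept explicit in a small lookup rather than inferred. The following is a minimal sketch, not NVIDIA tooling: the dictionary and function names are hypothetical, the pairings are only those stated in this article (the A100 pairing comes from the HGX A100 section), and the Rubin switch generation is left as None because the cited table does not give one.

```python
from typing import Optional

# (NVLink generation, NVSwitch generation) as stated in this article.
# None marks a value the cited sources do not give.
NVLINK_NVSWITCH_GENERATION = {
    "A100": (3, 2),
    "Hopper": (4, 3),
    "Blackwell": (5, 4),
    "Rubin": (6, None),
}


def switch_generation(architecture: str) -> Optional[int]:
    """Look up the NVSwitch generation instead of deriving it from the NVLink number."""
    _nvlink_generation, nvswitch_generation = NVLINK_NVSWITCH_GENERATION[architecture]
    return nvswitch_generation


print(switch_generation("Blackwell"))  # 4, although the links are fifth-generation
```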

Product boundary

Accelerator complex and host system

In a modern eight-GPU HGX system, the baseboard contains the SXM modules, their local high-bandwidth memory, NVLink wiring, NVSwitch devices, and supporting electrical infrastructure. The complete server supplies host processors and memory, storage, power, cooling, enclosure, firmware, service access, and external network connections. NVIDIA's HGX A100 integration guidance explicitly assigned the CPU subsystem, networking, storage, power, form factor, and node management to the system partner.^[5]

This division does not make partner systems identical. CPU count, PCIe topology, system memory, network adapters, storage layout, chassis size, cooling method, firmware, and support terms can differ between products built around the same HGX generation. NVIDIA's certification program evaluates named complete systems and defines reference configurations with CPU, GPU, adapter, and network-bandwidth counts rather than treating the accelerator board as the full machine.^[16]

The boundary has moved toward greater integration in some recent designs. NVIDIA's enterprise reference architecture places eight ConnectX-8 SuperNICs on the HGX B300 baseboard for east-west traffic, one per GPU. The same document still assigns host CPUs, at least 2 TB of system memory, local storage, a north-south DPU, PCIe layout, security hardware, and remote management to the complete system.^[12] HGX Vera Rubin is another exception to a simple "GPU board only" description: NVIDIA lists a configuration with eight Rubin SXM modules, one NVIDIA Vera CPU, and LPDDR5X CPU memory, alongside an x86 HGX Rubin NVL8 whose CPU and memory remain OEM-defined.^[1]

Scale-up and scale-out paths

NVLink is the local scale-up interconnect. Direct-link four-GPU boards route NVLink between GPUs without an on-board switch. NVSwitch-based boards terminate multiple NVLinks at switch ASICs so that the GPU set can communicate through a switched fabric. PCIe remains the host attachment and can also connect network and storage devices. Communication between separate conventional HGX servers normally uses network interfaces and a scale-out fabric; an external NVLink Switch system is a different topology that can extend an NVLink domain beyond one baseboard.^[8]^[13]

Published bandwidth needs an explicit boundary. NVIDIA generally reports each GPU's total bidirectional NVLink capacity, summed across that GPU's links, such as 900 GB/s for H100, 1.8 TB/s for Blackwell, and a preliminary 3.6 TB/s for Rubin.^[1]^[8]^[13] Per-GPU bandwidth, board total, switch aggregate, bisection bandwidth, one-directional bandwidth, and measured application throughput are not interchangeable.
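
The difference between these accounting scopes can be illustrated with simple arithmetic. This sketch uses hypothetical helper names and assumes that a bidirectional figure splits evenly between the two directions; it reproduces only nameplate sums and says nothing about bisection bandwidth or measured throughput.

```python
def per_direction(bidirectional_gb_s: float) -> float:
    """Assumption: the bidirectional figure is symmetric, so each direction gets half."""
    return bidirectional_gb_s / 2


def board_total(per_gpu_bidirectional_gb_s: float, gpu_count: int) -> float:
    """Sum of per-GPU nameplate figures; not a bisection or application rate."""
    return per_gpu_bidirectional_gb_s * gpu_count


H100_PER_GPU = 900        # GB/s, bidirectional, per GPU
BLACKWELL_PER_GPU = 1800  # GB/s, bidirectional, per GPU

print(per_direction(H100_PER_GPU))        # 450.0
print(board_total(BLACKWELL_PER_GPU, 8))  # 14400, i.e. the published 14.4 TB/s board total
```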

History and architectural generations

Design	Public introduction	Physical organization	Local interconnect	Published interconnect limit
HGX-1	2017	Eight P100 GPUs in an expansion chassis	NVLink hybrid cube mesh plus PCIe switching; no NVSwitch	Topology-dependent paths.^[2]^[3]
HGX-2	2018	One eight-V100 baseboard or two joined baseboards	Six first-generation NVSwitch chips per board	Up to 16 GPUs in one fabric; 300 GB/s bidirectional per GPU.^[3]^[4]
HGX A100	2020	Four-GPU direct-link board or eight-GPU switched board; two eight-GPU boards could be joined	Six second-generation NVSwitch chips on an eight-GPU board	600 GB/s bidirectional per A100 on the switched board.^[5]^[6]^[7]
HGX H100 and H200	2022 and 2023	Four-GPU direct-link board or eight-GPU switched board	Four third-generation NVSwitch chips on an eight-GPU board	900 GB/s bidirectional per GPU; no two-baseboard NVLink mode on the eight-GPU board.^[7]^[8]^[9]
HGX B200 and B300	2024 and 2025	Eight-GPU baseboard	Two fourth-generation NVSwitch ASICs using fifth-generation NVLink	1.8 TB/s bidirectional per GPU; 14.4 TB/s published board total.^[1]^[7]^[10]^[11]
HGX Rubin NVL8	2026	Eight-GPU baseboard with an x86 host design or a Vera CPU configuration	Sixth-generation NVLink and NVLink Switch	Preliminary 3.6 TB/s bidirectional per GPU and 28.8 TB/s switch bandwidth.^[1]^[13]^[14]

The table summarizes physical organization and published interconnect limits. It is not a performance ranking. GPU count, memory technology, numerical format, sparsity assumptions, interconnect accounting, and whether a figure covers a GPU, board, node, or rack all change between generations.

HGX-1

NVIDIA, Microsoft, and Ingrasys announced HGX-1 on March 8, 2017. The open-source design placed eight Tesla P100 GPUs in a chassis and combined NVLink with PCIe switching so that a host could attach to different GPU groupings. It was released with Microsoft's Project Olympus contribution to the Open Compute Project.^[2]

HGX-1 did not use NVSwitch. NVIDIA's later HGX-2 technical description characterized the earlier local fabric as a hybrid cube mesh. Some GPU pairs had one or two NVLink paths, while other communication could traverse PCIe; this was not the uniform switched fabric introduced with HGX-2.^[3] NVIDIA's launch comparisons with CPU servers and ATX designs were vendor positioning tied to 2017 workloads, so they are not used here as architectural specifications.

HGX-2

HGX-2, introduced in May 2018, was the first HGX design built around NVSwitch. Each baseboard held eight 32 GB Tesla V100 GPUs and six first-generation NVSwitch chips. Each V100 supplied six NVLinks, one to each switch on its board. A full two-board platform used passive bridge boards to form a 16-GPU domain for which NVIDIA specified 300 GB/s of bidirectional NVLink bandwidth per GPU.^[3]

The data sheet also allowed a manufacturing partner to build an eight-GPU system with one baseboard.^[4] HGX-2 therefore cannot be reduced to either "an eight-GPU board" or "a 16-GPU board" without qualification. The board held eight GPUs, while the named two-board platform joined sixteen. NVIDIA described the optimized baseboard as its contribution and left mechanics, power, cooling, cabling, and other system-level work to partners; DGX-2 was the first finished NVIDIA system built from the platform.^[3]

HGX A100

The NVIDIA A100 generation in 2020 offered multiple topologies. The four-GPU baseboard connected its GPUs directly with third-generation NVLink and provided 200 GB/s of bidirectional peer bandwidth. The eight-GPU baseboard used six second-generation NVSwitch chips, with twelve NVLink ports per GPU and two links routed to each switch. NVIDIA specified 600 GB/s of bidirectional NVLink bandwidth per GPU on the switched design.^[5]^[6]^[7]

Two eight-GPU A100 baseboards could be joined through switch-to-switch links to form a 16-GPU fabric. That ability did not continue in the same form on H100 or Blackwell eight-GPU baseboards.^[7] The A100 documentation also illustrates why the component generation numbers should be kept separate: the same platform used third-generation NVLink and second-generation NVSwitch.^[6]

HGX H100 and H200

The eight-GPU NVIDIA H100 baseboard reduced the switch count from six to four third-generation NVSwitch chips while increasing each GPU's published bidirectional NVLink capacity to 900 GB/s. Every H100 connects to all four switches, but the physical allocation is asymmetric across switch pairs: the Fabric Manager guide documents four links to each of two switches and five links to each of the other two. The board provides switched connectivity among eight GPUs but does not support the A100 generation's two-baseboard NVLink connection.^[7]^[8]

NVIDIA also described a four-GPU direct-link H100 board and a separate H100 option that could attach to an external NVLink Switch system. The latter was announced with support for domains of up to 256 GPUs, but that external fabric is not the same as the four on-board switches in the ordinary eight-GPU baseboard.^[8]

NVIDIA H200 retained the Hopper platform topology while moving to 141 GB of HBM3e memory per GPU at a published 4.8 TB/s. NVIDIA offered four- and eight-way HGX H200 boards and described them as hardware- and software-compatible with HGX H100 systems.^[9] That statement applies to the H100-to-H200 transition. It does not establish that later generations are drop-in upgrades for every chassis, cooling design, firmware stack, or host platform.

HGX B200 and B300

Blackwell changed the eight-GPU board to two fourth-generation NVSwitch ASICs. The Fabric Manager guide says that each NVIDIA B200, B300, or B100 GPU connects nine links to each switch, for eighteen links per GPU, and that this generation does not support two-baseboard NVLink connections.^[7] NVIDIA specified 1.8 TB/s of bidirectional fifth-generation NVLink bandwidth per GPU. Its current HGX page gives 14.4 TB/s as the eight-GPU board total and marks B200 and B300 as shipping.^[1]^[10]
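
The per-GPU link allocations described for HGX-2, HGX A100, HGX H100, and the Blackwell boards can be tallied in a few lines. This is an illustrative sketch with hypothetical names. The link counts and per-GPU bandwidths are the ones given in this article; the per-link value is derived here by division and is not quoted from a source.

```python
# Links from one GPU to each on-board switch, and per-GPU bidirectional GB/s,
# as described in this article for the eight-GPU switched boards.
BOARDS = {
    "HGX-2 (V100)": ([1] * 6, 300),
    "HGX A100 8-GPU": ([2] * 6, 600),
    "HGX H100 8-GPU": ([4, 4, 5, 5], 900),
    "HGX B200/B300": ([9, 9], 1800),
}


def summarize(links_per_switch, per_gpu_gb_s):
    """Return (links per GPU, derived bidirectional GB/s per link)."""
    total_links = sum(links_per_switch)
    return total_links, per_gpu_gb_s / total_links


for name, (links, bandwidth) in BOARDS.items():
    total, per_link = summarize(links, bandwidth)
    print(f"{name}: {total} links per GPU, {per_link:.0f} GB/s per link (derived)")
```

The tally shows that H100 and Blackwell GPUs expose the same number of links even though the switch count drops from four to two, so the higher per-GPU figure on Blackwell comes from the rate of each link rather than from additional links.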

NVIDIA announced Blackwell Ultra in March 2025 using the name "HGX B300 NVL16 system." Current product documentation and the May 2026 enterprise reference architecture instead describe an HGX B300 baseboard with eight B300 GPUs.^[1]^[11]^[12] The launch suffix should not be read as proof of sixteen physical GPU packages on the current baseboard.

B300 also changes the networking boundary. Its baseboard integrates eight ConnectX-8 interfaces for east-west traffic. A complete certified system still adds a separate north-south DPU and OEM-defined host, memory, storage, security, management, power, and cooling components.^[12]

Two current official NVIDIA documents disagree on B300 memory capacity. The HGX product page lists 2.1 TB of total GPU memory, while the enterprise reference architecture lists 288 GB per GPU and 2.30 TB per eight-GPU node.^[1]^[12] Eight times 288 GB is 2,304 GB, but the documents do not explain the smaller published total. This article therefore does not choose one value or silently reconcile the difference. An exact purchase configuration should be checked against its model-specific data sheet and supplier bill of materials.
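
The size of the discrepancy can be stated numerically without choosing between the sources. The helper below is a hypothetical sketch that assumes decimal units (1 TB = 1,000 GB); it quantifies the gap between each published total and the per-GPU product and does not explain it.

```python
def gap_gb(per_gpu_gb: int, gpu_count: int, published_total_tb: float) -> int:
    """Published total minus the per-GPU product, in GB (assumes 1 TB = 1000 GB)."""
    published_gb = round(published_total_tb * 1000)
    return published_gb - per_gpu_gb * gpu_count


print(gap_gb(288, 8, 2.30))  # -4, consistent with rounding 2,304 GB to two decimals
print(gap_gb(288, 8, 2.1))   # -204, too large to be a rounding effect at one decimal
```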

HGX Rubin NVL8

NVIDIA announced the Rubin platform in January 2026 and identified HGX Rubin NVL8 as an eight-GPU server board for x86 systems. Its current HGX page also lists HGX Vera Rubin NVL8 with one Vera CPU. These eight-GPU configurations are distinct from the 72-GPU Vera Rubin NVL72 rack-scale system.^[1]^[14]

NVIDIA's current specification table gives each Rubin GPU a preliminary 3.6 TB/s of bidirectional sixth-generation NVLink capacity and the eight-GPU platform 28.8 TB/s of NVLink switch bandwidth. The same table labels all Rubin values as preliminary maximums subject to change, and its comparative workload charts are projected rather than measured shipping-system results.^[1]^[13]

Availability language also needs a date and a narrow interpretation. On May 31, 2026, NVIDIA said that supply-chain partners were manufacturing Vera Rubin systems at scale, while the release's availability section said production shipments were set to begin in fall 2026. It also warned that many described products and features remained subject to availability and change.^[15] At the July 28, 2026 research cutoff, detailed Rubin specifications should therefore remain labeled preliminary, and the production ramp should not be turned into a claim of general customer availability.

Relationship to DGX, MGX, and rack-scale NVL

NVIDIA uses several overlapping platform brands. They are best separated by the unit being specified rather than treated as permanently exclusive layers.

Name	Primary unit	Who completes the system	What the name alone does not establish
HGX	Accelerator platform, baseboard, or generation-specific reference configuration	NVIDIA plus an OEM, ODM, cloud operator, or NVIDIA's own system organization	Exact host CPU, storage, chassis, cooling, firmware, and support package
NVIDIA DGX	Finished NVIDIA-branded system with software and support	NVIDIA	That an OEM server using the same GPU has the DGX bill of materials
MGX	Modular server and rack reference architecture	OEM and ODM partners	One fixed GPU count, CPU family, interconnect, or physical form
Rack-scale NVL	Multi-tray or full-rack NVLink system	NVIDIA and system partners, depending on product	That the product is an eight-GPU HGX baseboard

DGX is a complete NVIDIA product. The DGX B200 guide, for example, specifies eight B200 GPUs, two Intel Xeon CPUs, system memory, NVMe storage, ConnectX and BlueField networking, a management controller, six power supplies, a 10U enclosure, DGX software, and NVIDIA support. It gives that particular finished system a maximum power figure of about 14.3 kW.^[17] Those components, dimensions, power requirements, and support terms are not implied merely by the phrase "HGX B200."

MGX is broader and more modular. NVIDIA's current description covers varied NVIDIA, x86, and other Arm CPUs; PCIe and rack-scale GPU options; networking; power; cooling; and common rack designs.^[18] An MGX-based system can overlap other NVIDIA component families. MGX is not simply a smaller HGX, and HGX is not a component of every MGX configuration.

Rack-scale NVL systems use a different physical boundary. The NVIDIA GB200 NVL72, for example, joins 36 Grace CPUs and 72 Blackwell GPUs in a liquid-cooled rack with a 72-GPU NVLink domain.^[19] It is not an eight-GPU HGX board, even though both products appear in NVIDIA's data-center portfolio and use NVLink technology.

OEM integration, certification, and lifecycle

NVIDIA's certified-system inventory documents the intended multi-vendor model. Its HGX reference-configuration names encode CPU count, GPU count, network-adapter count, and average GPU network bandwidth. At the research cutoff, the 2-8-9-400 configuration covered H100 and H200 systems, while 2-8-9-800 and 2-8-10-800 covered B200 and B300 systems. The inventory names individual partner servers and supported network devices.^[16]
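
A reference-configuration name of this form can be split into its four fields mechanically. The sketch below is illustrative: the class, field, and function names are hypothetical, and the unit of the bandwidth field is deliberately left to the certification inventory's own definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceConfig:
    cpus: int
    gpus: int
    network_adapters: int
    avg_gpu_network_bandwidth: int  # unit as defined by the certification inventory


def parse_reference_config(name: str) -> ReferenceConfig:
    """Split a name such as '2-8-9-400' into its four integer fields."""
    parts = name.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected four hyphen-separated fields, got {name!r}")
    return ReferenceConfig(*(int(part) for part in parts))


print(parse_reference_config("2-8-10-800"))
```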

Certification narrows a configuration to tested combinations, but it does not make every certified system identical. Buyers still need the exact server documentation for:

host CPU model, socket count, system memory, and PCIe placement;
air or liquid cooling, power limits, rack density, and facility requirements;
local and remote storage paths;
east-west compute networking and north-south service or storage networking;
firmware, GPU driver, operating system, virtualization, and management support;
warranty, field service, and software-support responsibility.

Cloud instance names require the same caution. An instance advertised by GPU model and count is not, by that description alone, proven to use a particular HGX board. The provider's device inventory, topology, and system documentation control that identification.

Generation names also do not guarantee in-place upgradeability. H100-to-H200 compatibility was expressly documented, but later HGX generations change GPU modules, NVLink and NVSwitch generations, host interfaces, networking, firmware, driver requirements, and thermal design.^[7]^[9] Any upgrade claim needs an OEM-supported path for the exact system. The product lifecycle should likewise distinguish an announcement, partner qualification, manufacturing ramp, production shipment, and general availability rather than treating them as synonyms.

Fabric initialization and management

NVSwitch hardware requires a control plane. NVIDIA Fabric Manager is a privileged service used on NVSwitch-based HGX and DGX systems. It coordinates with the GPU driver, configures routing and GPU port maps where applicable, initializes supported links, and monitors NVLink and NVSwitch errors.^[7]

The Blackwell control plane adds NVLink Subnet Manager, or NVLSM. NVIDIA documents NVLSM as discovering the NVLink topology, assigning local identifiers, calculating and programming switch forwarding tables, programming partition keys, and monitoring fabric changes. Fabric Manager remains responsible for GPU-side routing, NVLink configuration, driver coordination, and partition-management interfaces.^[7] B200 and B300 systems use fourth-generation NVSwitches based on the NVLink 5 protocol and require the additional NVLSM service.

Fabric Manager, NVLSM where required, and the data-center GPU driver must be compatible. NVIDIA's public guide lists different minimum driver branches for HGX-2 and A100, H100, and B200/B300 systems.^[7] Exact package names and supported versions change, so an operator should follow the guide and OEM support matrix for the deployed generation rather than copy a package version from a static encyclopedia entry.
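
A pre-deployment check of this kind can be sketched as a pure function. Everything here is an assumption made for illustration: the function names are hypothetical, the version strings are supplied by the operator, the minimum branch must be read from the current guide for the deployed generation, and the exact-match rule between driver and Fabric Manager is the strictest simple policy rather than a quotation of NVIDIA's requirements.

```python
from typing import List


def branch(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.split(".")[0])


def check_stack(driver_version: str, fabric_manager_version: str,
                minimum_branch: int) -> List[str]:
    """Return a list of problems; an empty list means the sketch found none."""
    problems = []
    if branch(driver_version) < minimum_branch:
        problems.append(
            f"driver branch {branch(driver_version)} is below minimum {minimum_branch}")
    # Assumed policy: treat any difference between the two versions as a mismatch.
    if driver_version != fabric_manager_version:
        problems.append(
            f"driver {driver_version} and Fabric Manager {fabric_manager_version} differ")
    return problems


# Made-up version strings, used only to exercise the function.
print(check_stack("600.1.2", "600.1.2", minimum_branch=550))  # []
```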

The management software also supports generation-specific degraded-operation policies for some link, GPU, or switch failures. Depending on platform and policy, Fabric Manager can remove affected partitions or retain partitions at reduced fabric bandwidth.^[7] This does not support a general claim that every HGX system can continue through any switch or link failure.

Partitioning and virtualization

An NVSwitch fabric can be divided into approved groups of whole GPUs for virtual-machine or multi-tenant deployment. NVIDIA documents bare-metal or full-passthrough mode, shared NVSwitch mode, and virtual-GPU mode. In shared mode, a trusted service virtual machine manages the fabric while guest virtual machines receive assigned GPUs; the switch fabric is shared but is not directly exposed to those guests.^[7]

Default H100, H200, B200, and B300 partition tables include eight-, four-, two-, and one-GPU groupings. A one-GPU partition has its NVLinks disabled because it has no peer GPU. Multi-GPU partitions retain only the links and switch resources assigned to that group.^[7] These are fabric partitions across physical GPUs. They are different from subdividing the memory and compute resources of one GPU with Multi-Instance GPU.
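
The whole-GPU nature of fabric partitions can be modeled with sets of GPU indices. This is a conceptual sketch, not the Fabric Manager API: the function names and the 0 to 7 index scheme are hypothetical, and the rule that simultaneously active partitions must not share a GPU is an assumption that follows from each partition owning whole GPUs.

```python
def nvlinks_enabled(partition: frozenset) -> bool:
    """A one-GPU partition has no peer GPU, so its NVLinks are disabled."""
    return len(partition) > 1


def can_coexist(partitions) -> bool:
    """Assumed rule: partitions may be active together only if no GPU appears twice."""
    seen = set()
    for partition in partitions:
        if seen & partition:
            return False
        seen |= partition
    return True


requested = [frozenset({0, 1, 2, 3}), frozenset({4, 5}), frozenset({6})]
print(can_coexist(requested))                  # True
print([nvlinks_enabled(p) for p in requested])  # [True, True, False]
```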

Partition availability depends on the generation, topology, deployment mode, driver, and fabric-management software. A list of default partitions is not a promise that every OEM firmware combination exposes every virtualization arrangement. Operators need the current Fabric Manager guide and the system vendor's qualification matrix.

Software and operations

Fabric initialization is only one layer of a working multi-GPU system. NCCL provides collective and point-to-point communication primitives and uses topology information across transports that include PCIe, NVLink, InfiniBand, and IP sockets.^[20] A working NVSwitch fabric is necessary for the intended local topology, but it does not choose the best collective algorithm, repair a scale-out network, or guarantee application scaling.
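
The gap between a link rate and a collective's completion time can be seen in a textbook model. The sketch below uses the standard ring all-reduce traffic volume, in which each of N ranks sends 2(N - 1)/N times the message size. It is an idealized lower bound under stated assumptions (no latency, perfect overlap, a caller-supplied per-direction rate) and does not describe which algorithm NCCL selects; the function names are hypothetical.

```python
def ring_allreduce_bytes_sent(message_bytes: float, ranks: int) -> float:
    """Bytes each rank sends in a textbook ring all-reduce: 2 * (N - 1) / N * S."""
    if ranks < 2:
        return 0.0
    return 2 * (ranks - 1) / ranks * message_bytes


def ideal_seconds(message_bytes: float, ranks: int,
                  per_direction_bytes_per_s: float) -> float:
    """Idealized time: traffic volume divided by an assumed one-way rate."""
    return ring_allreduce_bytes_sent(message_bytes, ranks) / per_direction_bytes_per_s


# An 8 GB buffer across 8 ranks moves 14 GB per rank, more than the buffer itself.
print(ring_allreduce_bytes_sent(8e9, 8))
```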

Operational qualification should align the server firmware, GPU driver, Fabric Manager, NVLSM where applicable, NCCL, operating system, virtualization layer, and OEM support matrix. CPU-to-GPU PCIe placement, network-interface locality, storage paths, power limits, cooling, and service procedures can affect delivered performance and availability even when the same HGX baseboard is present.^[5]^[12]

Independent measurements also show why a nameplate rate is not an application benchmark. A peer-reviewed study evaluated PCIe, two NVLink generations, NV-SLI, NVSwitch, and GPUDirect across six earlier multi-GPU systems, including DGX-2. It found communication effects tied to topology, connectivity, routing, GPU selection, PCIe layout, message size, and the communication primitive.^[21] The paper does not benchmark current HGX generations, but it supports the narrower point that measured behavior cannot be inferred from one aggregate link number.

Reading specifications and limitations

HGX comparisons should keep several qualifiers attached to every number:

Qualifier	Questions to resolve
Unit	Does the value cover one GPU, a baseboard, a complete node, or a rack?
Direction and aggregation	Is it one-way, bidirectional, per GPU, board total, bisection, or switch aggregate?
Arithmetic convention	Is the rate dense or sparse, and which numerical format is being counted?
Evidence type	Is it a component peak, projected workload comparison, or measured application result?
Document date	Does it come from a launch announcement, current product page, reference architecture, or supplier bill of materials?
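
One way to keep such qualifiers attached is to store them alongside each number and refuse comparisons when they differ. This is a minimal sketch with hypothetical names; it covers only a subset of the qualifiers in the table (unit, scope, direction, and evidence type) and uses the Blackwell figures cited in this article as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecValue:
    value: float
    unit: str       # e.g. "TB/s"
    scope: str      # "gpu", "board", "node", or "rack"
    direction: str  # "one-way" or "bidirectional"
    evidence: str   # "component peak", "projected", or "measured"


def comparable(a: SpecValue, b: SpecValue) -> bool:
    """Two figures are compared only when every recorded qualifier matches."""
    return ((a.unit, a.scope, a.direction, a.evidence)
            == (b.unit, b.scope, b.direction, b.evidence))


blackwell_gpu = SpecValue(1.8, "TB/s", "gpu", "bidirectional", "component peak")
blackwell_board = SpecValue(14.4, "TB/s", "board", "bidirectional", "component peak")

print(comparable(blackwell_gpu, blackwell_board))  # False: the scopes differ
```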

Sparse and dense tensor-compute figures cannot be compared without their assumptions. FP4, FP8, BF16, TF32, FP32, and FP64 rates describe operations with different precision and semantics. The current Rubin table is preliminary, and the current B300 sources disagree on aggregate memory.^[1]^[12] This article therefore uses only bounded interconnect figures to describe architecture and does not translate them into unsupported claims about training time, inference throughput, cost, or energy use.

HGX reduces the accelerator-complex engineering left to a server builder, but it also fixes generation-specific choices inside that assembly. GPU count, module type, local fabric, and board interfaces cannot be customized as freely as a collection of independent PCIe cards. The local NVLink domain also does not remove the scale-out boundary. Multi-node jobs still depend on network adapters, switches, routing, congestion control, collective software, storage, and orchestration.

Aggregate high-bandwidth memory remains distributed across GPU packages; it is not automatically one coherent CPU memory pool. Software must partition data and schedule communication. Power and cooling also belong to the complete server and facility design, not to an isolated bandwidth figure. A chassis, power cap, thermal limit, workload shape, or software version can keep measured results below a component peak.

Finally, the cited material establishes a product family, a technical control plane, and a multi-vendor certification program. It does not establish a fixed HGX share of global AI compute or prove that every eight-GPU server or cloud instance uses HGX. Market-share, price, and total-cost claims need separately scoped, dated evidence and are not inferred from NVIDIA's partner lists.

References

[1] NVIDIA. "NVIDIA HGX Platform." Accessed July 28, 2026. nvidia.com/...hgx
[2] NVIDIA Newsroom. "NVIDIA and Microsoft Boost AI Cloud Computing With Launch of Industry-Standard Hyperscale GPU Accelerator." March 8, 2017. nvidianews.nvidia.com/...yperscale-gpu-accelerator
[3] Tsu, William. "HGX-2 Fuses HPC and AI Computing Architectures." NVIDIA Technical Blog, May 29, 2018. developer.nvidia.com/...hgx-2-fuses-ai-computing
[4] NVIDIA. "NVIDIA HGX-2" data sheet. May 2018. nvidia.com/...hgx2-datasheet.pdf
[5] Tsu, William. "Introducing NVIDIA HGX A100: The Most Powerful Accelerated Server Platform for AI and High Performance Computing." NVIDIA Technical Blog, May 14, 2020. developer.nvidia.com/...server-platform-for-ai-hpc
[6] NVIDIA. "NVIDIA HGX A100" data sheet. August 2020. nvidia.com/...nvidia-hgx-a100-datasheet.pdf
[7] NVIDIA. "NVIDIA Fabric Manager User Guide." Accessed July 28, 2026. docs.nvidia.com/...fabric-manager-user-guide
[8] Tsu, William. "Introducing NVIDIA HGX H100: An Accelerated Server Platform for AI and High-Performance Computing." NVIDIA Technical Blog, April 21, 2022. developer.nvidia.com/...high-performance-computing
[9] NVIDIA Newsroom. "NVIDIA Supercharges Hopper, the World's Leading AI Computing Platform." November 13, 2023. nvidianews.nvidia.com/...ing-ai-computing-platform
[10] NVIDIA Newsroom. "NVIDIA Blackwell Platform Arrives to Power a New Era of Computing." March 18, 2024. nvidianews.nvidia.com/...er-a-new-era-of-computing
[11] NVIDIA Newsroom. "NVIDIA Blackwell Ultra AI Factory Platform Paves Way for Age of AI Reasoning." March 18, 2025. nvidianews.nvidia.com/...y-for-age-of-ai-reasoning
[12] NVIDIA. "Components." NVIDIA HGX AI Factory Enterprise Reference Architecture. Updated May 18, 2026. docs.nvidia.com/...components
[13] NVIDIA. "NVIDIA NVLink and NVLink Switch." Accessed July 28, 2026. nvidia.com/...nvlink
[14] NVIDIA Newsroom. "NVIDIA Kicks Off the Next Generation of AI With Rubin: Six New Chips, One Incredible AI Supercomputer." January 5, 2026. nvidianews.nvidia.com/...platform-ai-supercomputer
[15] NVIDIA Newsroom. "NVIDIA Vera Rubin Ramps Into Full Production to Power Agentic AI Factories Worldwide." May 31, 2026. nvidianews.nvidia.com/...uction-agentic-ai-factory
[16] NVIDIA. "NVIDIA-Certified Systems." Certification Programs documentation. Accessed July 28, 2026. docs.nvidia.com/...nvidia-certified-systems
[17] NVIDIA. "Introduction to NVIDIA DGX B200 Systems." DGX B200 User Guide. Updated June 29, 2026. docs.nvidia.com/...introduction-to-dgxb200
[18] NVIDIA. "NVIDIA MGX." Accessed July 28, 2026. nvidia.com/...mgx
[19] NVIDIA. "NVIDIA GB200 NVL72." Accessed July 28, 2026. nvidia.com/...gb200-nvl72
[20] NVIDIA. "Overview of NCCL." NCCL 2.30 documentation. Accessed July 28, 2026. docs.nvidia.com/...overview
[21] Li, Ang, Shuaiwen Leon Song, Jieyang Chen, Jiajia Li, Xu Liu, Nathan R. Tallent, and Kevin J. Barker. "Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect." IEEE Transactions on Parallel and Distributed Systems 31, no. 1 (2020): 94-110. doi.org/...TPDS.2019.2928289

Reviewer note: Independent full fact-check completed 2026-07-28 against official NVIDIA product, topology, fabric-management, certification, and system documentation plus peer-reviewed interconnect research; duplicate scope, preliminary Rubin status, and conflicting B300 memory figures were separately reviewed.
