NVIDIA ConnectX
Last reviewed
Jun 3, 2026
Sources
24 citations
Review status
Source-backed
Revision
v1 · 2,516 words
NVIDIA ConnectX is a family of high-speed network adapters and SmartNICs (smart network interface cards) that sit at the edge of a server and connect it to the data center fabric. ConnectX cards are unusual in that a single device can speak two very different high-performance protocols: InfiniBand and Ethernet. That flexibility, branded by the original developer as Virtual Protocol Interconnect (VPI), is one reason the line has dominated high-performance computing and, more recently, the back-end networks that wire together GPU clusters for training and serving large AI models. The family originated at the Israeli networking company Mellanox Technologies and is now sold under NVIDIA Networking following NVIDIA's 2020 acquisition of Mellanox.[1][2]
A ConnectX adapter is far more than a simple Ethernet card. Each generation offloads a growing list of jobs from the host CPU into fixed-function and programmable hardware on the adapter itself: Remote Direct Memory Access (RDMA), congestion control, encryption, precise timing, packet steering, and collective-communication primitives. By doing this work in silicon, the NIC frees CPU cores for application code and slashes the latency of moving data between machines, which matters enormously when thousands of GPUs must exchange gradients and activations during a single training step.[3][4]
A conventional network interface card moves packets between the wire and host memory and lets the operating system's networking stack do everything else. A SmartNIC adds acceleration hardware so that latency-sensitive and CPU-hungry tasks happen on the adapter. ConnectX adapters accelerate, among other things, RDMA transport, virtual-switch and overlay-network processing, storage protocols, and security functions such as inline encryption.[3]
The most important of these is RDMA. RDMA lets one machine read from or write to the memory of another machine without involving either side's CPU or operating system kernel in the data path, producing a zero-copy transfer with very low latency. On native InfiniBand this is the default transport. On Ethernet, NVIDIA implements RDMA over Converged Ethernet, or RoCE. RoCE comes in two versions: RoCEv1 is a layer-2 protocol confined to a single Ethernet broadcast domain, while RoCEv2 encapsulates RDMA traffic in UDP/IP packets so it can be routed across layer-3 networks like ordinary IP traffic.[4][5]
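The difference between the two RoCE versions is easiest to see in how a packet is framed. The Python sketch below packs a 12-byte InfiniBand Base Transport Header (BTH) and wraps it in a UDP header the way RoCEv2 does. The function names, the default source port, and the zeroed checksum and ICRC fields are illustrative assumptions; EtherType 0x8915 (RoCEv1) and UDP destination port 4791 (RoCEv2) are the registered values. It shows the layering only and is not a working RDMA stack: the Ethernet and IP headers are omitted.

```python
import struct

ROCE_V1_ETHERTYPE = 0x8915  # RoCEv1: InfiniBand headers ride directly in an Ethernet frame
ROCE_V2_UDP_PORT = 4791     # RoCEv2: IANA-assigned UDP destination port


def build_bth(opcode: int, dest_qp: int, psn: int,
              pkey: int = 0xFFFF, ack_req: bool = False) -> bytes:
    """Pack a 12-byte Base Transport Header (BTH)."""
    if not 0 <= dest_qp < (1 << 24) or not 0 <= psn < (1 << 24):
        raise ValueError("dest_qp and psn are 24-bit fields")
    flags = 0  # solicited event, migration state, pad count, version: all zero here
    qp_word = dest_qp                       # upper 8 bits reserved
    psn_word = (int(ack_req) << 31) | psn   # top bit is the ack-request flag
    return struct.pack("!BBHII", opcode, flags, pkey, qp_word, psn_word)


def build_udp_header(src_port: int, payload_len: int) -> bytes:
    """UDP header for RoCEv2. Checksum is left at 0 in this sketch."""
    return struct.pack("!HHHH", src_port, ROCE_V2_UDP_PORT, 8 + payload_len, 0)


def encapsulate_rocev2(bth: bytes, payload: bytes, src_port: int = 49152) -> bytes:
    """Return UDP header + BTH + payload + 4-byte ICRC placeholder."""
    icrc = b"\x00" * 4  # placeholder only; a real ICRC is a CRC over the packet
    body = bth + payload + icrc
    return build_udp_header(src_port, len(body)) + body


if __name__ == "__main__":
    bth = build_bth(opcode=0x04, dest_qp=0x12, psn=7)  # 0x04: RC SEND Only
    packet = encapsulate_rocev2(bth, b"hello")
    print(len(bth), len(packet))
```

Because the RDMA headers sit behind an ordinary UDP header, any IP router can forward the packet, which is what makes RoCEv2 routable where RoCEv1 is not.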
A SmartNIC such as ConnectX should be distinguished from a data processing unit (DPU) such as NVIDIA's BlueField line. A DPU pairs the same ConnectX networking silicon with general-purpose Arm CPU cores and additional acceleration engines, so it can run an entire infrastructure software stack (hypervisor offload, software-defined storage, a distributed firewall) independently of the host. A ConnectX SuperNIC, by contrast, is purpose-built and tuned to push GPU-to-GPU traffic across an AI fabric at the highest possible bandwidth rather than to act as a self-contained computer.[6]
Mellanox Technologies was founded in May 1999 by former Intel and Galileo Technology engineers in Yokneam, Israel, and built its early business around InfiniBand silicon. The company shipped its first InfiniBand product line, InfiniBridge, around 2001, and went public on the NASDAQ in February 2007.[1] The ConnectX brand emerged in this period as a family of multi-protocol adapter ASICs supporting both InfiniBand and Ethernet through VPI.[1]
Over the following decade the line advanced roughly one generation every two to three years, with most generations doubling peak bandwidth. ConnectX-3, introduced in June 2011, was billed as the industry's first FDR 56 Gb/s InfiniBand and 10/40 Gigabit Ethernet adapter.[7] ConnectX-4 reached 100 Gb/s (EDR InfiniBand), with sampling beginning in early 2015.[8] ConnectX-5 stayed at the 100 Gb/s tier but added deeper offloads, and shipped in the 2017 timeframe.[9] ConnectX-6 then doubled bandwidth to 200 Gb/s.
On March 11, 2019, NVIDIA announced its intent to acquire Mellanox for roughly $6.9 billion, beating out other bidders. The deal closed on April 27, 2020 at approximately $7.0 billion after clearing U.S., European, and Chinese antitrust review.[2] Mellanox became the core of NVIDIA's networking division, and NVIDIA retired the Mellanox brand on new products, folding ConnectX, BlueField, the Quantum InfiniBand switches, and the Spectrum Ethernet switches into "NVIDIA Networking." The acquisition proved strategically decisive: as AI training shifted to clusters of thousands of GPUs, the interconnect became as important to overall performance as the GPUs themselves, and NVIDIA now owned both ends.[2]
The table below summarizes the recent ConnectX generations relevant to AI infrastructure. Speeds refer to the maximum aggregate per adapter; each generation supports a ladder of lower InfiniBand and Ethernet rates as well.
| Generation | Approx. availability | Max speed | InfiniBand | Host interface | Notable features |
|---|---|---|---|---|---|
| ConnectX-6 | 2020 | 200 Gb/s | HDR | PCIe Gen3 / Gen4 x16 | Sub-600 ns latency, ~200M messages/s, inline encryption (Dx variant) |
| ConnectX-7 | 2022 | 400 Gb/s | NDR | PCIe Gen5, up to x32 | In-network collective offload, ~330 to 370M messages/s, hardware root of trust |
| ConnectX-8 SuperNIC | 2024 to 2025 | 800 Gb/s | XDR | PCIe Gen6 x16 (64 GT/s), 48-lane integrated PCIe switch | RISC-V data-path accelerator, Spectrum-X congestion control, GB300 NVL72 deployment |
| ConnectX-9 SuperNIC | 2026 (announced) | 1.6 Tb/s | Not specified | Not specified | 200G-PAM4 SerDes; part of the Vera Rubin platform |
ConnectX-6 provides up to two ports of 200 Gb/s for InfiniBand (HDR) or Ethernet, with sub-600 nanosecond latency and around 200 million messages per second. It uses a PCIe Gen3 or Gen4 host interface at x16 and ships in many form factors including standard PCIe cards and Open Compute Project (OCP) 3.0 modules with QSFP56 connectors.[10] A widely deployed Ethernet-focused variant, ConnectX-6 Dx, added a hardware root of trust and inline IPsec and TLS encryption and was marketed as a secure SmartNIC.[9][11] ConnectX-6 was the networking workhorse of NVIDIA's Ampere-generation systems, and eight of them provided the cluster fabric in the DGX A100.
ConnectX-7 doubled peak bandwidth to 400 Gb/s, supporting NDR InfiniBand or 400 Gigabit Ethernet, and moved to a PCIe Gen5 host interface that can be configured with up to 32 lanes.[12] It sustains roughly 330 to 370 million messages per second and adds acceleration engines for MPI all-to-all and tag matching, a programmable datapath accelerator, advanced storage offloads, PTP timing accurate to about 16 nanoseconds, and secure boot with an on-chip hardware root of trust.[12]
ConnectX-7 is the network adapter of the Hopper generation. NVIDIA's DGX H100 system, announced in March 2022, integrates eight ConnectX-7 adapters running at 400 Gb/s (alongside BlueField-3 DPUs), doubling the cluster bandwidth of the previous DGX A100.[13] On the switch side, ConnectX-7 pairs with NVIDIA's Quantum-2 NDR InfiniBand switches and Spectrum-4 Ethernet switches, both of which use 400 Gb/s ports based on 100G-PAM4 signaling, to build the InfiniBand or Ethernet fabrics that connect H100 GPUs across a cluster.[14]
ConnectX-8 is the first member of the family marketed as a SuperNIC, a term NVIDIA uses for adapters tuned specifically for large-scale AI workloads. It was announced at NVIDIA's GTC conference on March 18, 2024 as part of the Quantum-X800 platform, where it pairs with the Quantum-X800 (Q3400) InfiniBand switch to deliver what NVIDIA billed as the industry's first end-to-end 800 Gb/s throughput. NVIDIA reported that, compared with the previous generation, the platform delivers roughly 5x the bandwidth capacity and 9x the in-network computing, reaching 14.4 teraflops, using the fourth-generation Scalable Hierarchical Aggregation and Reduction Protocol (SHARPv4).[15][16]
The adapter supports 800 Gb/s of networking, configurable as one port of XDR InfiniBand (four lanes at 200 Gb/s each) or as 400 Gb/s and 200 Gb/s Ethernet running over the Spectrum-X platform.[17][18] Its most distinctive architectural feature is the host interface: where a typical adapter exposes a single PCIe x16 connector, ConnectX-8 integrates 48 lanes of PCIe Gen6 (64 GT/s, backward compatible with Gen5) together with an on-board PCIe Gen6 switch. This consolidates GPU-to-GPU and GPU-to-NIC traffic into one device and removes the need for a separate, discrete PCIe switch chip on the server board.[18][19] NVIDIA states the integrated switch can roughly double peer-to-peer GPU bandwidth versus a Gen5 design and yield up to 2x higher NCCL all-to-all performance.[18] The adapter also carries a programmable data-path accelerator built on a RISC-V event processor for offloading custom networking functions, and it ships in single-cage OSFP and dual-port QSFP112 form factors.[19][20]
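The pairing of network speed and host interface across these generations can be sanity-checked with simple arithmetic. The sketch below estimates raw one-direction PCIe bandwidth from the per-lane signaling rate and line encoding, then asks whether a given slot can keep up with a given network rate. The helper names are hypothetical, and the model is deliberately rough: it ignores packet-header and flow-control overhead on every generation and the FLIT/FEC overhead of Gen6, so real usable throughput is lower.

```python
# Per-lane signaling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
    6: (64.0, 1.0),  # assumption: FLIT-mode overhead ignored in this sketch
}


def pcie_raw_gbps(gen: int, lanes: int) -> float:
    """Raw one-direction PCIe bandwidth in Gb/s, before protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[gen]
    return rate_gt * efficiency * lanes


def can_feed(gen: int, lanes: int, network_gbps: float) -> bool:
    """True if the raw PCIe bandwidth is at least the network line rate."""
    return pcie_raw_gbps(gen, lanes) >= network_gbps


if __name__ == "__main__":
    cases = [
        ("200 Gb/s on Gen3 x16", 3, 16, 200),
        ("200 Gb/s on Gen4 x16", 4, 16, 200),
        ("400 Gb/s on Gen5 x16", 5, 16, 400),
        ("800 Gb/s on Gen5 x16", 5, 16, 800),
        ("800 Gb/s on Gen6 x16", 6, 16, 800),
    ]
    for label, gen, lanes, net in cases:
        raw = pcie_raw_gbps(gen, lanes)
        print(f"{label}: raw {raw:.0f} Gb/s, sufficient={can_feed(gen, lanes, net)}")
```

Under these assumptions a Gen3 x16 slot (about 126 Gb/s raw) cannot carry a full 200 Gb/s port and a Gen5 x16 slot (about 504 Gb/s raw) cannot carry 800 Gb/s. That is consistent with each doubling of network speed arriving alongside a newer PCIe generation or a wider lane count.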
Describing ConnectX-8 as a Blackwell and GB200 part is not quite accurate. Although ConnectX-8 was unveiled alongside the Blackwell launch, its first production rack-scale deployment is in the GB300 NVL72 and HGX B300 systems of the Blackwell Ultra generation, where it serves as the scale-out NIC; NVIDIA describes the GB300 NVL72 as the first deployment of the PCIe Gen6 SuperNIC.[19][21] It is also used in RTX PRO Servers, providing 400 Gb/s of network bandwidth per GPU in a 2:1 GPU-to-NIC configuration.[18] Within these rack-scale systems, as in the GB200 NVL72, the in-rack GPU-to-GPU links are handled by the NVSwitch-based NVLink fabric, while ConnectX-8 (or BlueField-3) carries traffic between racks across the InfiniBand or Ethernet back end.
Looking ahead, NVIDIA has announced ConnectX-9 as the networking adapter of its next platform, Vera Rubin, which was launched at CES in January 2026 with production expected in the second half of 2026. ConnectX-9 is designed to deliver 1.6 terabits per second of network bandwidth per GPU by moving to 200G-PAM4 SerDes signaling, doubling ConnectX-8 again.[22][23] It is one of six chips in the Rubin platform, alongside the Vera CPU, the Rubin GPU, the NVLink 6 switch, the BlueField-4 DPU, and the Spectrum-6 Ethernet switch.[22]
The dual-protocol nature of ConnectX is central to its appeal. InfiniBand was designed from the start as a lossless, low-latency, RDMA-native fabric for HPC, with credit-based flow control and adaptive routing that make it well suited to the tightly coupled, all-to-all traffic of large training jobs. NVIDIA's Quantum InfiniBand switches and ConnectX/BlueField adapters also implement SHARP, which performs reduction operations (summing gradients, for example) inside the network switches themselves rather than at the endpoints, cutting the volume of data that must traverse the fabric during collective operations.[15]
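To make the traffic saving from in-network reduction concrete, the following Python sketch sums gradient vectors up a tree of hypothetical "switches" and compares what each endpoint must send against a bandwidth-optimal ring all-reduce performed at the endpoints. It is a toy model under stated assumptions (a uniform tree, element counts rather than bytes, no latency), not NVIDIA's SHARP implementation, and the function names are invented for illustration.

```python
from typing import List, Sequence, Tuple


def in_network_allreduce(vectors: Sequence[Sequence[float]],
                         radix: int = 2) -> Tuple[List[float], int]:
    """Sum vectors up a switch tree.

    Returns the reduced vector and the number of elements each endpoint sends.
    """
    if radix < 2:
        raise ValueError("radix must be at least 2")
    if not vectors:
        raise ValueError("need at least one vector")
    level = [list(v) for v in vectors]
    while len(level) > 1:
        # Each switch adds up to `radix` incoming vectors element-wise
        # and forwards a single vector to the next tier.
        level = [
            [sum(column) for column in zip(*level[i:i + radix])]
            for i in range(0, len(level), radix)
        ]
    result = level[0]
    return result, len(result)  # every endpoint sends its vector exactly once


def ring_allreduce_sent(n_endpoints: int, elements: int) -> float:
    """Elements each endpoint sends in a ring all-reduce: 2 * (N - 1) / N * size."""
    return 2 * (n_endpoints - 1) / n_endpoints * elements


if __name__ == "__main__":
    grads = [[1, 2], [3, 4], [5, 6], [7, 8]]
    total, sent_tree = in_network_allreduce(grads, radix=2)
    print(total, sent_tree, ring_allreduce_sent(len(grads), len(grads[0])))
```

In this model each endpoint injects its vector once regardless of cluster size, whereas the ring approach has each endpoint send close to twice the vector size as the number of endpoints grows.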
Ethernet, by contrast, is the universal language of data centers and clouds, and many operators prefer to run their AI back end on the same technology as the rest of their infrastructure. The historical weakness of Ethernet for this workload was packet loss and the resulting unpredictable tail latency. NVIDIA's answer is Spectrum-X, an Ethernet platform that combines Spectrum switches with ConnectX SuperNICs and uses adaptive routing and hardware congestion control to deliver near-lossless behavior and high effective bandwidth for RoCE traffic. Because the same ConnectX silicon can be configured for either fabric, customers can choose InfiniBand or Ethernet without changing adapter vendors, and NVIDIA captures the sale either way. The industry is also standardizing on lossless AI Ethernet through the Ultra Ethernet Consortium, which NVIDIA participates in.[16][18]
The performance of an AI cluster depends not just on raw link speed but on how directly data can move between GPUs on different servers. Three layered technologies make this possible on ConnectX.
RDMA provides the zero-copy, kernel-bypass transport. RoCE carries that transport over Ethernet (using a lossless or near-lossless fabric so that RDMA's assumptions hold). GPUDirect then removes the last bottleneck. GPUDirect RDMA, introduced with NVIDIA's Kepler-class GPUs and CUDA 5.0, exposes regions of GPU memory over the PCIe bus so that a ConnectX adapter can DMA data directly into or out of GPU memory, bypassing a staging copy through host CPU memory entirely.[5][24] The result is a path in which a gradient computed on a GPU in one server can land in the memory of a GPU in another server with no CPU involvement on either end, at sub-microsecond latency. A related capability, GPUDirect Storage, extends the same idea to NVMe storage so that data can stream from disk into GPU memory along a direct path. These technologies are the reason NVIDIA's networking and compute businesses reinforce each other so strongly: the value of a fast NIC is multiplied when it can talk straight to GPU memory.[24]
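The effect of removing the staging copy can be shown with a small path model. The Python sketch below describes the staged and direct paths as lists of hops, then counts how many hops touch host memory and how long a transfer would take if each hop completed before the next began. The `Hop` class, the function names, and every bandwidth passed in are hypothetical. Real hardware pipelines the hops, so the absolute times mean little; only the comparison between the two paths is the point.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hop:
    src: str
    dst: str
    gbytes_per_s: float  # assumed sustained bandwidth for this hop


def staged_path(pcie_gbs: float, wire_gbs: float) -> List[Hop]:
    """Conventional path: data bounces through host memory on both servers."""
    return [
        Hop("GPU memory", "host memory", pcie_gbs),
        Hop("host memory", "NIC", pcie_gbs),
        Hop("NIC", "remote NIC", wire_gbs),
        Hop("remote NIC", "remote host memory", pcie_gbs),
        Hop("remote host memory", "remote GPU memory", pcie_gbs),
    ]


def direct_path(pcie_gbs: float, wire_gbs: float) -> List[Hop]:
    """GPUDirect-style path: the NIC reads and writes GPU memory over PCIe."""
    return [
        Hop("GPU memory", "NIC", pcie_gbs),
        Hop("NIC", "remote NIC", wire_gbs),
        Hop("remote NIC", "remote GPU memory", pcie_gbs),
    ]


def host_memory_touches(hops: List[Hop]) -> int:
    """Number of hops that read from or write to host memory."""
    return sum(1 for h in hops if "host memory" in h.src or "host memory" in h.dst)


def serial_time_us(size_bytes: int, hops: List[Hop]) -> float:
    """Store-and-forward estimate in microseconds (no pipelining, no fixed latency)."""
    return sum(size_bytes / (h.gbytes_per_s * 1e9) for h in hops) * 1e6


if __name__ == "__main__":
    # Placeholder bandwidths chosen only to exercise the model.
    for name, path in (("staged", staged_path(50.0, 50.0)),
                       ("direct", direct_path(50.0, 50.0))):
        print(name, host_memory_touches(path), round(serial_time_us(1 << 20, path), 2))
```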
In a modern AI data center, GPUs are organized into two tiers of network. Inside a server or a rack, GPUs are connected by NVLink and NVSwitch at very high bandwidth. To scale beyond a single rack to the hundreds or thousands of GPUs needed to train a frontier model, those islands must be stitched together by a scale-out fabric, and this is where ConnectX lives. Each GPU is typically paired with one or more ConnectX adapters (or BlueField DPUs) that connect it to the InfiniBand or Ethernet network, so that collective operations such as all-reduce can span the entire cluster. The bandwidth and latency of this fabric directly bound how efficiently a large job can be parallelized, which is why each ConnectX generation has been timed to match a GPU generation: ConnectX-6 with Ampere, ConnectX-7 with Hopper, ConnectX-8 with Blackwell, and ConnectX-9 with Rubin.[13][18][22]
The strategic significance of the line is hard to overstate. By acquiring Mellanox and continuing to advance ConnectX, NVIDIA turned networking from a commodity component into a tightly co-designed part of its AI systems, sold together with GPUs, NVLink, Quantum InfiniBand, Spectrum-X Ethernet, and the Quantum-X Photonics co-packaged optics that are beginning to replace pluggable transceivers at the switch. The adapter that began as a Mellanox HPC product is now a core pillar of the most valuable computing platform in the AI era.[2][16]