NVIDIA BlueField
Last reviewed
Jun 3, 2026
Sources
15 citations
Review status
Source-backed
Revision
v1 · 1,952 words
NVIDIA BlueField is a family of data processing units (DPUs) designed and sold by Nvidia. A DPU is a programmable system-on-chip that combines Arm CPU cores, a high-speed network interface, and a set of hardware accelerators on a single device, allowing it to take over infrastructure tasks (networking, storage, and security) that would otherwise consume cycles on the host server's main processor. By moving this "data center tax" off the host CPU and onto a dedicated chip on the network adapter, BlueField frees the server's general-purpose cores to run customer applications and, in modern AI clusters, to keep GPUs fed with data. The line originated with the networking company Mellanox, which Nvidia acquired in 2020, and has become a core building block of Nvidia's data-center networking portfolio alongside ConnectX network adapters and the Spectrum-X Ethernet platform. [1][2]
A conventional server runs its operating system, virtualization layer, network virtual switch, storage stack, and security agents on the same CPU that runs the user's workloads. As network speeds climbed past 100 and then 400 gigabits per second, the share of CPU cores consumed by simply moving and protecting packets grew large enough that hyperscalers began designing dedicated silicon to absorb it. Nvidia describes the DPU as the third pillar of the data center alongside the CPU and the GPU. [2][3]
BlueField integrates four elements that distinguish it from an ordinary network interface card (NIC):
Because the DPU runs its own software and sits between the host and the network, it can enforce a "zero-trust" security boundary: the infrastructure control plane runs on the DPU and is isolated from any compromise of the tenant operating system on the host. This is the foundation of the multi-tenant cloud and software-defined networking use cases that drove BlueField's early adoption. [2][4]
BlueField began at Mellanox Technologies, an Israeli-American supplier of high-performance interconnects. Mellanox combined a coherent mesh of Arm Cortex-A72 cores with its ConnectX network engine and a PCIe switch on a single chip to create the first BlueField devices, which it marketed as "smart NICs" and I/O processing units. Mellanox publicly demonstrated the second-generation design, BlueField-2, at VMworld in 2019. [5][6]
Nvidia announced its intent to acquire Mellanox for approximately 6.9 billion dollars on March 11, 2019, and completed the deal on April 27, 2020, for a transaction value of roughly 7 billion dollars. The acquisition brought the BlueField, ConnectX, and InfiniBand product lines into Nvidia and is the reason BlueField is sometimes still referred to by its original "Mellanox BlueField" branding on older hardware. After the close, Nvidia rebranded the chips as data processing units and folded them into a multi-generation roadmap presented at its GTC conferences. [1][7]
Nvidia has released or announced three generations of BlueField under its own name (BlueField-2 through BlueField-4), each doubling peak network throughput over its predecessor while adding Arm compute and accelerators. The table below summarizes the published specifications of each generation.
| Generation | First announced | General availability | Arm cores | Core type | Network throughput | Memory | Process / transistors |
|---|---|---|---|---|---|---|---|
| BlueField-2 | October 2020 | 2021 | 8 | Arm Cortex-A72 | 200 Gb/s | 16 to 32 GB DDR4 | n/a |
| BlueField-2X | 2020 | 2021 | 8 | Arm Cortex-A72 (+ Ampere GPU) | 200 Gb/s | 16 to 32 GB DDR4 | n/a |
| BlueField-3 | April 2021 (GTC) | March 2023 (GTC) | 16 | Arm Cortex-A78 | 400 Gb/s | 32 GB DDR5 (5600 MT/s) | 7 nm, ~22 billion transistors |
| BlueField-4 | 2025 (Vera Rubin roadmap) | 2026 (with Vera Rubin) | 64 | Grace CPU (Arm Neoverse V2) | 800 Gb/s | 128 GB LPDDR5X (~250 GB/s) | n/a |
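The generation-over-generation scaling in the table above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `Generation` class and `step_ratios` helper are hypothetical names, and the figures are copied from the table rather than from any Nvidia tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generation:
    name: str
    arm_cores: int
    network_gbps: int  # peak network throughput in Gb/s


# Values taken from the specification table; BlueField-2X is omitted
# because it shares the base BlueField-2 figures.
GENERATIONS = [
    Generation("BlueField-2", 8, 200),
    Generation("BlueField-3", 16, 400),
    Generation("BlueField-4", 64, 800),
]


def step_ratios(gens):
    """Return (newer name, throughput ratio, core-count ratio) per consecutive pair."""
    return [
        (
            newer.name,
            newer.network_gbps / older.network_gbps,
            newer.arm_cores / older.arm_cores,
        )
        for older, newer in zip(gens, gens[1:])
    ]


if __name__ == "__main__":
    for name, throughput, cores in step_ratios(GENERATIONS):
        print(f"{name}: {throughput:.0f}x throughput, {cores:.0f}x cores")
```

Core-count ratios are not a measure of compute, because the core type changes in every generation; they are shown only to contrast the steady doubling of throughput with the larger jump in core count at BlueField-4.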
BlueField-2 was the first DPU released under the Nvidia brand. It pairs eight 64-bit Armv8 Cortex-A72 cores with a ConnectX-6 Dx network engine, supporting two ports of 25, 50, or 100 Gb/s, or a single port at 200 Gb/s, over either Ethernet or InfiniBand. It includes hardware accelerators for encryption, storage offload (including elastic block storage and NVMe over Fabrics), and RDMA/GPUDirect, and it carries onboard DDR4 memory. Nvidia positioned it to offload virtualization, networking, and security from data-center hosts; a variant called BlueField-2X added an Nvidia Ampere GPU on the same card to accelerate AI-based security and telemetry, though it was less widely deployed than the base part. [6][8]
BlueField-3, unveiled at GTC in April 2021, was Nvidia's first 400 Gb/s DPU. It integrates 16 Arm Cortex-A78 cores and about 22 billion transistors built on a 7-nanometer process, supports Ethernet at up to 400 Gb/s and InfiniBand up to NDR, and connects to the host over a PCIe Gen5 x16 link. Production parts carry 32 GB of on-board DDR5 ECC memory clocked at 5600 MT/s. Relative to BlueField-2, Nvidia cited roughly four times the compute, up to four times faster cryptographic acceleration, two times faster storage processing, and four times the memory bandwidth. [9][10]
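The pairing of a 400 Gb/s network port with a PCIe Gen5 x16 host link can be motivated by a back-of-the-envelope bandwidth check. The sketch below uses the standard per-lane signalling rates and 128b/130b line encoding of PCIe Gen3 through Gen5; it ignores packet-level protocol overhead, so real usable bandwidth is somewhat lower. The function names are hypothetical.

```python
# Raw per-lane signalling rates in GT/s for PCIe generations using 128b/130b encoding.
PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_gbps(gen: int, lanes: int) -> float:
    """Per-direction bit rate in Gb/s after line encoding (packet overhead not modeled)."""
    return PCIE_GT_PER_LANE[gen] * lanes * ENCODING_EFFICIENCY


def headroom(gen: int, lanes: int, network_gbps: float) -> float:
    """Ratio of host-link bandwidth to network line rate; below 1.0 the host link is the bottleneck."""
    return pcie_gbps(gen, lanes) / network_gbps


if __name__ == "__main__":
    for gen in (4, 5):
        link = pcie_gbps(gen, 16)
        ratio = headroom(gen, 16, 400)
        print(f"PCIe Gen{gen} x16: {link:.1f} Gb/s per direction, {ratio:.2f}x of 400 Gb/s")
```

Under these assumptions a Gen4 x16 link falls well short of 400 Gb/s, while Gen5 x16 clears it with some margin, which is consistent with the host interface the article describes for BlueField-3.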
Although announced in 2021, BlueField-3 reached general availability in March 2023, when Nvidia declared it in volume production at GTC. Early adopters named by Nvidia included Oracle Cloud Infrastructure, CoreWeave, Microsoft Azure, Baidu, JD.com, and Tencent, reflecting the DPU's primary role in cloud multi-tenancy. BlueField-3 is also sold in a network-accelerator configuration, the BlueField-3 SuperNIC, which is a 400 Gb/s component of the Spectrum-X Ethernet platform. [10][11]
BlueField-4 is the current generation, designed for the Vera Rubin era of AI infrastructure. Nvidia had sketched a BlueField-4 with 64 billion transistors and 800 Gb/s on its DPU roadmap as early as 2021, but the shipping product is a substantially different and more ambitious design than that early slide implied. Rather than the cluster of Arm Cortex cores used in earlier generations, BlueField-4 is a dual-die package that pairs a 64-core Nvidia Grace CPU (built on Arm Neoverse V2 cores) with an integrated ConnectX-9 networking engine. It delivers up to 800 Gb/s of Ethernet or InfiniBand connectivity, carries 128 GB of LPDDR5X memory at roughly 250 GB/s, and connects over PCIe Gen6. Nvidia states it provides about six times the compute of BlueField-3 and can support AI factories up to four times larger. [3][12]
BlueField-4 was tied to the Vera Rubin platform when that roadmap was presented at GTC in March 2025, detailed further at GTC in Washington, D.C., in October 2025 and at CES in January 2026, and is slated for early availability alongside Vera Rubin systems in 2026. It is one of the six co-designed chips of the Rubin platform, together with the Vera CPU, the Rubin GPU, NVLink, the ConnectX-9 SuperNIC, and Spectrum-class Ethernet switches. A storage-focused variant, BlueField-4 STX, was launched at GTC in March 2026 as a modular reference architecture for AI-native storage. In Rubin-generation systems, BlueField-4 underpins a new Inference Context Memory Storage platform: it runs the key-value (KV) cache input/output plane and terminates NVMe-over-Fabrics, object, and RDMA storage protocols, offloading the long-context memory of large language model inference so that GPUs spend their time computing rather than waiting on storage. BlueField-4 also introduces the Advanced Secure Trusted Resource Architecture (ASTRA), which provides a single trusted control point with isolated control, data, and management planes for provisioning and securing large AI environments. [3][13]
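To give a sense of why long-context KV cache is worth moving off the GPU, the sketch below estimates the cache size for a single sequence using the common sizing formula (keys plus values, per layer, per KV head). The model shape is a hypothetical example chosen for round numbers, not a description of any specific model or of how Nvidia's platform sizes or moves the cache, and the transfer time is an ideal figure that ignores protocol overhead.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bytes_per_value=2):
    """Bytes of cached keys and values for one sequence (default: 16-bit values)."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # 2 = keys + values
    return per_token * tokens


def transfer_seconds(num_bytes, link_gbps):
    """Ideal time to move num_bytes over a link_gbps link, with no protocol overhead."""
    return num_bytes * 8 / (link_gbps * 1e9)


# Hypothetical model shape and context length, for illustration only.
HYPOTHETICAL_MODEL = dict(layers=80, kv_heads=8, head_dim=128)
CONTEXT_TOKENS = 128 * 1024

if __name__ == "__main__":
    size = kv_cache_bytes(tokens=CONTEXT_TOKENS, **HYPOTHETICAL_MODEL)
    print(f"KV cache for one sequence: {size / 2**30:.1f} GiB")
    print(f"Ideal transfer at 800 Gb/s: {transfer_seconds(size, 800):.2f} s")
```

With these assumed parameters a single 128K-token sequence occupies tens of gibibytes, so many concurrent long-context sessions quickly exceed GPU memory and make a networked storage tier for the cache attractive.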
BlueField hardware is programmed through DOCA, Nvidia's software framework for the DPU and SuperNIC. DOCA is to BlueField what CUDA is to Nvidia's GPUs: a developer platform that exposes the chip's accelerators through a consistent set of APIs so that infrastructure software can be written once and run across DPU generations. It consists of two parts. The first is a software development kit (SDK) built on open, industry-standard drivers and libraries: the Data Plane Development Kit (DPDK) and the P4 language for networking and security, and the Storage Performance Development Kit (SPDK) for storage. The second is a runtime, shipped with every BlueField, for provisioning, deploying, and orchestrating containerized services across hundreds or thousands of DPUs in a data center. Using DOCA, developers build cloud-native, DPU-accelerated services for software-defined networking, software-defined storage, telemetry, and zero-trust security. With BlueField-4, Nvidia emphasizes native support for DOCA microservices, which package infrastructure functions as containers that run directly on the DPU. [4][14]
BlueField's offload model serves several distinct workloads:
As AI training and inference moved to clusters of tens of thousands of GPUs, the network became as important as the compute, and the DPU became the device that keeps those networks orderly and secure at scale. BlueField-3 SuperNICs and ConnectX adapters provide the endpoint intelligence for the Spectrum-X Ethernet platform, which Nvidia markets as a fabric purpose-built for AI; in that role the DPU handles adaptive routing, congestion control, and performance isolation that let Ethernet behave more like a lossless AI interconnect. [11][15]
With BlueField-4 and the Vera Rubin platform, Nvidia repositions the DPU from a networking offload engine into what it calls the "operating system of the AI factory." By embedding a full 64-core Grace CPU on the device and tying it to the inference KV-cache and storage path, BlueField-4 is meant to run the infrastructure of an entire AI data center (security, multi-tenancy, storage, and the memory plane of long-context inference) as a self-contained, accelerated layer beneath the GPUs. This trajectory, from a Mellanox smart NIC into a central pillar of Nvidia's data-center strategy, reflects the broader industry shift toward disaggregated, software-defined infrastructure in which dedicated silicon, rather than the host CPU, runs the data center. [3][13]