AMD Helios
Last reviewed
May 31, 2026
Sources
12 citations
Review status
Source-backed
Revision
v1 · 2,181 words
AMD Helios is a rack-scale artificial intelligence system from AMD, built around its Instinct MI400 series GPUs and designed so that a whole rack of accelerators behaves like one large machine rather than a row of separate servers. AMD first previewed Helios at its Advancing AI event on June 12, 2025, and showed a physical reference design at the Open Compute Project Global Summit on October 14, 2025, with volume availability targeted for 2026. [1][2][3] The design pairs the MI400 GPUs with sixth-generation EPYC "Venice" CPUs and Pensando "Vulcano" AI network cards, and it connects them using open industry standards rather than a single vendor's proprietary fabric. [1][2] Helios is AMD's most direct answer yet to NVIDIA's rack-scale machines, and it marks the point where AMD stopped selling AI mostly as chips and boards and started selling it as a full rack you wheel onto a data center floor.
For most of the last decade, an AI accelerator was a card or a board. You bought GPUs, put eight of them in a server, and wired many servers together with whatever network you had. That model started to strain as frontier models grew. Training and serving the largest models now spreads a single job across dozens or hundreds of GPUs that have to act like one tightly coupled system, and the limit is often the links between chips rather than the chips themselves. [4][5]
The industry's response is the rack-scale system, sometimes marketed as an "AI factory." Instead of treating the server as the unit of sale, vendors treat the whole rack as the product. Every GPU in the rack shares a fast internal fabric, the CPUs and network cards are co-designed with the accelerators, and power and liquid cooling are engineered for the rack as a whole. NVIDIA moved first here with its GB200 NVL72 and the later GB300 NVL72, which link 72 GPUs over NVLink so they look like a single accelerator to software. Helios is AMD's version of that idea, and the company has been explicit that it now competes at the rack level, not just chip against chip. [4][6]
This shift changes what buyers compare. A faster GPU helps, but a model trainer cares about how much memory and compute a rack can pool, how fast the GPUs talk to each other inside the rack, and how cleanly many racks scale out into a cluster. Helios is built to be judged on those terms.
Helios brings together three AMD product lines that were each refreshed for this generation.
The accelerators are the MI400 series, and the part aimed at the flagship Helios rack is the MI450, sold in variants such as the MI455X. AMD says each MI450 series GPU carries up to 432 GB of HBM4 memory with 19.6 TB/s of memory bandwidth, and that it delivers up to 40 PFLOPS of FP4 compute and 20 PFLOPS of FP8 compute. [1][2][7] Those memory figures are large by current standards, and AMD has leaned on capacity as a selling point because more memory per GPU lets a given model fit on fewer GPUs.
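The capacity argument can be made concrete with a rough calculation. The sketch below counts how many GPUs are needed just to hold a model's weights, using AMD's stated maximum of 432 GB per GPU. The model sizes, the bytes-per-parameter values, and the helper name `min_gpus_for_weights` are illustrative assumptions, not AMD figures. The estimate also ignores KV cache, activations, and optimizer state, all of which add to the real footprint.

```python
import math

HBM_PER_GPU_GB = 432  # AMD's stated maximum for the MI450 series (vendor target)


def min_gpus_for_weights(params_billions: float, bytes_per_param: float,
                         hbm_gb: float = HBM_PER_GPU_GB) -> int:
    """Lower bound on GPUs needed to hold model weights alone.

    Hypothetical helper for illustration: it ignores KV cache, activations,
    optimizer state, and any memory the runtime reserves.
    """
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB = GB
    weights_gb = params_billions * bytes_per_param
    return max(1, math.ceil(weights_gb / hbm_gb))


# Illustrative model sizes and formats (assumptions, not from AMD).
for params_b in (70, 405, 1000):
    for fmt, nbytes in (("FP8", 1.0), ("FP4", 0.5)):
        n = min_gpus_for_weights(params_b, nbytes)
        print(f"{params_b}B params @ {fmt}: at least {n} GPU(s)")
```

Under these assumptions, a hypothetical 1-trillion-parameter model stored at one byte per parameter needs at least three 432 GB GPUs for its weights, and at least two at half a byte per parameter. Passing a smaller `hbm_gb` shows how the count rises as per-GPU capacity falls.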
The host processors are EPYC "Venice" CPUs, AMD's sixth-generation server chips based on the Zen 6 core, with PCIe Gen6 and high core counts. They handle the parts of a workload that do not suit the GPU, feed data to the accelerators, and run the rack's control software. [1][8] Venice is itself a 2026 product, so Helios lines up the GPU and CPU roadmaps in the same window.
The network cards are the Pensando "Vulcano" AI NICs, rated at 800 Gb/s, which connect each node out to the rest of the cluster. [1][4] Pensando is the data-center networking company AMD acquired in 2022, and Vulcano carries traffic between racks rather than inside a single rack.
Put together at rack scale, AMD's figures for a Helios rack of 72 MI450 GPUs are 31 TB of total HBM4 memory, 1.4 PB/s of aggregate memory bandwidth, 2.9 EFLOPS of FP4 compute, and 1.4 EFLOPS of FP8 compute, with 260 TB/s of scale-up bandwidth among the GPUs and 43 TB/s of Ethernet scale-out bandwidth. [3][4] All of those are vendor numbers tied to hardware that was not shipping when AMD announced it, so they are targets rather than measured results.
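The rack totals follow directly from the per-GPU figures, which is a useful consistency check on the quoted numbers. The sketch below multiplies each per-GPU value by 72 and converts to the next unit prefix, assuming decimal prefixes (1 TB = 1,000 GB). It also divides the quoted fabric bandwidths by 72 to show the implied per-GPU share. The function and variable names are illustrative, and every input is an AMD target rather than a measurement.

```python
GPUS_PER_RACK = 72

# Per-GPU figures as quoted by AMD (vendor targets, not measurements).
PER_GPU = {
    "hbm4_gb": 432,        # GB of HBM4
    "mem_bw_tb_s": 19.6,   # TB/s of memory bandwidth
    "fp4_pflops": 40,      # PFLOPS at FP4
    "fp8_pflops": 20,      # PFLOPS at FP8
}

# Rack-level fabric figures as quoted by AMD.
SCALE_UP_TB_S = 260
SCALE_OUT_TB_S = 43


def rack_aggregates(per_gpu: dict, gpus: int = GPUS_PER_RACK) -> dict:
    """Multiply per-GPU figures by GPU count and step up one decimal prefix."""
    return {
        "hbm4_tb": per_gpu["hbm4_gb"] * gpus / 1000,
        "mem_bw_pb_s": per_gpu["mem_bw_tb_s"] * gpus / 1000,
        "fp4_eflops": per_gpu["fp4_pflops"] * gpus / 1000,
        "fp8_eflops": per_gpu["fp8_pflops"] * gpus / 1000,
    }


rack = rack_aggregates(PER_GPU)
for name, value in rack.items():
    print(f"{name}: {value:.2f}")

# Implied per-GPU share of the quoted rack fabric bandwidth.
print(f"scale-up per GPU:  {SCALE_UP_TB_S / GPUS_PER_RACK:.2f} TB/s")
print(f"scale-out per GPU: {SCALE_OUT_TB_S / GPUS_PER_RACK:.2f} TB/s")
```

The products come out to about 31.1 TB, 1.41 PB/s, 2.88 EFLOPS, and 1.44 EFLOPS, which agree with AMD's rounded rack figures. The fabric numbers imply roughly 3.6 TB/s of scale-up and 0.6 TB/s of scale-out bandwidth per GPU.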
The sharpest difference between Helios and NVIDIA's racks is not the silicon. It is the wiring philosophy. NVIDIA's NVL72 systems use NVLink and NVSwitch, interconnects that NVIDIA controls end to end. That gives NVIDIA a tightly integrated product, and it also gives NVIDIA leverage, because a customer buying into NVLink is buying into one supplier for the fabric. [4][6]
AMD took the opposite route and built Helios on open standards across two different jobs.
For scale-up, meaning the fast links among GPUs inside a rack, AMD is backing UALink, the Ultra Accelerator Link standard developed by a consortium that includes AMD, Broadcom, and others as an open alternative to NVLink. The aim is that any vendor can build UALink-capable accelerators and switches that interoperate. [1][4] In practice the first Helios generation runs this scale-up traffic as UALink over Ethernet, layering AMD's own Infinity Fabric protocol on top of Ethernet while native UALink hardware matures. [4]
For scale-out, meaning the network that joins many racks into a cluster, AMD uses Ultra Ethernet, an effort under the Ultra Ethernet Consortium to tune standard Ethernet for AI traffic. The Pensando Vulcano NICs are the on-ramp to that fabric. [1][3] Using Ethernet here lets operators reuse familiar tooling and a broad supplier base instead of a proprietary network.
The rack itself is the most pointed part of the strategy, and it is worth being precise about who did what. The chassis follows the Open Rack Wide specification, a double-wide design that Meta introduced and contributed to the Open Compute Project for the power, cooling, and serviceability needs of next-generation AI systems. AMD did not contribute the rack standard. It built Helios on Meta's open design and aligned it with other open compute standards including OCP DC-MHS, UALink, and Ultra Ethernet. [3][9][10] The rack adds quick-disconnect liquid cooling, the double-wide layout for easier servicing, and standards-based Ethernet for multi-path resiliency. [3][12] The point of all this is to let other vendors and cloud operators build and service compatible systems rather than depending on one supplier for every part. As AMD data-center chief Forrest Norrod put it, "Open collaboration is key to scaling AI efficiently. With 'Helios,' we're turning open standards into real, deployable systems." [3]
AMD framed Helios against two NVIDIA generations. The near-term target is the GB200 and GB300 NVL72, which also place 72 GPUs in a rack over NVLink. The forward-looking comparison is NVIDIA's Vera Rubin platform, expected in a similar 2026 window and described by NVIDIA in larger rack configurations such as an NVL144. [4][6] AMD said a Helios rack offers 50% more memory capacity than NVIDIA's Vera Rubin system, and it claimed up to 36 times higher performance than its own previous generation. [3][6]
Those comparisons deserve caution. They come from AMD, they were made before either company's 2026 systems were on the market, and rack-level numbers depend heavily on how each vendor counts memory, which number format is quoted, and what the real software stack achieves. Independent benchmarks of shipping Helios racks against shipping NVIDIA racks did not exist at announcement, so the honest summary is that AMD claimed parity or better on paper and will have to prove it in deployment.
| Item | AMD Helios | Notes |
|---|---|---|
| Status | Previewed June 2025, reference design shown Oct 2025, targeted 2026 | Target, not shipping at announcement |
| GPUs per rack | 72 (Instinct MI450, MI400 series) | AMD figure |
| HBM4 per GPU | Up to 432 GB | AMD spec, target |
| Memory bandwidth per GPU | 19.6 TB/s | AMD spec, target |
| FP4 compute per GPU | Up to 40 PFLOPS | AMD spec, target |
| FP8 compute per GPU | Up to 20 PFLOPS | AMD spec, target |
| Aggregate HBM4 per rack | 31 TB | AMD figure, target |
| Aggregate memory bandwidth | 1.4 PB/s | AMD figure, target |
| Aggregate FP4 compute | 2.9 EFLOPS | AMD figure, target |
| Aggregate FP8 compute | 1.4 EFLOPS | AMD figure, target |
| Scale-up bandwidth | 260 TB/s | AMD figure, target |
| Scale-out bandwidth | 43 TB/s Ethernet | AMD figure, target |
| CPU | EPYC "Venice" (Zen 6, PCIe Gen6) | 2026 product |
| Scale-out NIC | Pensando "Vulcano", 800 Gb/s | AMD figure |
| Scale-up interconnect | UALink, run as UALink over Ethernet initially | Open standard |
| Scale-out interconnect | Ultra Ethernet | Open standard |
| Rack design | OCP Open Rack Wide, double-wide, liquid cooled | Meta contributed ORW to OCP |
Helios is the centerpiece of AMD's attempt to win real share in data-center AI, a market NVIDIA has dominated. Selling racks rather than chips changes AMD's position in two ways. It raises the value of each deal, since a rack bundles GPUs, CPUs, NICs, and integration work. It also makes AMD a credible single-vendor option for an operator that wants a turnkey AI cluster, which until now usually meant going to NVIDIA. [4][6]
The early commercial signals were strong. On October 6, 2025, AMD and OpenAI announced a partnership for OpenAI to deploy up to 6 gigawatts of AMD Instinct GPUs over several years, starting with 1 gigawatt of MI450 in the second half of 2026, alongside a warrant that could give OpenAI up to 160 million AMD shares. [11] About a week later, Oracle said it would deploy 50,000 AMD MI450 GPUs starting in the third quarter of 2026, a build that AMD watchers pegged at roughly 700 Helios racks and around 200 megawatts of power. [4] Commitments at that scale, from buyers who also lean heavily on NVIDIA, suggest that large customers want a second credible supplier and view Helios as one. The open-standards approach reinforces that, because UALink, Ultra Ethernet, and a Meta-derived OCP rack lower the cost of running AMD gear next to everything else in the building.
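The rack and power estimates for the Oracle build can be checked with simple division. The sketch below assumes every rack is a full 72-GPU Helios rack and takes the roughly 200 megawatt figure at face value, even though it is a third-party estimate rather than an AMD or Oracle specification. The helper name `deployment_estimate` is hypothetical.

```python
import math

GPUS_PER_RACK = 72  # Helios rack size per AMD


def deployment_estimate(total_gpus: int, total_power_mw: float,
                        gpus_per_rack: int = GPUS_PER_RACK) -> dict:
    """Rough rack count and implied power per rack and per GPU.

    Hypothetical helper: assumes full racks and spreads the quoted power
    evenly, so the per-GPU figure includes CPUs, NICs, and cooling overhead.
    """
    racks = math.ceil(total_gpus / gpus_per_rack)
    total_kw = total_power_mw * 1000
    return {
        "racks": racks,
        "kw_per_rack": total_kw / racks,
        "kw_per_gpu": total_kw / total_gpus,
    }


# Figures quoted in the text: 50,000 GPUs, about 200 MW (estimate).
oracle = deployment_estimate(total_gpus=50_000, total_power_mw=200)
print(f"racks: {oracle['racks']}")
print(f"kW per rack: {oracle['kw_per_rack']:.0f}")
print(f"kW per GPU (all-in): {oracle['kw_per_gpu']:.1f}")
```

This gives 695 racks, consistent with the "roughly 700" estimate. If the 200 megawatt figure holds, it implies about 288 kW per rack and about 4 kW per GPU once shared rack overhead is included. These derived power numbers are back-of-envelope values, not published specifications.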
There are real limits to keep in mind. As of the 2025 announcements, Helios was a roadmap and a reference design, not a product anyone could buy, and the headline numbers are AMD's own projections for unshipped hardware. The MI450 GPUs, EPYC Venice CPUs, and a production UALink fabric all have to arrive on schedule and work together at full rack scale. AMD's software stack, ROCm, has historically trailed NVIDIA's CUDA in maturity and breadth, and rack-scale performance depends as much on that software as on the silicon. The open interconnects are also still early, so the interoperability promise is partly aspirational until several vendors ship compatible parts. Helios is a clear statement of direction for AMD's AI data center strategy, and its real standing against NVIDIA will be settled by 2026 deployments rather than by launch-day slides.