Ultra Ethernet

Updated Jun 24, 2026

Ultra Ethernet is an open networking specification that reworks Ethernet into a high performance fabric for large AI training clusters and high performance computing, giving operators an interoperable, multi vendor alternative to proprietary fabrics like InfiniBand. It is developed by the Ultra Ethernet Consortium (UEC), a Joint Development Foundation project hosted by the Linux Foundation, and its central piece is a new transport protocol called Ultra Ethernet Transport (UET). The consortium published version 1.0 of the specification on June 11, 2025, a 562 page stack of documents designed to scale a single fabric toward a million endpoints. ^[1]^[2]^[16]

The goal is easy to state and hard to engineer. Take the cheap, ubiquitous, multi vendor Ethernet that already wires up most data centers, and make it good enough to move the enormous, bursty, synchronized traffic that modern AI training generates, so operators do not have to reach for a proprietary fabric to get top performance. ^[3]^[4]

"As Chair of the Ultra Ethernet Consortium, I'm proud to announce the release of the UEC 1.0 specification, a major milestone in our mission to redefine Ethernet for the age of AI and high-performance computing," said Dr. J Metz, Chair of the UEC Steering Committee, at the launch. "This standard is the result of unprecedented collaboration across the industry, and it delivers the low-latency, high-bandwidth, and intelligent transport needed for the most demanding workloads of today and tomorrow." ^[1]

Why did standard Ethernet need work for AI?

Training a large model spreads the computation across thousands or tens of thousands of accelerators that have to exchange gradients and activations constantly. The network sits on the critical path. When a collective operation such as an all reduce runs, every participating GPU waits for the slowest message, so a single congested link or one dropped packet can stall the whole step. AI traffic also looks nothing like ordinary data center traffic. It arrives in tight, synchronized bursts from many senders at once, and it uses a small number of very large flows rather than many small ones, so those few flows can easily pile onto the same path and create hotspots. ^[4]^[5]

The usual way to run RDMA over Ethernet is RoCE, short for RDMA over Converged Ethernet. RoCE works, and hyperscalers have run very large clusters on it, but it carries assumptions that fight against AI scale. Classic RoCE wants a lossless network, which it gets through Priority Flow Control, and it expects packets within a flow to arrive in order. Lossless behavior is fragile at scale because a pause frame can spread backward through the fabric and cause congestion to bloom in places far from the original problem. The in order requirement is the bigger limiter. Because each flow must stay on one path to preserve ordering, the standard approach cannot easily spread a single large flow across the many equal cost links that a Clos fabric provides, so expensive bandwidth sits idle while one path saturates. The common workaround, hashing flows onto paths with ECMP, falls apart when there are only a few elephant flows, because two of them can hash to the same link and there is nothing to rebalance them. Tuning RoCE for a giant training cluster has become specialist work, and the result is often vendor specific. ^[5]^[6]
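The collision risk with a handful of elephant flows can be put in numbers. What follows is a minimal sketch, assuming an idealized ECMP hash that places each flow independently and uniformly on one of the equal cost links. The function name and the 16 link example are illustrative and do not come from any of the cited sources.

```python
from math import prod


def ecmp_collision_probability(flows: int, links: int) -> float:
    """Probability that at least two of `flows` land on the same link.

    Assumption: each flow is hashed independently and uniformly onto one
    of `links` equal cost paths (an idealized model of ECMP hashing).
    """
    if flows > links:
        return 1.0  # pigeonhole: some link must carry two flows
    # Probability that every flow picks a link nobody else has picked.
    no_collision = prod((links - i) / links for i in range(flows))
    return 1.0 - no_collision


if __name__ == "__main__":
    for flows in (2, 4, 8):
        p = ecmp_collision_probability(flows, links=16)
        print(f"{flows} elephant flows on 16 links: P(collision) = {p:.2f}")
```

Under this model, 8 large flows hashed onto 16 links have roughly an 88 percent chance that at least two of them share a link, and static hashing offers no way to move them apart afterward.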

Ultra Ethernet keeps the parts of Ethernet that make it attractive, the open standards, the broad supplier base, the existing optics and switch silicon, and replaces the transport behavior that does not suit AI. It is built to tolerate some loss rather than demand a perfectly lossless fabric, to use every available path at once, and to hand data to the application even when packets show up out of order. A stated design tenet is that no existing Ethernet infrastructure has to be ripped out to adopt it. ^[3]^[4]

Who is in the Ultra Ethernet Consortium?

The Ultra Ethernet Consortium was announced on July 19, 2023, as a Joint Development Foundation project hosted by the Linux Foundation. The nine founding members were AMD, Arista, Broadcom, Cisco, Eviden (an Atos business), Hewlett Packard Enterprise, Intel, Meta, and Microsoft, a lineup that pairs the companies building the chips and switches with the companies running some of the largest AI fleets in the world. The founders seeded a set of working groups covering the physical layer, the link layer, the transport layer, and the software layer. ^[7]^[8]

Membership grew fast. The consortium began accepting new members in November 2023, and in the four months that followed it grew 450 percent, reaching 55 members by March 2024 with 715 individual experts active across eight working groups. ^[17] It welcomed 40 more organizations through 2024 to reach 97 members by that August. ^[18] By the end of 2024 it counted more than 100 member companies and over 1,500 participants, and it has been described as one of the fastest growing efforts in the data infrastructure ecosystem. ^[4]^[20] It added 27 more members in 2025. ^[19]

The table below tracks that growth from primary and trade sources.

Milestone	Member companies	Notes
Launch (July 2023)	9 founding members	AMD, Arista, Broadcom, Cisco, Eviden, HPE, Intel, Meta, Microsoft ^[7]
March 2024	55	450% growth since opening to new members in Nov 2023; 715 individual participants in 8 working groups ^[17]
August 2024	97	40 new organizations added during 2024, including NVIDIA ^[18]
End of 2024	100+	Over 1,500 participants ^[20]
2025	+27 new members	Reported in the chair's year in review ^[19]

Notably, NVIDIA joined the consortium in 2024, after the founding group formed. ^[9]^[18] That is worth a moment, because NVIDIA owns InfiniBand through its Mellanox acquisition and also sells Spectrum X, its own Ethernet platform tuned for AI. Its participation signals that even the dominant supplier of AI fabrics expects a standard, multi vendor Ethernet transport to matter, while it continues to sell its own offerings alongside. ^[18]

What is in the 1.0 specification?

Rather than a single short document, the 1.0 release is a large stack of specifications, 562 pages in total, that together cover the network from the wire up to the software interface. For scale, that is roughly four times the length of the QUIC transport standard (RFC 9000, 151 pages), and the transport section alone runs to 329 pages. ^[16] The consortium also published a condensed paper written by the primary designers to make the full text easier to digest. The work is organized into layers, and the released material spans the physical layer, the link layer, the transport layer, and the software API, plus supporting areas such as storage, management, compliance, and debug. ^[1]^[2]^[10]

Hugh Holbrook, Chair of the UEC Technical Advisory Committee, summarized the scope at release: "The Ultra Ethernet 1.0 Specification is the result of an outstanding collaboration of AI, HPC, and networking experts, system and silicon vendors, and network operators. It incorporates a wealth of knowledge, experience, and ideas related to applications, transport protocols, congestion control, direct memory access, Ethernet link and PHY technologies, and network security." ^[1]

The table below sketches the layers and what each one contributes.

Layer	What it covers	Examples of what 1.0 adds
Physical	Signaling and the electrical or optical link	Profiles aligned with existing high speed Ethernet, with support for QSFP-DD and OSFP optics so current cabling carries over
Link	Behavior on a single hop	Link Level Retry (LLR) to recover lost frames locally, and credit based flow control for more precise back pressure than blunt pause
Transport	End to end delivery, the heart of the spec	Ultra Ethernet Transport with multipath packet spraying, out of order delivery, congestion control, and built in security
Software	The interface applications use	An open, largely connectionless API so RDMA style and collective workloads can adopt it without a rewrite

The link layer additions are quietly important. Link Level Retry lets two adjacent devices resend a dropped frame on the spot instead of waiting for the endpoints to notice and recover, which keeps a little packet loss from turning into a large latency spike. Credit based flow control gives finer grained back pressure than the all or nothing pause of older Ethernet, so the network can slow a sender gently rather than freezing a whole priority class. ^[2]^[10]
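The idea behind credit based flow control can be shown with a toy single hop model. This is a sketch under stated assumptions, not the mechanism as the specification defines it: the class and method names are hypothetical, one credit stands for one receive buffer slot, and credit return is modeled as instantaneous.

```python
from collections import deque
from typing import Optional


class CreditLink:
    """Toy single hop link with credit based flow control.

    The receiver grants one credit per free buffer slot. The sender may
    transmit only while it holds credits, so the receive buffer cannot
    overflow and the whole priority class never has to be paused.
    """

    def __init__(self, buffer_slots: int) -> None:
        self.credits = buffer_slots   # credits currently held by the sender
        self.rx_buffer: deque = deque()

    def try_send(self, frame: str) -> bool:
        """Send one frame if a credit is available; otherwise hold it back."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.rx_buffer.append(frame)
        return True

    def drain_one(self) -> Optional[str]:
        """Receiver consumes a frame and hands one credit back to the sender."""
        if not self.rx_buffer:
            return None
        frame = self.rx_buffer.popleft()
        self.credits += 1
        return frame
```

The sender slows down exactly as fast as the receiver's buffer fills, which is the "gentle" back pressure the paragraph above contrasts with an all or nothing pause.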

How does Ultra Ethernet Transport work?

UET is the reason the specification exists, and it is where the AI specific thinking lives. It runs on top of standard IP and UDP, and a few ideas do most of the work. ^[16]

The first is multipath packet spraying. Instead of pinning a flow to one path, UET sprays the packets of a single message across many paths through the fabric at once. A large all reduce that would have hammered one link now fans out across the whole Clos, which raises link utilization and smooths out the hotspots that wreck tail latency. Spraying only works if the receiver can cope with packets that arrive in a jumble, which leads to the second idea. ^[3]^[5]
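A small comparison makes the effect on link load concrete. The sketch below assumes a simple round robin spray over a fixed number of paths, which is a simplification chosen for illustration; how a real UET endpoint selects paths is not described in this article, and the function names are hypothetical.

```python
from collections import Counter


def per_flow_load(flow_sizes: dict, flow_to_path: dict) -> Counter:
    """Packets per path when every packet of a flow follows one path."""
    load: Counter = Counter()
    for flow, packets in flow_sizes.items():
        load[flow_to_path[flow]] += packets
    return load


def sprayed_load(flow_sizes: dict, num_paths: int) -> Counter:
    """Packets per path when each flow's packets rotate over all paths.

    Assumption: plain round robin by packet sequence number.
    """
    load: Counter = Counter()
    for packets in flow_sizes.values():
        for seq in range(packets):
            load[seq % num_paths] += 1
    return load


if __name__ == "__main__":
    flows = {"A": 1000, "B": 1000}
    # Both flows happen to be pinned to path 0 of 4.
    print("pinned :", dict(per_flow_load(flows, {"A": 0, "B": 0})))
    print("sprayed:", dict(sprayed_load(flows, num_paths=4)))
```

With two 1,000 packet flows pinned to the same path, one link carries all 2,000 packets while three sit idle. Sprayed over four paths, each link carries 500.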

UET supports out of order delivery to the upper layers. Traditional RDMA insists on strict ordering, so the moment you spray packets you break it. UET instead lets the transport place incoming data into the right memory location as it arrives, even if packet seven shows up before packet five, and it signals completion to the application using the operation semantics rather than raw byte order. For the bulk transfers that dominate AI traffic this removes the ordering tax and is what makes spraying practical. The transport offers several delivery modes, so workloads that still want ordered or idempotent semantics can ask for them. ^[2]^[3]
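Placing data by position rather than by arrival order can be sketched in a few lines. This is an illustration only, assuming each packet carries a sequence number that maps to a fixed offset in the destination buffer; the class name, the fixed payload size, and the completion rule are assumptions, not the packet format or semantics defined by the specification.

```python
class MessageBuffer:
    """Receive side of one message: place by offset, complete by count."""

    def __init__(self, total_len: int, mtu: int) -> None:
        self.data = bytearray(total_len)
        self.mtu = mtu
        self.expected = (total_len + mtu - 1) // mtu  # packets in the message
        self.received: set = set()

    def on_packet(self, seq: int, payload: bytes) -> bool:
        """Write payload at its final offset; return True once complete."""
        if seq not in self.received:  # a retransmitted duplicate is ignored
            offset = seq * self.mtu
            self.data[offset:offset + len(payload)] = payload
            self.received.add(seq)
        # Completion depends on how many packets arrived, not their order.
        return len(self.received) == self.expected


if __name__ == "__main__":
    buf = MessageBuffer(total_len=10, mtu=4)
    for seq, chunk in [(2, b"ij"), (0, b"abcd"), (1, b"efgh")]:
        done = buf.on_packet(seq, chunk)
    print(done, bytes(buf.data))
```

No reorder queue is needed: each packet goes straight to its final location, and the application is told only when the whole operation has landed.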

Third is congestion control built for this traffic. UET defines new sender driven and receiver driven control algorithms so the fabric can react fast to the synchronized bursts of incast, where many GPUs blast one destination at the same instant, while keeping queues shallow and throughput high. Because the design tolerates a little loss and recovers quickly, it does not lean on a perfectly lossless network the way classic RoCE does, which sidesteps the spreading congestion that a pause frame can cause. ^[2]^[5]
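The difference between receiver driven and sender driven control can be illustrated with two generic textbook mechanisms. Neither is the algorithm UET defines. The even split of receiver bandwidth, the use of an ECN style congestion mark, and the increase and decrease constants are all assumptions chosen to keep the sketch short.

```python
def receiver_grants(line_rate_gbps: float, senders: list) -> dict:
    """Receiver driven: split the receiver's link evenly among active senders.

    During incast the receiver knows how many peers are targeting it, so
    it can cap each one before queues build up in the fabric.
    """
    if not senders:
        return {}
    share = line_rate_gbps / len(senders)
    return {sender: share for sender in senders}


def sender_window(window: float, congestion_marked: bool,
                  increase: float = 1.0, decrease: float = 0.5,
                  floor: float = 1.0) -> float:
    """Sender driven: additive increase, multiplicative decrease on a mark.

    Assumption: the network signals congestion with a per packet mark
    (ECN style) that is echoed back to the sender.
    """
    if congestion_marked:
        return max(floor, window * decrease)
    return window + increase


if __name__ == "__main__":
    print(receiver_grants(400.0, ["gpu0", "gpu1", "gpu2", "gpu3"]))
    print(sender_window(16.0, congestion_marked=True))
```

The receiver side view handles the last hop, where incast bites, while the sender side view reacts to congestion anywhere along the path.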

Fourth is security as a first class feature rather than a bolt on. UET defines a Transport Security Sub-layer that borrows proven ideas from IPsec and from Google's open PSP protocol, using AES-GCM encryption, key derivation, and replay protection to authenticate and encrypt traffic at the transport level. It is built to keep working with packet spraying, and it uses group keying so the many endpoints in one security domain can talk securely without each pair holding heavy per connection state. That matters when many customers or teams share one cluster and need real isolation without losing performance. ^[11]^[12]
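Replay protection has to coexist with packets that legitimately arrive out of order. One common way to do that is a sliding window bitmap of the kind used in IPsec, sketched below. This is a generic illustration, not the Transport Security Sub-layer's actual procedure; the class name and the window size are assumptions, and a real implementation would run the check only after the packet's authentication tag has verified.

```python
class ReplayWindow:
    """Sliding window replay check that tolerates reordered packets."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = -1  # highest sequence number accepted so far
        self.bitmap = 0    # bit i set means (highest - i) was already seen

    def accept(self, seq: int) -> bool:
        """Return True if seq is fresh, False if replayed or too old."""
        if seq > self.highest:
            # Slide the window forward and mark the new highest as seen.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # older than the window can track
        if self.bitmap & (1 << offset):
            return False  # already seen: treat as a replay
        self.bitmap |= 1 << offset
        return True
```

A late packet inside the window is accepted once, a duplicate is rejected, and anything older than the window is dropped, so spraying and replay protection do not fight each other as long as the window is wider than the expected reordering.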

UET also leans on lightweight, mostly connectionless state instead of the heavy long lived connections of older transports. The API is connectionless, and a peer to peer reliability context can be set up without a handshake round trip, in some cases established by the first packet itself within nanoseconds. That choice is what lets a single endpoint talk to a vast number of peers, which is exactly the pattern in a many to many collective across a huge cluster. ^[2]^[10]
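The contrast with connection oriented transports can be sketched as a table of per peer state that is created on first use and freed when idle. Everything here is hypothetical: the class names, the fields, and the rule that a context is dropped once nothing is in flight are assumptions made for illustration, not details taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class PeerContext:
    """Minimal per peer reliability state (hypothetical fields)."""
    next_tx_seq: int = 0
    unacked: set = field(default_factory=set)


class Endpoint:
    """Toy endpoint that keeps state only for peers with traffic in flight."""

    def __init__(self) -> None:
        self.contexts: dict = {}

    def send(self, peer: str) -> int:
        """Send one packet; the context is created on first use, no handshake."""
        ctx = self.contexts.setdefault(peer, PeerContext())
        seq = ctx.next_tx_seq
        ctx.next_tx_seq += 1
        ctx.unacked.add(seq)
        return seq

    def on_ack(self, peer: str, seq: int) -> None:
        """Retire an acknowledged packet and free the context when idle."""
        ctx = self.contexts.get(peer)
        if ctx is None:
            return
        ctx.unacked.discard(seq)
        if not ctx.unacked:
            del self.contexts[peer]  # nothing in flight: release the state
```

Memory use in this model tracks the number of peers with traffic outstanding at a given moment rather than the total number of peers an endpoint might ever address, which is what matters in a many to many collective.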

It is worth being precise about where the intelligence sits. In this design the fabric of switches can stay relatively simple, spraying packets and signaling congestion, while the smart, stateful work of reassembly, retransmission, and completion lives at the endpoints in the network interface. That split matters for cost and scale, because plain Ethernet switch silicon is cheap and plentiful, and pushing the hard parts to the NIC means the fabric does not need expensive per flow state in every switch. It does raise the bar for the endpoint hardware, since the NIC or accelerator now has to track out of order arrivals and drive congestion control at line rate, which is one reason mature implementations depend on purpose built silicon. ^[5]^[6]

How does Ultra Ethernet differ from InfiniBand and UALink?

Ultra Ethernet is a scale out technology. Scale out means the back end network that ties together the many servers, racks, and pods of a cluster, the layer where a training job spanning thousands of accelerators does its cross node communication. This is the territory InfiniBand has owned in the largest AI and HPC systems, and matching InfiniBand class performance on open Ethernet at that scale is the explicit aim. The specification is designed to support a single fabric scaling toward a million endpoints, which is the order of magnitude that the biggest planned AI builds are reaching for. ^[3]^[4]^[16]

Scale out is a different problem from scale up, and the two are often confused. Scale up is the dense, very high bandwidth link inside a single server or rack that lets a handful of accelerators behave almost like one big accelerator with shared memory, the role NVLink plays for NVIDIA. The open standard aimed at that job is UALink, short for Ultra Accelerator Link, a memory semantic interconnect that ties together up to 1,024 accelerators over very short distances inside a pod. UALink and Ultra Ethernet are complementary rather than competing. A typical large AI infrastructure build would use a scale up fabric such as UALink inside each pod to bind accelerators tightly, then use Ultra Ethernet as the scale out fabric to connect those pods into a cluster of many thousands of accelerators. The comparison below lines up the three approaches at the back end layer. ^[3]^[13]

Attribute	Ultra Ethernet (UET)	InfiniBand	RoCE v2 (classic)
Type	Open, multi vendor standard	Largely single vendor (NVIDIA)	Open standard, but tuning often vendor specific
Role	Scale out back end fabric	Scale out back end fabric	Scale out back end fabric
Path use	Multipath packet spraying across many links	Adaptive routing	Typically one path per flow
Ordering	Out of order delivery to the application	In order within a flow	In order within a flow
Loss model	Tolerates some loss, fast recovery	Engineered lossless	Wants lossless via Priority Flow Control
Congestion control	New sender and receiver driven algorithms	Credit based, mature	DCQCN and similar, needs careful tuning
Security	Encryption and authentication in the transport	Add on	Add on

Adoption and significance

The importance of Ultra Ethernet is mostly economic and strategic. AI fabrics have become one of the larger line items in a data center build, and a single vendor fabric concentrates both cost and supply risk. An open standard backed by nearly every relevant chip, switch, and NIC maker, and by the hyperscalers writing the checks, gives buyers a second path with real competition behind it. ^[1]^[7]

Real silicon is the gating factor, and it is arriving. Broadcom began shipping its Tomahawk Ultra switch in July 2025, a chip aimed at low latency AI and HPC fabrics that the company rates at 250 nanoseconds of switch latency, and NIC and accelerator vendors including AMD, by way of its Pensando networking line, are building UET capable endpoints. ^[14]^[15] Because the physical and link choices stay close to mainstream Ethernet, switches and optics can be reused, which lowers the barrier to deployment compared with adopting an entirely separate fabric. The software story helps too. A connectionless API that fits existing models lets the collective communication libraries that AI frameworks already call sit on top of UET, so an operator can in principle swap the fabric underneath a training job rather than rebuilding the stack. The first wave of products that fully implement the 1.0 transport, expected through late 2025 and into 2026, is what turns the specification from paper into running clusters. ^[2]^[4]

Writing at the end of 2025, J Metz framed the spec as more than a feature list: "UEC is not just a collection of features. It is an end-to-end system, built for real workloads and real environments." ^[19]

What are the limitations of Ultra Ethernet?

A published specification is not the same as a deployed, debugged ecosystem. The hard part of any fabric is interoperability at scale, and the proof will come from multi vendor clusters with tens of thousands of endpoints running real training jobs, not from the document itself. The spec is also enormous, 562 pages, and a long, complex standard takes time to implement consistently across vendors. ^[10]^[16] InfiniBand has a long head start, a mature software stack, and years of field hardening, so it will not be displaced quickly even where Ultra Ethernet matches it on paper. NVIDIA also continues to push Spectrum X, its own AI tuned Ethernet, which means even within the Ethernet camp there will be more than one way to build a fabric. ^[4] And out of order delivery, while powerful, pushes more reassembly and state handling onto the endpoint, so the gains depend on NICs and accelerators that implement the transport efficiently in hardware. ^[6]^[10]

Even with those caveats, Ultra Ethernet is the clearest sign yet that the industry wants the back end of AI clusters to run on an open standard. Whether it reaches InfiniBand class performance in practice will be settled in production over the next few years, but the direction is set and the membership behind it is broad. ^[1]^[3]

References

1. Ultra Ethernet Consortium. "Ultra Ethernet Consortium (UEC) Launches Specification 1.0 Transforming Ethernet for AI and HPC at Scale." ultraethernet.org, June 11, 2025. https://ultraethernet.org/ultra-ethernet-consortium-uec-launches-specification-1-0-transforming-ethernet-for-ai-and-hpc-at-scale/
2. Ultra Ethernet Consortium. "Ultra Ethernet Specification v1.0." ultraethernet.org, June 11, 2025. https://ultraethernet.org/wp-content/uploads/sites/20/2025/06/UE-Specification-6.11.25.pdf
3. Broadcom. "The Age of Ultra Ethernet." Broadcom Blog, 2025. https://www.broadcom.com/blog/the-age-of-ultra-ethernet
4. "Ultra Ethernet Has Arrived: One Network to Rule Them All?" HPCwire, September 9, 2025. https://www.hpcwire.com/2025/09/09/ultra-ethernet-has-arrived-one-network-to-rule-them-all/
5. Arista Networks. "Demystifying Ultra Ethernet." Arista Blog, 2025. https://blogs.arista.com/blog/demystifying-ultra-ethernet
6. Tom Herbert. "A (mostly) Unbiased Review of the Ultra Ethernet Specification v1.0." Medium, 2025. https://medium.com/@tom_84912/a-mostly-unbiased-review-of-the-ultra-ethernet-specification-10d816227839
7. The Linux Foundation. "Leading Cloud Service, Semiconductor, and System Providers Unite to Form Ultra Ethernet Consortium." linuxfoundation.org, July 19, 2023. https://www.linuxfoundation.org/press/announcing-ultra-ethernet-consortium-uec
8. AMD. "AMD Founding Member of the Ultra Ethernet Consortium (UEC)." AMD Blogs, July 2023. https://www.amd.com/en/blogs/2023/amd-founding-member-of-the-ultra-ethernet-consorti.html
9. "NVIDIA Has Joined the Ultra Ethernet Consortium." Futuriom, September 2024. https://www.futuriom.com/articles/news/nvidia-has-joined-the-ultra-ethernet-consortium/2024/09
10. Sean Michael Kerner. "Ultra Ethernet Consortium publishes 1.0 specification, readies Ethernet for HPC, AI." Network World, June 2025. https://www.networkworld.com/article/4006285/ultra-ethernet-consortium-publishes-1-0-specification-readies-ethernet-for-hpc-ai.html
11. "Ultra Ethernet Security (UET-TSS) Tailored For AI And HPC." Semiconductor Engineering, 2025. https://semiengineering.com/ultra-ethernet-security-uet-tss-tailored-for-ai-and-hpc/
12. Rambus. "Ultra Ethernet Security: Protecting AI/HPC at Scale." Rambus Blog, 2025. https://www.rambus.com/blogs/ultra-ethernet-security-protecting-ai-hpc-at-scale/
13. "Ultra Ethernet and UALink: Scalable AI Networks." Synopsys, 2025. https://www.synopsys.com/articles/ultra-ethernet-ualink-ai-networks.html
14. "Broadcom Ships Tomahawk Ultra Ethernet Switch with 250ns Latency for AI and HPC." HPCwire, July 15, 2025. https://www.hpcwire.com/off-the-wire/broadcom-ships-tomahawk-ultra-ethernet-switch-with-250ns-latency-for-ai-and-hpc/
15. "Ultra Ethernet Consortium launches 1.0 specification." Data Center Dynamics, June 2025. https://www.datacenterdynamics.com/en/news/ultra-ethernet-consortium-launches-10-specification/
16. Tom Herbert. "A (mostly) Unbiased Review of the Ultra Ethernet Specification v1.0" (page count and transport section size). Medium, June 2025. https://medium.com/@tom_84912/a-mostly-unbiased-review-of-the-ultra-ethernet-specification-10d816227839
17. The Linux Foundation. "Ultra Ethernet Consortium Experiences Exponential Growth in Support of Ethernet for High-Performance AI and HPC Networking." linuxfoundation.org, March 19, 2024. https://www.linuxfoundation.org/press/ultra-ethernet-consortium-experiences-exponential-growth-with-support-for-high-performance-computing-and-ai
18. Ultra Ethernet Consortium. "Ultra Ethernet Consortium Welcomes 40 New Industry Leaders." ultraethernet.org, August 29, 2024. https://ultraethernet.org/ultra-ethernet-consortium-welcomes-40-new-industry-leaders/
19. Ultra Ethernet Consortium. "UEC 2025 in Review: Preparing for What Comes Next, A Letter from UEC's Chair." ultraethernet.org, December 10, 2025. https://ultraethernet.org/uec-2025-in-review-preparing-for-what-comes-next-a-letter-from-uecs-chair/
20. STORDIS. "Ultra Ethernet Consortium Explained: How UEC Is Redefining AI and HPC Networking." stordis.com, 2025. https://stordis.com/ultra-ethernet-consortium/
