NVIDIA Spectrum-X is an Ethernet-based networking platform developed by NVIDIA specifically for large-scale AI workloads. It pairs the Spectrum-4 Ethernet switch with the BlueField-3 SuperNIC to deliver congestion control, adaptive routing, and performance isolation that standard Ethernet cannot provide at AI cluster scale. Announced at Computex in May 2023 and first deployed in the Israel-1 supercomputer later that year, Spectrum-X has since become the networking fabric for some of the largest AI clusters in the world, including xAI's Colossus supercomputer in Memphis, Tennessee, and systems at Meta, Oracle, and CoreWeave.
By 2025, NVIDIA had surpassed both Arista and Cisco in datacenter Ethernet market share, with Spectrum-X accounting for a significant portion of that growth. The platform addresses a practical problem: conventional Ethernet, designed for general-purpose traffic, performs poorly when thousands of GPUs simultaneously exchange gradients during distributed AI training. Spectrum-X brings a set of InfiniBand-derived techniques to Ethernet, allowing operators to stay within the open Ethernet ecosystem while closing much of the performance gap with InfiniBand fabrics.
Distributed AI training relies on collective communication operations, particularly all-reduce, which synchronizes gradient updates across every GPU in a cluster after each training step. In a 100,000-GPU cluster, this means tens of thousands of simultaneous flows converge on the same switch ports in a pattern called incast. Conventional Ethernet handles incast poorly: queues fill, packets are dropped, and TCP or RoCE senders back off. The result is that switches run at 50 to 60 percent of their theoretical bandwidth, and tail latency spikes unpredictably.
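A back-of-envelope sketch makes the scale concrete; the 70B-parameter model size, fp16 gradients, and flow counts below are illustrative assumptions, not measurements:

```python
# Back-of-envelope model of the traffic pattern described above.
# All numbers are illustrative assumptions, not measured values.

def ring_allreduce_bytes_per_gpu(n_params: float, bytes_per_grad: int = 2) -> float:
    """A ring all-reduce moves roughly 2*(N-1)/N ~= 2x the gradient
    payload per GPU per step, nearly independent of cluster size N."""
    return 2 * n_params * bytes_per_grad

per_gpu = ring_allreduce_bytes_per_gpu(70e9)   # 70B params, fp16 gradients
print(f"per-GPU all-reduce traffic: ~{per_gpu / 1e9:.0f} GB per step")

# Incast in one line: K synchronized flows converging on one 800 GbE
# egress port leave each flow a fair share of line_rate / K.
for k in (8, 64, 512):
    print(f"{k:>3} flows into one port -> {800 / k:6.1f} Gb/s per flow")
```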
InfiniBand avoided this problem through lossless transport, fine-grained congestion signals, and in-network computing via SHARP (Scalable Hierarchical Aggregation and Reduction Protocol). But InfiniBand is a proprietary fabric: cabling, optics, switches, and network interface cards must all come from NVIDIA/Mellanox, and the ecosystem is less commoditized than Ethernet.
Hyperscale operators, particularly those building multi-tenant AI clouds, wanted the interoperability and vendor flexibility of Ethernet with performance closer to InfiniBand. Spectrum-X was NVIDIA's answer to that requirement.
The primary switch in the Spectrum-X platform is the SN5600, based on the Spectrum-4 ASIC. The SN5600 is a 2U rack-mounted switch with 64 OSFP ports running at 800 GbE each, for a total switching capacity of 51.2 Tbps. Packet forwarding reaches 33.3 billion packets per second. The switch includes a 160 MB shared packet buffer and a six-core Intel Xeon processor for the control plane, and ships with two hot-swap power supplies, drawing approximately 940 watts under load.
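The headline figures are mutually consistent, as a quick arithmetic check shows (the ~192-byte crossover packet size is derived here for illustration, not a published spec):

```python
# Quick arithmetic check on the SN5600 headline numbers quoted above.

ports, port_gbps = 64, 800
capacity_tbps = ports * port_gbps / 1e3
print(f"aggregate capacity: {capacity_tbps:.1f} Tbps")        # 51.2 Tbps

# At 33.3 billion packets per second, the packet size that saturates
# bandwidth and packet rate simultaneously is roughly:
avg_packet_bytes = capacity_tbps * 1e12 / 8 / 33.3e9
print(f"~{avg_packet_bytes:.0f}-byte packets hit both limits at once")  # ~192 B
```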
The Spectrum-4 ASIC is responsible for several functions that commodity Ethernet switches do not implement: per-packet adaptive routing decisions, real-time egress queue telemetry, and congestion notification generation. These capabilities allow the switch to respond to congestion in microseconds rather than milliseconds.
In a standard two-tier leaf-spine topology, SN5600 switches can connect up to 16,000 ports at 200 GbE or 4,096 ports at 800 GbE. The Spectrum-X Multiplane configuration, introduced across 2024 and 2025, scales a flat two-tier fabric to 128,000 GPUs.
The endpoint device in Spectrum-X is the BlueField-3 SuperNIC, which connects each GPU server to the fabric at up to 400 Gb/s using RDMA over Converged Ethernet (RoCE). Unlike a conventional network interface card, the BlueField-3 includes an Arm-based multi-core processor complex that runs congestion control algorithms in the NIC itself, offloading this work from the host CPU.
The BlueField-3 can handle millions of congestion control events per second at microsecond reaction latency. It also performs packet reordering at the receive side, which is necessary because Spectrum-X's per-packet adaptive routing can deliver packets out of order. By reordering packets in the NIC before presenting them to the RDMA layer, the BlueField-3 makes adaptive routing transparent to applications.
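The logical structure of such a reorder buffer is simple, even though the SuperNIC implements it in hardware with mechanisms NVIDIA does not document publicly; the sketch below is a minimal software analogue:

```python
# Minimal sketch of receive-side packet reordering, the role the text
# ascribes to the SuperNIC. A hypothetical software analogue only; the
# real NIC does this in hardware.

class ReorderBuffer:
    def __init__(self):
        self.expected = 0    # next packet sequence number (PSN) to deliver
        self.pending = {}    # out-of-order packets held back, keyed by PSN

    def receive(self, psn: int, payload: bytes) -> list[bytes]:
        """Accept one packet; return the run of payloads now deliverable in order."""
        self.pending[psn] = payload
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered

rb = ReorderBuffer()
for psn in (0, 2, 3, 1):                    # packets arrive out of order
    out = rb.receive(psn, f"pkt{psn}".encode())
    print(psn, [p.decode() for p in out])
# PSN 1 releases packets 1, 2, 3 as one in-order burst to the RDMA layer.
```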
The BlueField-3 supports quality-of-service isolation, which allows different AI training jobs running on shared infrastructure to be separated at the network layer so that one noisy workload does not affect the latency of another.
In 2025, NVIDIA introduced the ConnectX-8 SuperNIC to support the Blackwell GPU generation, including the NVIDIA GB300 NVL72. The ConnectX-8 doubles the per-port bandwidth of the BlueField-3, supporting 800 Gb/s on a single OSFP port or 2x 400 GbE in dual-port configurations.
The ConnectX-8 uses PCIe Gen 6, with a 32-lane bus split into two 16-lane connections for dual-host configurations in dense GPU systems. It retains the in-NIC congestion control architecture from the BlueField-3 and adds support for SHARP v4, which allows in-network all-reduce computations to run on Spectrum-X Ethernet fabrics. This is a notable change: earlier Spectrum-X generations left SHARP primarily as an InfiniBand feature, while the ConnectX-8 brings in-network computing to the Ethernet side of NVIDIA's portfolio.
The ConnectX-8 also integrates with the Spectrum-X telemetry infrastructure, providing end-to-end visibility from NIC to switch to allow fine-grained monitoring of flow behavior across a large fabric.
Conventional Ethernet load-balancing uses Equal-Cost Multi-Path (ECMP) routing, which assigns each flow to a path at connection setup time and keeps it there. Under incast conditions, ECMP produces severe imbalance: some paths fill while others remain idle, because flow assignments are made without knowledge of instantaneous queue depth.
Spectrum-X uses RoCE Adaptive Routing, which makes routing decisions on a per-packet basis. The Spectrum-4 switch evaluates the congestion state of each egress port in real time and selects the least-loaded port for each packet. Switches also exchange congestion status with neighboring switches so that routing decisions account for downstream congestion, not just local queue depth.
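The difference in kind between the two schemes can be shown in a few lines; the hash function and path count below are illustrative, and Spectrum-4's actual selection logic is proprietary:

```python
# Contrast sketch: static ECMP flow hashing vs. per-packet selection of
# the least-loaded egress port. Illustrative only.

import zlib

N_PATHS = 4

def ecmp_path(five_tuple: tuple) -> int:
    """Classic ECMP: hash the flow 5-tuple once; every packet of that
    flow is pinned to the same path regardless of queue depth."""
    return zlib.crc32(repr(five_tuple).encode()) % N_PATHS

def adaptive_path(queue_depths: list[int]) -> int:
    """Per-packet adaptive routing: pick the egress port with the
    shallowest queue at the instant the packet is forwarded."""
    return min(range(N_PATHS), key=lambda i: queue_depths[i])

# 8 synchronized flows over 4 paths: ECMP must collide at least twice.
flows = [("10.0.0.%d" % i, "10.0.1.%d" % i, 4791) for i in range(8)]
print("ECMP path per flow:", [ecmp_path(f) for f in flows])

queues = [0] * N_PATHS
for _ in range(1000):
    queues[adaptive_path(queues)] += 1      # adaptive spreads load evenly
print("adaptive queue depths:", queues)     # [250, 250, 250, 250]
```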
Because per-packet routing can deliver packets out of order, the BlueField-3 and ConnectX-8 SuperNICs perform receive-side reordering before handing data to the RDMA layer. Applications see in-order delivery and are unaware that individual packets took different paths through the fabric.
The effect on bandwidth utilization is substantial. Standard ECMP typically achieves 50 to 60 percent effective bandwidth utilization in AI training workloads, where flow patterns are highly synchronized and collisions are common. NVIDIA reports that Spectrum-X adaptive routing raises effective utilization to over 97 percent, a figure broadly consistent with production experience at xAI's Colossus cluster, which maintained 95 percent data throughput across a 100,000-GPU system with zero packet loss due to flow collisions.
RDMA over Converged Ethernet requires a lossless network. Unlike TCP, RoCE has no selective retransmission: its transport recovers from loss with go-back-N, so a single dropped packet forces the sender to retransmit every packet in flight behind it, stalling the collective operation it belongs to. At scale, even low packet loss rates translate into significant throughput degradation.
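A first-order model illustrates how quickly go-back-N erodes goodput; the window size and waste-per-loss approximation are assumptions chosen for round numbers:

```python
# First-order model of why tiny loss rates hurt RoCE: with go-back-N
# recovery, one lost packet wastes, on average, about half the packets
# already in flight behind it. Simplified model with assumed numbers.

def goodput_fraction(loss_rate: float, window_pkts: int) -> float:
    """Expected useful fraction of sent packets under go-back-N."""
    wasted_per_loss = window_pkts / 2
    return 1 / (1 + loss_rate * wasted_per_loss)

for p in (1e-5, 1e-4, 1e-3):
    # e.g. ~1000 packets in flight on a 400 Gb/s link with a few-us RTT
    print(f"loss {p:.0e}: goodput ~{goodput_fraction(p, 1000):.1%}")
# loss 1e-05: ~99.5%   loss 1e-04: ~95.2%   loss 1e-03: ~66.7%
```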
Standard lossless Ethernet uses Priority Flow Control (PFC), which allows a congested receiver to pause a sender. PFC has a well-known problem: a pause frame sent to one sender can cascade backward through the network, creating head-of-line blocking for unrelated flows on the same priority class. In a large cluster, PFC cascades can stall large portions of the fabric.
Spectrum-X achieves losslessness through a different mechanism. The Spectrum-4 switch generates in-band congestion notification (CN) signals that are delivered directly to the BlueField-3 or ConnectX-8 SuperNIC handling the sending flow. The NIC reduces its sending rate before queues fill to the point where PFC would trigger. By keeping queues below the PFC threshold, Spectrum-X avoids PFC cascades entirely in normal operation, using PFC only as a last-resort backstop.
This combination, proactive ECN-based rate reduction in the NIC plus per-packet adaptive routing in the switch, allows Spectrum-X to maintain lossless behavior at scale without the instability that PFC cascades introduce in naive lossless Ethernet deployments.
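NVIDIA does not publish its production algorithm (see the limitations discussion below), but the published DCQCN family that NIC-side RoCE congestion control descends from gives the flavor; the constants in this sketch are illustrative:

```python
# Sketch of a DCQCN-style reaction loop: multiplicative rate cuts on
# congestion notifications, gradual recovery in quiet intervals.
# NVIDIA's production algorithm is proprietary; constants are illustrative.

class RateController:
    def __init__(self, line_rate_gbps: float):
        self.rate = line_rate_gbps      # current sending rate
        self.target = line_rate_gbps    # rate before the last cut
        self.alpha = 1.0                # running estimate of congestion severity

    def on_congestion_notification(self):
        """Notification arrives: cut multiplicatively, remember the old rate."""
        self.target = self.rate
        self.rate *= 1 - self.alpha / 2
        self.alpha = 0.5 * self.alpha + 0.5   # congestion seen: raise alpha

    def on_quiet_interval(self):
        """No notifications for an interval: decay alpha and recover
        halfway back toward the pre-cut rate each interval."""
        self.alpha *= 0.5
        self.rate = (self.rate + self.target) / 2

nic = RateController(400.0)
for _ in range(2):
    nic.on_congestion_notification()
print(f"after 2 notifications: {nic.rate:.0f} Gb/s")   # 100 Gb/s
for _ in range(4):
    nic.on_quiet_interval()
print(f"after recovery:        {nic.rate:.0f} Gb/s")   # ~194 Gb/s, approaching target
```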
At GTC in March 2025, NVIDIA announced Spectrum-X Photonics, which integrates silicon photonics directly onto the switch package. This co-packaged optics (CPO) approach eliminates the pluggable transceiver module and the short electrical link between the switch ASIC and the optics cage, reducing insertion loss and power consumption.
Spectrum-X Photonics switches are available in configurations of 128 ports at 800 Gb/s (100 Tb/s total) or 512 ports at 800 Gb/s (400 Tb/s total). NVIDIA cited a 5x improvement in power efficiency per bit, 10x better signal integrity, and 10x higher resiliency at scale compared to conventional pluggable transceiver networks. The Spectrum-X Photonics switches were scheduled to become available from infrastructure vendors in 2026.
Announced at Hot Chips in August 2025, Spectrum-XGS is an extension of the Spectrum-X platform designed for multi-data-center AI training. As individual data center sites reach power limits, operators are building AI clusters that span multiple buildings, campuses, or geographic locations. Standard Ethernet suffers from increased latency, jitter, and unpredictable performance over longer distances.
Spectrum-XGS adds distance-aware congestion control algorithms that adapt to the round-trip time between facilities. It includes precision latency management and end-to-end telemetry across the inter-facility links. NVIDIA reported that Spectrum-XGS nearly doubles NCCL throughput in cross-data-center environments compared to standard Ethernet, specifically citing a 1.9x improvement in NCCL collective performance.
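The arithmetic behind "distance-aware" is bandwidth-delay product growth, sketched below with illustrative distances and RTTs:

```python
# Why cross-datacenter links need different congestion constants: the
# data in flight on a link grows linearly with RTT. Distances and RTTs
# below are illustrative assumptions.

def in_flight_mb(rate_gbps: float, rtt_us: float) -> float:
    """Bandwidth-delay product: megabytes in flight at a given rate and RTT."""
    return rate_gbps * 1e9 / 8 * rtt_us * 1e-6 / 1e6

for label, rtt_us in [("intra-rack", 4), ("intra-DC", 20),
                      ("10 km campus", 100), ("100 km metro", 1000)]:
    print(f"{label:>12} (RTT {rtt_us:>4} us): "
          f"{in_flight_mb(800, rtt_us):7.1f} MB per 800G link")
```

At metro distance, a single 800 Gb/s link can hold on the order of 100 MB in flight, comparable to the SN5600's entire 160 MB shared buffer, which is why congestion parameters tuned for in-building RTTs do not transfer to inter-facility links.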
CoreWeave announced it would be among the first operators to deploy Spectrum-XGS to interconnect its distributed data centers.
NVIDIA's Quantum series is its InfiniBand platform, which predates Spectrum-X and remains in wide use for high-performance computing. The table below compares Spectrum-X (Spectrum-4 / ConnectX-8) against Quantum-2 (NDR InfiniBand, QM9700 switch / ConnectX-7 HCA).
| Feature | Spectrum-X (Ethernet) | Quantum-2 NDR (InfiniBand) |
|---|---|---|
| Max port speed | 800 GbE (ConnectX-8) | 400 Gb/s NDR |
| Switch fabric capacity | 51.2 Tbps (SN5600) | 57.6 Tbps (QM9700) |
| Port count per switch | 64x 800G | 64x 400G NDR |
| Latency (8B message) | ~1.7 us (RoCE) | ~0.9 us |
| Protocol | Ethernet (RoCE v2) | InfiniBand (native RDMA) |
| In-network computing | SHARP v4 (ConnectX-8) | SHARP v3 (full support) |
| Congestion control | NIC-driven, per-packet adaptive routing | Native IB CC, per-packet adaptive routing |
| Lossless mechanism | ECN + proactive NIC rate reduction | Native lossless transport |
| Multi-tenancy / isolation | Yes (QoS isolation) | Limited |
| Ecosystem | Open Ethernet (SONiC, Cumulus, OCP) | Proprietary IB ecosystem |
| Primary use case | AI clouds, multi-tenant factories | AI-dedicated HPC, supercomputers |
| Notable deployments | xAI Colossus, Meta, Oracle | NVIDIA DGX SuperPOD, many TOP500 systems |
The latency advantage of InfiniBand (approximately 0.9 microseconds vs. 1.7 microseconds for RoCE) remains relevant for very tightly coupled HPC workloads, but for large-scale LLM training the all-reduce collective operations that dominate training time are less sensitive to raw latency than to sustained bandwidth. Meta's networking engineers concluded after extensive tuning that RoCE and InfiniBand provided equivalent performance for their largest model training runs, which was a significant validation for Spectrum-X.
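A standard alpha-beta cost model (transfer time = latency + bytes/bandwidth) makes the point quantitatively; the 1 GB shard size is an assumed illustration:

```python
# Alpha-beta cost model T = alpha + bytes/bandwidth for one transfer,
# showing why GB-scale collective steps are bandwidth-bound. Shard size
# and link rate are illustrative assumptions.

def transfer_ms(size_bytes: float, alpha_us: float, bw_gbps: float) -> float:
    return alpha_us / 1e3 + size_bytes * 8 / (bw_gbps * 1e9) * 1e3

SIZE = 1e9   # 1 GB per GPU per step
for fabric, alpha_us in [("InfiniBand NDR", 0.9), ("Spectrum-X RoCE", 1.7)]:
    print(f"{fabric:>15}: {transfer_ms(SIZE, alpha_us, 400):7.3f} ms")
# Both land at ~20 ms; the 0.8 us latency gap contributes under 0.01%.
```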
The Quantum-X800 InfiniBand platform announced in 2024 further raises the bar: the Q3400 switch provides 144 ports at 800 Gb/s for 115.2 Tbps of total fabric capacity, compared to 51.2 Tbps for the SN5600. The Q3400 also delivers 5x more SHARP in-network computing capacity than the previous generation. For AI-dedicated infrastructure where performance per dollar justifies the proprietary ecosystem, Quantum-X800 retains a clear lead.
NVIDIA Spectrum-X competes with Ethernet solutions from Arista Networks and Cisco in the AI data center market.
| Feature | NVIDIA Spectrum-X | Arista 7060X series | Cisco N9100 |
|---|---|---|---|
| Switch ASIC | NVIDIA Spectrum-4 | Broadcom Tomahawk 5 | NVIDIA Spectrum-4 (licensed) |
| Max port speed | 800 GbE | 800 GbE | 800 GbE |
| Switching capacity (per switch) | 51.2 Tbps | 51.2 Tbps | 51.2 Tbps |
| AI-specific congestion control | Yes (NIC+switch integrated) | Partial (DSCP-based ECN) | Yes (via Spectrum-4 ASIC) |
| Adaptive routing | Per-packet (with BlueField/CX-8 NIC) | Flowlet ECMP (ETA) | Per-packet (with BlueField/CX-8) |
| Requires NVIDIA NIC | Yes | No | No (but recommended) |
| Network OS | SONiC, NVIDIA Cumulus | EOS | NX-OS, SONiC |
| OCP participation | Yes | Yes | Yes |
| AI performance vs. off-the-shelf Ethernet | 1.6x (NVIDIA claim) | Varies | Varies |
Arista's primary differentiation in AI networking is EOS, its network operating system, which has deep programmability and a large installed base in enterprise and cloud data centers. Arista's 7060X switches use Broadcom's Tomahawk 5 ASIC, which provides competitive port density but lacks the NIC-integrated congestion control that Spectrum-X provides through BlueField-3 or ConnectX-8.
Cisco's N9100 series, announced in early 2025, takes a different approach: it uses the NVIDIA Spectrum-4 switch ASIC under license, gaining Spectrum-X's hardware congestion control capabilities, while running Cisco's NX-OS or SONiC operating system. This makes the Cisco N9100 compatible with Spectrum-X-enabled NICs and reference architectures. The collaboration between Cisco and NVIDIA, announced in February 2025, includes joint reference architectures for AI cluster deployments.
By Q2 2025, NVIDIA had surpassed Arista to lead the datacenter Ethernet switch segment with 25.9 percent share, up from essentially zero in 2022; Arista held 18.9 percent of that segment, while Cisco's 27.3 percent share was of the broader total Ethernet switch market. This shift reflected AI clusters deploying Spectrum-X at scale.
Israel-1 was the first large-scale deployment of Spectrum-X. Built at a data center in Israel, the system comprised 128 NVIDIA HGX H100 servers (with eight H100 GPUs each), 1,280 BlueField-3 DPUs, and more than 40 Spectrum-4 switches. The first phase went online in November 2023, two months ahead of its planned schedule, having been assembled in approximately 20 weeks. The complete system was designed to deliver eight exaflops of peak AI performance and 130 petaflops for high-performance computing. Israel-1 served as both a production AI system and a testbed for Spectrum-X in a real data center environment.
xAI's Colossus cluster in Memphis, Tennessee, became one of the most closely watched Spectrum-X deployments. The cluster was assembled in 122 days and reached 100,000 NVIDIA Hopper (H100-class) GPUs in its initial configuration, with plans to expand to 200,000 GPUs. The RDMA fabric uses Spectrum-X SN5600 switches paired with BlueField-3 SuperNICs.
xAI reported that across all three tiers of the Colossus network fabric, the system experienced zero application latency degradation or packet loss due to flow collisions, and maintained 95 percent data throughput throughout production training runs on Grok models. By comparison, standard Ethernet at similar scale typically produces thousands of flow collisions and delivers around 60 percent data throughput. The Colossus deployment was described by NVIDIA as a direct validation of Spectrum-X performance at the scale of the world's largest AI supercomputer.
Meta's AI infrastructure uses a mix of Ethernet and InfiniBand fabrics. For its open networking platform, Meta integrates Spectrum-X Ethernet into its Minipack3N switch hardware running FBOSS, Meta's open-source network operating system. By combining Spectrum-X's adaptive routing and congestion control with Meta's in-house switching hardware and software stack, Meta can apply Spectrum-X capabilities while retaining control of the switch platform. Meta announced the Spectrum-X integration at OCP Summit in October 2025, describing it as enabling the efficiency and predictability required to train increasingly large models.
Meta is also part of the Ultra Ethernet Consortium (UEC), which is developing open standards for AI-grade Ethernet independent of any single vendor. Meta's participation in both UEC and the Spectrum-X ecosystem reflects its strategy of keeping multiple options open.
Oracle announced at OCP Summit in October 2025 that it is building giga-scale AI supercomputers using Spectrum-X Ethernet switches within Oracle Cloud Infrastructure (OCI). Oracle's planned systems, based on the NVIDIA Vera Rubin GPU architecture, will use Spectrum-X for the GPU-to-GPU communication fabric. Oracle cited Spectrum-X's congestion control and scale as key factors in its choice.
CoreWeave, a GPU cloud provider that has become one of the largest users of NVIDIA hardware, operates Spectrum-X fabrics across multiple data centers. In July 2025, CoreWeave became the first company to commercially deploy NVIDIA Blackwell Ultra GPUs (NVIDIA GB300 NVL72) with Spectrum-X networking. CoreWeave also announced plans to deploy Spectrum-XGS to interconnect its geographically distributed data centers into a unified AI training fabric.
The Stargate Initiative, the US government-backed AI infrastructure program announced in early 2025, lists NVIDIA as a core technology partner. Stargate data centers are expected to use NVIDIA networking including Spectrum-X for the AI training and inference fabric, representing a potential large-scale government deployment of the platform.
The choice between Spectrum-X Ethernet and Quantum InfiniBand involves several practical considerations beyond raw performance numbers.
Ecosystem openness is the primary advantage of Spectrum-X. Ethernet cabling, optics, and management tooling are available from many vendors. Standard network operating systems including SONiC and Cumulus Linux run on Spectrum-4 switches. This reduces vendor lock-in and allows operators to use existing Ethernet infrastructure investments. InfiniBand, by contrast, uses a proprietary physical layer and requires NVIDIA switches, cables, and host adapters throughout.
Multi-tenancy is another area where Spectrum-X is better suited than InfiniBand. The QoS isolation features in the BlueField-3 and ConnectX-8 SuperNICs allow multiple AI training jobs to share the same fabric without interfering with each other. InfiniBand networks typically require dedicated partitions for isolation, which reduces flexibility in cloud environments where job mixes change continuously.
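At the network layer, the generic building block for this kind of separation is per-class traffic marking that switch QoS queues key on. The sketch below shows only standard socket-level DSCP marking; the SuperNIC's hardware enforcement is NVIDIA-internal and not modeled here:

```python
# Generic illustration of network-layer tenant separation: mark each
# tenant's traffic with a distinct DSCP code point so switch QoS
# classes can queue and police it independently. DSCP values here are
# illustrative (AF31 and AF41).

import socket

TENANT_DSCP = {"tenant-a": 26, "tenant-b": 34}

def open_marked_socket(tenant: str) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tos = TENANT_DSCP[tenant] << 2    # DSCP occupies the top 6 bits of the ToS byte
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return s

sock = open_marked_socket("tenant-a")
print("ToS byte:", sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))
```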
Latency still favors InfiniBand. NDR InfiniBand achieves approximately 0.9 microsecond message latency at small message sizes, while Spectrum-X RoCE runs at approximately 1.7 microseconds. For distributed training of large language models, where all-reduce operations involve messages on the order of gigabytes, this latency difference is not the primary bottleneck. For traditional HPC workloads with fine-grained synchronization, InfiniBand's latency advantage remains meaningful.
In-network computing via SHARP has historically been a clear InfiniBand advantage. SHARP offloads all-reduce operations into the switch fabric itself, reducing the number of data copies and the amount of host-side compute needed for collective operations. With the ConnectX-8 SuperNIC, NVIDIA has brought SHARP v4 support to the Ethernet side, though InfiniBand's Quantum-X800 still offers more switch-side SHARP capacity.
Operational complexity is a real consideration with Spectrum-X. Achieving full performance requires correct configuration of PFC thresholds, ECN marking, QoS classes, and adaptive routing parameters. NVIDIA and its partners provide reference configurations, but deploying a properly tuned Spectrum-X fabric requires networking expertise that goes beyond standard Ethernet administration. Estimates suggest 2 to 3 weeks of expert tuning time for initial deployment.
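The kind of arithmetic involved is standard lossless-Ethernet buffer math; every constant in this sketch is illustrative rather than a vendor-recommended value:

```python
# Buffer arithmetic behind PFC/ECN tuning: the pause (xoff) threshold
# must leave headroom for data still in flight, and ECN marking must
# begin well below xoff so senders slow down before PFC ever fires.
# All constants are illustrative.

def pfc_headroom_bytes(rate_gbps: float, cable_m: float, mtu: int = 4096) -> float:
    """Worst-case bytes still arriving after a pause frame is sent:
    round-trip propagation on the cable plus one MTU each way."""
    round_trip_s = 2 * cable_m / 2e8           # ~2e8 m/s signal speed in fiber
    return rate_gbps * 1e9 / 8 * round_trip_s + 2 * mtu

per_port_buffer = 160e6 / 64                   # naive per-port share (assumption)
headroom = pfc_headroom_bytes(800, cable_m=50)
xoff = per_port_buffer - headroom              # pause while headroom remains
ecn_min, ecn_max = 0.1 * xoff, 0.5 * xoff      # mark long before pausing
print(f"headroom {headroom/1024:.0f} KiB, xoff {xoff/1e6:.2f} MB, "
      f"ECN marks from {ecn_min/1e6:.2f} MB")
```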
Scaling ceilings have narrowed. Spectrum-X Multiplane configurations can reach 128,000 GPUs in a flat two-tier topology. InfiniBand with fat-tree or dragonfly topologies can scale to similar counts. At 100,000-GPU scale, as demonstrated at xAI Colossus, Spectrum-X has proven adequate for production AI training.
Spectrum-X is most commonly deployed in three scenarios.
Large-scale LLM training is the primary use case. Pre-training runs for models like GPT-4-class systems and beyond involve all-reduce operations across tens of thousands of GPUs every few hundred milliseconds. The combination of lossless Ethernet and adaptive routing that Spectrum-X provides directly addresses the bottlenecks that degrade training throughput on standard Ethernet.
Multi-tenant AI clouds are a second use case where Spectrum-X has advantages. Cloud providers running many different AI training jobs simultaneously on shared GPU clusters benefit from the performance isolation features. A workload that causes incast on one tenant's traffic does not spill over into other tenants' flows, which is difficult to guarantee on standard Ethernet or even on InfiniBand without dedicated partitions.
AI inference at scale is a growing use case. Large-scale inference clusters serving inference requests for frontier models require efficient GPU-to-GPU communication for tensor parallelism and pipeline parallelism. The latency and throughput characteristics of Spectrum-X are suited to these workloads, and the ability to share infrastructure between training and inference on the same Ethernet fabric is operationally attractive.
Spectrum-X is not a drop-in replacement for standard Ethernet. It requires NVIDIA BlueField or ConnectX-8 SuperNICs at every endpoint. Servers connected with other vendors' NICs cannot participate in adaptive routing or NIC-driven congestion control, meaning they operate at standard Ethernet performance levels on the same fabric.
The platform's performance depends on the tight integration between the Spectrum-4 switch and the NVIDIA NIC. This integration is partly opaque: the specific algorithms for congestion notification, routing decisions, and rate adaptation are implemented in NVIDIA hardware and firmware, not in an open standard. The Ultra Ethernet Consortium is working on open specifications for similar capabilities, but those standards were still in development as of mid-2025.
Performance headroom compared to InfiniBand remains at the extreme end of the scale. For clusters of 32,000 GPUs or more running the most latency-sensitive collective operations, InfiniBand NDR still achieves higher throughput in controlled benchmarks. The gap narrows with careful tuning, and for LLM training workloads specifically the difference is small enough that Meta judged the two fabrics equivalent. For traditional scientific HPC, InfiniBand remains the default choice.
The Spectrum-X Photonics co-packaged optics switches planned for 2026 have not yet shipped at the time of writing (May 2026), and field experience with CPO in high-density AI clusters remains limited.
Finally, NVIDIA's Spectrum-X1600, the next-generation switch ASIC expected in the second half of 2026, will provide 102.4 Tbps of switching capacity, matching Broadcom's Tomahawk 6 that shipped in 2025. Until the Spectrum-X1600 ships, NVIDIA's per-switch capacity trails Broadcom's latest generation by approximately one year.