NVIDIA Feynman

Updated Jul 16, 2026

NVIDIA Feynman is the data center GPU architecture that NVIDIA has placed on its public roadmap as the successor to Rubin and Rubin Ultra, with a target of around 2028. It is named after the theoretical physicist Richard Feynman, continuing the company's practice of naming its accelerator architectures after scientists. Feynman was first revealed as a single line on a slide at GTC in March 2025, then fleshed out with more detail at GTC 2026 (held March 16 to 19, 2026 in San Jose), where Jensen Huang described it as the generation after Vera Rubin and introduced a new companion CPU named Rosa.^[1]^[2]^[3]

Feynman is a roadmap reveal, not a shipping product, and the public detail is thin. NVIDIA has confirmed a name, a rough year, a CPU partner, and a short list of technologies it expects to use. It has not published performance numbers, die counts, clock speeds, or final specifications, and much of the surrounding commentary in the trade press is informed speculation rather than disclosure. What follows separates the two as carefully as the available sources allow.

NVIDIA's annual data center cadence

Feynman only makes sense in the context of the release rhythm NVIDIA adopted as AI demand exploded. For most of its history the company shipped a new data center GPU architecture every two years or so. After the launch of Hopper and the H100 in 2022, Huang committed publicly to a roughly annual cadence, putting one new architecture or major refresh on the calendar each year.^[6]

The current sequence runs like this:

Generation	Target year	Companion CPU	Notable on the roadmap
Hopper (H100)	2022	Grace	Defined modern transformer training and inference
Blackwell (B200/GB200)	2024	Grace	Dual-die GPU package, FP4 inference
Blackwell Ultra (GB300)	2H 2025	Grace	Mid-cycle refresh, 288 GB HBM3E
Vera Rubin (VR200)	2026	Vera	New Rubin GPU plus Vera CPU, HBM4, NVLink 6
Rubin Ultra (VR300)	2027	Vera	NVL576 rack, NVLink 7, larger HBM stacks
Feynman	~2028	Rosa	3D die stacking, custom HBM, NVLink 8 with co-packaged optics

The pattern alternates between a brand new architecture and a mid-generation "Ultra" refresh that reuses the same CPU and platform while pushing memory and power higher. Feynman represents the next clean-sheet GPU step after the Rubin family.^[3]^[5]
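The cadence described above can be checked directly against the table. The following is a minimal Python sketch, not anything NVIDIA publishes: the `ROADMAP` list and the helper names `year_gaps` and `cpu_changes` are hypothetical, and the entries "2H 2025" and "~2028" are reduced to bare years as a simplifying assumption.

```python
# Roadmap rows transcribed from the table above: (generation, target year, companion CPU).
# Assumption: "2H 2025" and "~2028" are treated as plain years so they can be subtracted.
ROADMAP = [
    ("Hopper", 2022, "Grace"),
    ("Blackwell", 2024, "Grace"),
    ("Blackwell Ultra", 2025, "Grace"),
    ("Vera Rubin", 2026, "Vera"),
    ("Rubin Ultra", 2027, "Vera"),
    ("Feynman", 2028, "Rosa"),
]


def year_gaps(rows):
    """Return (previous generation, next generation, gap in years) for each consecutive pair."""
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(rows, rows[1:])]


def cpu_changes(rows):
    """Return the generations whose companion CPU differs from the previous row's."""
    return [b[0] for a, b in zip(rows, rows[1:]) if a[2] != b[2]]


if __name__ == "__main__":
    for prev, nxt, gap in year_gaps(ROADMAP):
        print(f"{prev} -> {nxt}: {gap} year(s)")
    print("New companion CPU at:", ", ".join(cpu_changes(ROADMAP)))
```

On the table's own figures, only the Hopper to Blackwell step spans two years; every later step is one year, and the companion CPU changes only at Vera Rubin and at Feynman, which matches the new-architecture and "Ultra" refresh alternation described above.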

What was revealed, and when

The Feynman name first appeared at GTC 2025 during Huang's keynote on March 18, 2025. At that point the roadmap slide showed Blackwell Ultra in the second half of 2025, Vera Rubin in 2026, Rubin Ultra in 2027, and Feynman in 2028. NVIDIA disclosed essentially nothing about Feynman beyond the name and the year, and at that stage the trade press assumed it would pair with the same Vera CPU used across the Rubin generation.^[1]^[5]

That changed at GTC 2026. In the updated data center roadmap, NVIDIA paired Feynman with a previously unannounced CPU called Rosa, and added a set of specific technologies to the Feynman column. Reporting from the event, along with NVIDIA's own GTC 2026 blog coverage, listed Feynman alongside the Rosa CPU, LP40 system memory, BlueField-5 data processing units, an eighth-generation NVLink (NVLink 8) switch fabric, Spectrum-class networking, and CX10 connectivity, with the NVLink switches incorporating co-packaged optics.^[2]^[3]^[4]

3D die stacking and custom HBM

The two architectural details NVIDIA emphasized for Feynman are 3D die stacking and custom high bandwidth memory.

3D die stacking means vertically layering GPU logic dies on top of one another rather than placing them side by side on an interposer, as Blackwell and Rubin do with their dual-die packages. If it ships as described, Feynman would be NVIDIA's first data center GPU built around stacked compute dies. The appeal is density and shorter interconnects between dies, which can raise bandwidth and cut latency inside the package. NVIDIA has not said how many dies it will stack or what process node it will use, and the engineering reality of stacking high-power logic (heat extraction in particular) is hard, so this is best read as a stated design direction rather than a finished implementation.^[2]^[3]

"Custom HBM" is the second headline. NVIDIA's roadmap lists a bespoke memory rather than an off-the-shelf next-generation standard. Across the industry this is widely understood to mean a memory generation beyond HBM4 and HBM4e, tailored to NVIDIA's packaging and likely co-designed with memory suppliers. Some coverage refers to it loosely as "HBM Next." The expectation in the trade press is that custom HBM would push per-package capacity and bandwidth well past the Rubin generation, but NVIDIA has published no capacity or bandwidth figures, so any specific number circulating online is speculative.^[2]^[3]

The Rosa CPU

Rosa is the CPU NVIDIA introduced at GTC 2026 to accompany Feynman, slotting into the position previously expected to be held by a continuation of the Vera CPU. It follows the same pattern as the Grace and Vera CPUs before it: a custom Arm-based processor designed to sit next to the GPU in an integrated superchip, feeding it data with high-bandwidth coherent links and handling the parts of a workload that run better on a general-purpose core.^[2]^[3]

The naming is where the sources disagree. NVIDIA's own GTC 2026 coverage and several outlets report that Rosa is named after Rosalind Franklin, the chemist whose X-ray crystallography work was central to determining the structure of DNA. Other reporting, including Wikipedia's entry on the architecture, instead attributes the name to Rosalyn Sussman Yalow, the medical physicist who won the 1977 Nobel Prize for developing radioimmunoassay. NVIDIA's framing points to Franklin, and that is the more commonly cited attribution, but public sources have not settled the discrepancy.^[2]^[3]^[7] Either way, the choice keeps Rosa within the company's tradition of honoring scientists, the same tradition that produced Hopper (Grace Hopper), Lovelace (Ada Lovelace), Blackwell (David Blackwell), and Rubin (the astronomer Vera Rubin).

Beyond its role and naming, almost nothing about Rosa is public: no core counts, no memory configuration, no clock targets. It is a name and a slot on a roadmap.

Why it matters

The strategic logic behind Feynman is less about any single chip and more about NVIDIA's attempt to keep a predictable, full-stack cadence that customers can plan multi-year data center buildouts around. By naming architectures, CPUs, switches, and networking parts years in advance, NVIDIA gives hyperscalers and AI labs a roadmap to commit capital against, and it pressures competitors to match an annual pace in a market that historically ran on a two-year cycle.^[6]

The specific technologies pinned to Feynman also signal where NVIDIA thinks the bottlenecks are heading. Stacked dies and custom memory address the limits of how much compute and bandwidth fit in a single package. Co-packaged optics on the NVLink switch fabric address the limits of moving data between thousands of GPUs in a rack-scale system, where copper interconnects run out of reach and electrical signaling burns too much power. Together they describe a 2028-era machine built less like a board full of chips and more like one large, vertically integrated system, which is the language Huang has used to describe the Rubin generation as well.^[2]^[3]^[4]

For now, Feynman is a destination on a map. The pieces NVIDIA has committed to publicly (the Feynman GPU, the Rosa CPU, 3D stacking, custom HBM, NVLink 8 with optics) are real roadmap entries, repeated across NVIDIA's own materials and multiple independent outlets. The numbers that would let anyone judge how fast it actually is do not exist yet, and probably will not until much closer to launch.

References

[1] Anton Shilov, "Nvidia announces Rubin GPUs in 2026, Rubin Ultra in 2027, Feynman after," Tom's Hardware, March 18, 2025. https://www.tomshardware.com/pc-components/gpus/nvidia-announces-rubin-gpus-in-2026-rubin-ultra-in-2027-feynam-after
[2] Anton Shilov, "Nvidia updates data center roadmap with Rosa CPU and stacked Feynman GPUs," Tom's Hardware, March 17, 2026. https://www.tomshardware.com/pc-components/gpus/nvidia-updates-data-center-roadmap-with-rosa-cpu-and-stacked-feynman-gpus-optical-nvlink-groq-lpus-with-nvfp4-and-nvlink-also-on-deck
[3] "NVIDIA updates roadmap, with new details on its next-gen GPU 'Feynman' coming in 2028," TweakTown, March 2026. https://www.tweaktown.com/news/110521/nvidia-updates-roadmap-with-new-details-on-its-next-gen-gpu-feynman-coming-in-2028/index.html
[4] "NVIDIA GTC 2026: Live Updates on What's Next in AI," NVIDIA Blog, March 2026. https://blogs.nvidia.com/blog/gtc-2026-news/
[5] "GTC 2025: NVIDIA Unveils More on Rubin GPUs, Announces Feynman for 2028 in Roadmap Update," TrendForce, March 19, 2025. https://www.trendforce.com/news/2025/03/19/news-gtc-2025-nvidia-unveils-more-on-rubin-gpus-announces-feynman-for-2028-in-roadmap-update/
[6] "Nvidia Reveals Next-Gen AI Chips, Roadmap Through 2028," Slashdot, March 18, 2025. https://tech.slashdot.org/story/25/03/18/201213/nvidia-reveals-next-gen-ai-chips-roadmap-through-2028
[7] "Feynman (microarchitecture)," Wikipedia. https://en.wikipedia.org/wiki/Feynman_(microarchitecture)
