NVIDIA Feynman
Last reviewed: Jun 3, 2026
Sources: 7 citations
Review status: Source-backed
Revision: v1 · 1,413 words
NVIDIA Feynman is the data center GPU architecture that NVIDIA has placed on its public roadmap as the successor to Rubin and Rubin Ultra, with a target of around 2028. It is named after the theoretical physicist Richard Feynman, continuing the company's practice of naming its accelerator architectures after scientists. Feynman was first revealed as a single line on a slide at GTC in March 2025, then fleshed out with more detail at GTC 2026 (held March 16 to 19, 2026 in San Jose), where Jensen Huang described it as the generation after Vera Rubin and introduced a new companion CPU named Rosa.[1][2][3]
Feynman is a roadmap entry, not a shipping product, and the public detail is thin. NVIDIA has confirmed a name, a rough year, a companion CPU, and a short list of technologies it expects to use. It has not published performance numbers, die counts, clock speeds, or final specifications, and much of the surrounding commentary in the trade press is informed speculation rather than disclosure. What follows separates the two as carefully as the available sources allow.
Feynman only makes sense in the context of the release rhythm NVIDIA adopted as AI demand exploded. For most of its history the company shipped a new data center GPU architecture every two years or so. After the launch of Hopper and the H100 in 2022, Huang committed publicly to a roughly annual cadence, putting one new architecture or major refresh on the calendar each year.[6]
The current sequence runs like this:
| Generation | Target year | Companion CPU | Notable on the roadmap |
|---|---|---|---|
| Hopper (H100) | 2022 | Grace | Defined modern transformer training and inference |
| Blackwell (B200/GB200) | 2024 | Grace | Dual-die GPU package, FP4 inference |
| Blackwell Ultra (GB300) | 2H 2025 | Grace | Mid-cycle refresh, 288 GB HBM3E |
| Vera Rubin (VR200) | 2026 | Vera | New Rubin GPU plus Vera CPU, HBM4, NVLink 6 |
| Rubin Ultra (VR300) | 2027 | Vera | NVL576 rack, NVLink 7, larger HBM stacks |
| Feynman | ~2028 | Rosa | 3D die stacking, custom HBM, NVLink 8 with co-packaged optics |
Since Blackwell, the pattern alternates between a brand-new architecture and a mid-generation "Ultra" refresh that reuses the same CPU and platform while pushing memory and power higher. Feynman represents the next clean-sheet GPU step after the Rubin family.[3][5]
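The shift in cadence can be checked with simple arithmetic on the table's target years. The sketch below is illustrative only: the names `ROADMAP`, `year_gaps`, and `cpu_changes` are hypothetical, the rows are copied from the table, and "2H 2025" and "~2028" are reduced to plain years, which is an assumption made for the calculation. These are roadmap targets, not confirmed ship dates.

```python
# Roadmap rows copied from the table: (generation, target year, companion CPU).
# Assumption: "2H 2025" and "~2028" are treated as the plain years 2025 and 2028.
ROADMAP = [
    ("Hopper", 2022, "Grace"),
    ("Blackwell", 2024, "Grace"),
    ("Blackwell Ultra", 2025, "Grace"),
    ("Vera Rubin", 2026, "Vera"),
    ("Rubin Ultra", 2027, "Vera"),
    ("Feynman", 2028, "Rosa"),
]


def year_gaps(roadmap):
    """Return (generation, years since the previous row) for every row after the first."""
    return [
        (curr[0], curr[1] - prev[1])
        for prev, curr in zip(roadmap, roadmap[1:])
    ]


def cpu_changes(roadmap):
    """Return the generations whose companion CPU differs from the previous row's."""
    return [
        curr[0]
        for prev, curr in zip(roadmap, roadmap[1:])
        if curr[2] != prev[2]
    ]


if __name__ == "__main__":
    for name, gap in year_gaps(ROADMAP):
        print(f"{name}: +{gap} year(s)")
    print("New companion CPU at:", ", ".join(cpu_changes(ROADMAP)))
```

On the table's figures, the only two-year gap is Hopper to Blackwell; every later step is one year, and the companion CPU changes only at Vera Rubin and Feynman, which matches the architecture-then-refresh pattern described above.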
The Feynman name first appeared at GTC 2025 during Huang's keynote on March 18, 2025. At that point the roadmap slide showed Blackwell Ultra in the second half of 2025, Vera Rubin in 2026, Rubin Ultra in 2027, and Feynman in 2028. NVIDIA disclosed essentially nothing about Feynman beyond the name and the year, and at that stage the trade press assumed it would pair with the same Vera CPU used across the Rubin generation.[1][5]
That changed at GTC 2026. In the updated data center roadmap, NVIDIA paired Feynman with a previously unannounced CPU called Rosa, and added a set of specific technologies to the Feynman column. Reporting from the event, along with NVIDIA's own GTC 2026 blog coverage, listed Feynman alongside the Rosa CPU, LP40 system memory, BlueField-5 data processing units, an eighth-generation NVLink (NVLink 8) switch fabric, Spectrum-class networking, and CX10 connectivity, with the NVLink switches incorporating co-packaged optics.[2][3][4]
The two architectural details NVIDIA emphasized for Feynman are 3D die stacking and custom high bandwidth memory.
3D die stacking means vertically layering GPU logic dies on top of one another rather than placing them side by side on an interposer, as Blackwell and Rubin do with their dual-die packages. If it ships as described, Feynman would be NVIDIA's first data center GPU built around stacked compute dies. The appeal is density and shorter interconnects between dies, which can raise bandwidth and cut latency inside the package. NVIDIA has not said how many dies it will stack or what process node it will use, and the engineering reality of stacking high-power logic (heat extraction in particular) is hard, so this is best read as a stated design direction rather than a finished implementation.[2][3]
"Custom HBM" is the second headline. NVIDIA's roadmap lists a bespoke memory rather than an off-the-shelf next-generation standard. Across the industry this is widely understood to mean a memory generation beyond HBM4 and HBM4e, tailored to NVIDIA's packaging and likely co-designed with memory suppliers. Some coverage refers to it loosely as "HBM Next." The expectation in the trade press is that custom HBM would push per-package capacity and bandwidth well past the Rubin generation, but NVIDIA has published no capacity or bandwidth figures, so any specific number circulating online is speculative.[2][3]
Rosa is the CPU NVIDIA introduced at GTC 2026 to accompany Feynman, slotting into the position previously expected to be held by a continuation of the Vera CPU. It follows the same pattern as the Grace and Vera CPUs before it: a custom Arm-based processor designed to sit next to the GPU in an integrated superchip, feeding it data with high-bandwidth coherent links and handling the parts of a workload that run better on a general-purpose core.[2][3]
Sources disagree on the origin of the name. NVIDIA's own GTC 2026 coverage and several outlets report that Rosa is named after Rosalind Franklin, the chemist whose X-ray crystallography work was central to determining the structure of DNA. Other reporting, including Wikipedia's entry on the architecture, instead attributes the name to Rosalyn Sussman Yalow, the medical physicist who shared the 1977 Nobel Prize in Physiology or Medicine for developing radioimmunoassay. NVIDIA's framing points to Franklin, and that is the more commonly cited attribution, but the discrepancy has not been fully resolved in public sources.[2][3][7] Either way, the choice keeps Rosa within the company's tradition of honoring scientists, the same tradition that produced Hopper (Grace Hopper), Lovelace (Ada Lovelace), Blackwell (David Blackwell), and Rubin (the astronomer Vera Rubin).
Beyond its role and naming, almost nothing about Rosa is public: no core counts, no memory configuration, no clock targets. It is a name and a slot on a roadmap.
The strategic logic behind Feynman is less about any single chip and more about NVIDIA's attempt to keep a predictable, full-stack cadence that customers can plan multi-year data center buildouts around. By naming architectures, CPUs, switches, and networking parts years in advance, NVIDIA gives hyperscalers and AI labs a roadmap to commit capital against, and it pressures competitors to match an annual pace in a market that historically ran on a two-year cycle.[6]
The specific technologies pinned to Feynman also signal where NVIDIA thinks the bottlenecks are heading. Stacked dies and custom memory address the limits of how much compute and bandwidth fit in a single package. Co-packaged optics on the NVLink switch fabric address the limits of moving data between thousands of GPUs in a rack-scale system, where copper interconnects run out of reach and electrical signaling burns too much power. Together they describe a 2028-era machine built less like a board full of chips and more like one large, vertically integrated system, which is the language Huang has used to describe the Rubin generation as well.[2][3][4]
For now, Feynman is a destination on a map. The pieces NVIDIA has committed to publicly (the Feynman GPU, the Rosa CPU, 3D stacking, custom HBM, NVLink 8 with optics) are real roadmap entries, repeated across NVIDIA's own materials and multiple independent outlets. The numbers that would let anyone judge how fast it actually is do not exist yet, and probably will not until much closer to launch.