Ada Lovelace (microarchitecture)
Last reviewed
Jun 3, 2026
Sources
9 citations
Review status
Source-backed
Revision
v1 · 1,977 words
Ada Lovelace is a graphics processing unit (GPU) microarchitecture developed by Nvidia and announced in September 2022. It succeeded the consumer side of the Ampere architecture and powers the GeForce RTX 40 series of consumer graphics cards, the RTX 6000 Ada Generation workstation card, and a family of data center accelerators that includes the L40, L40S, and L4. Fabricated on a custom TSMC "4N" process, Ada Lovelace introduced fourth-generation Tensor Cores with FP8 support, third-generation ray tracing (RT) cores, a greatly enlarged on-chip cache, and the DLSS 3 frame-generation technique. Within Nvidia's product lineup it is positioned for graphics and AI inference, complementing the Hopper data center architecture of the same era, which targets large-scale AI training.
The architecture is named after Ada Lovelace (1815 to 1852), the English mathematician and writer often credited as the first computer programmer for her notes on Charles Babbage's proposed Analytical Engine, which included an algorithm intended to be carried out by the machine. The choice continues Nvidia's practice of naming GPU architectures after scientists and mathematicians, following Turing, Ampere, and Hopper. Nvidia and the technical press most often refer to the architecture simply as "Ada Lovelace" or "Ada."[1]
Nvidia chief executive Jensen Huang unveiled the Ada Lovelace architecture during a GTC keynote on September 20, 2022, where he introduced the first GeForce RTX 40 series products.[2] The flagship GeForce RTX 4090 went on sale on October 12, 2022, at a launch price of 1,599 US dollars, followed by the GeForce RTX 4080 in November 2022.[3] Data center and professional parts were introduced over the following year: the L40 and the RTX 6000 Ada Generation were announced alongside the consumer launch in 2022, the L4 followed at GTC in March 2023, and the L40S was announced at SIGGRAPH in August 2023.[4][5]
Ada Lovelace is implemented across a family of dies designated AD102 through AD107, scaling from the flagship desktop part down to entry-level and mobile products. The largest, AD102, contains 76.3 billion transistors on a die of roughly 608 mm², more than two and a half times the transistor count of the equivalent Ampere die (GA102).[1][6] A fully enabled AD102 provides 18,432 CUDA cores organized into 144 streaming multiprocessors (SMs), with 144 third-generation RT cores and 576 fourth-generation Tensor Cores. Each Ada SM contains 128 CUDA cores, one RT core, four Tensor Cores, and 128 KB of combined shared memory and L1 cache.[1]
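The die-level totals follow directly from the per-SM figures. The short sketch below reproduces that arithmetic; the function name `die_totals` is illustrative and not part of any Nvidia tool.

```python
# Per-SM resources as stated in the text above.
CUDA_CORES_PER_SM = 128
RT_CORES_PER_SM = 1
TENSOR_CORES_PER_SM = 4
L1_SHARED_KB_PER_SM = 128


def die_totals(sm_count: int) -> dict:
    """Scale the per-SM resources up to a die with `sm_count` enabled SMs."""
    return {
        "cuda_cores": sm_count * CUDA_CORES_PER_SM,
        "rt_cores": sm_count * RT_CORES_PER_SM,
        "tensor_cores": sm_count * TENSOR_CORES_PER_SM,
        "l1_shared_kb": sm_count * L1_SHARED_KB_PER_SM,
    }


# A fully enabled AD102 has 144 SMs.
full_ad102 = die_totals(144)
print(full_ad102)
```

With 144 SMs the function returns the 18,432 CUDA cores, 144 RT cores, and 576 Tensor Cores quoted for the full AD102. The same relationship applies to cut-down parts, which simply enable fewer SMs.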
The table below summarizes the principal Ada Lovelace dies and the consumer products built on them.
| Die | Transistors | Die size | L2 cache | Example GeForce products |
|---|---|---|---|---|
| AD102 | 76.3 billion | ~608 mm² | 96 MB | RTX 4090 |
| AD103 | 45.9 billion | ~379 mm² | 64 MB | RTX 4080, RTX 4080 SUPER |
| AD104 | 35.8 billion | ~294 mm² | 48 MB | RTX 4070 Ti, RTX 4070 |
| AD106 | 22.9 billion | ~188 mm² | 32 MB | RTX 4060 Ti |
| AD107 | 18.9 billion | ~159 mm² | (smaller) | RTX 4060 |
The GeForce RTX 4090, the highest-end consumer card, uses a cut-down AD102 with 16,384 CUDA cores, 24 GB of GDDR6X memory on a 384-bit bus delivering about 1,008 GB/s of bandwidth, and a rated total board power of 450 W. Nvidia quoted roughly 82.6 teraFLOPS of FP32 shader throughput for the card.[3][6] The RTX 4080 is built on AD103, and the RTX 4070 family on AD104.[1]
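The quoted throughput and bandwidth figures can be approximated from first principles. The sketch below assumes a boost clock of 2.52 GHz and a GDDR6X data rate of 21 Gbps per pin; neither value is stated in this article, so treat both as assumptions. It also assumes one fused multiply-add (two floating-point operations) per CUDA core per clock.

```python
def fp32_tflops(cuda_cores: int, boost_clock_ghz: float) -> float:
    """Peak FP32 rate, assuming one FMA (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * boost_clock_ghz / 1000.0


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth: bus width in bytes times the per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# Assumed values, not given in the article: 2.52 GHz boost, 21 Gbps GDDR6X.
rtx4090_tflops = fp32_tflops(16_384, 2.52)
rtx4090_bw = memory_bandwidth_gbs(384, 21.0)
print(f"{rtx4090_tflops:.1f} TFLOPS, {rtx4090_bw:.0f} GB/s")
```

Under those assumptions the results match the roughly 82.6 teraFLOPS and 1,008 GB/s cited above. These are theoretical peaks; sustained figures depend on the workload and on clock behavior.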
One of the most consequential changes in Ada Lovelace is a dramatically larger second-level (L2) cache. The full AD102 carries 96 MB of L2 cache, compared with about 6 MB on the Ampere GA102, an increase of roughly 16 times.[6] The larger cache reduces traffic to external memory, which helps Ada parts deliver high effective bandwidth and high clock speeds despite continuing to use conventional GDDR memory rather than high-bandwidth memory (HBM). The approach is conceptually similar to the large on-die caches adopted elsewhere in the industry to mitigate the growing gap between compute throughput and memory bandwidth.
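A deliberately simplified model shows why a larger L2 cache eases pressure on external memory: every request served from L2 is one that never reaches DRAM. The hit rates and the requested bandwidth below are hypothetical values chosen only for illustration, not measurements of any Ada or Ampere part.

```python
def dram_traffic_gbs(requested_gbs: float, l2_hit_rate: float) -> float:
    """DRAM bandwidth needed when a fraction of requests hit in L2."""
    if not 0.0 <= l2_hit_rate <= 1.0:
        raise ValueError("l2_hit_rate must be between 0 and 1")
    return requested_gbs * (1.0 - l2_hit_rate)


# L2 capacity ratio from the text: 96 MB on AD102 versus about 6 MB on GA102.
capacity_ratio = 96 / 6
print(capacity_ratio)

# Hypothetical hit rates, chosen only to show the trend.
for hit_rate in (0.3, 0.6):
    print(hit_rate, dram_traffic_gbs(1500.0, hit_rate))
```

The model ignores write traffic, cache-line granularity, and the fact that hit rate depends heavily on the working set, so it indicates the direction of the effect rather than its size.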
Ada Lovelace introduced fourth-generation Tensor Cores, the matrix-math units that accelerate AI workloads. The key addition is support for the 8-bit floating-point (FP8) data format through a Transformer Engine, a feature first introduced in the Hopper H100 data center GPU.[7] FP8 roughly doubles arithmetic throughput relative to FP16 for compatible models while keeping accuracy acceptable for many inference tasks, and it can be combined with structural sparsity for further gains. Nvidia states that Ada's Tensor Cores increase throughput by up to five times relative to the previous generation, reaching up to 1.4 Tensor-petaFLOPS of FP8 performance on the highest-end parts.[7] This capability is central to the architecture's role in generative AI inference, where it is commonly paired with Nvidia's TensorRT inference software.
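The structural sparsity mentioned above is generally described as a 2:4 pattern, in which at most two of every four consecutive weights are non-zero. The following sketch illustrates magnitude-based pruning to that pattern. The function name `prune_2_of_4` is hypothetical, and production workflows would rely on Nvidia's own tooling rather than code like this.

```python
def prune_2_of_4(weights: list[float]) -> list[float]:
    """Zero the two smallest-magnitude values in each group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
print(prune_2_of_4(row))
```

Because half of the values in each group are known to be zero, the hardware can skip them, which is where the doubling of peak throughput under sparsity comes from. Whether a model tolerates this pruning without retraining is workload-dependent.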
The third-generation RT cores roughly double ray-triangle intersection throughput compared with Ampere, increasing ray tracing performance by more than two times.[7] Two new fixed-function units accompany them. The Opacity Micromap (OMM) Engine accelerates ray tracing of alpha-tested geometry such as foliage, fences, and particles, which previously forced expensive shader work. The Displaced Micro-Mesh (DMM) Engine accelerates construction of the bounding volume hierarchy (BVH) used to organize scene geometry, delivering up to ten times faster BVH build times while using up to twenty times less BVH storage for highly detailed meshes.[7]
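For readers unfamiliar with the operation that RT cores accelerate, the sketch below shows a software ray-triangle intersection test using the well-known Möller-Trumbore algorithm. It is a textbook illustration only; the article does not describe how the fixed-function hardware implements the test, and all function names here are illustrative.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_t(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the hit distance t along the ray, or None on a miss."""
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_t((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_t((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
print(hit, miss)
```

A ray tracer runs tests like this for every candidate triangle that BVH traversal does not rule out, which is why dedicated intersection hardware and a compact BVH both matter for performance.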
Ada introduced Shader Execution Reordering (SER), a scheduling technology that dynamically regroups divergent ray tracing work so that similar shading operations execute together. Nvidia reports that SER can improve shader performance for ray tracing by up to three times and raise in-game frame rates by up to 25 percent.[7]
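The benefit of regrouping can be illustrated with a toy model. The sketch below counts how many distinct hit shaders appear in each group of 32 threads (a warp) before and after sorting rays by shader. Sorting stands in for the reordering step here; it is not a description of how the SER hardware works, and the workload of 1,024 rays across 8 materials is hypothetical.

```python
import random

WARP_SIZE = 32  # threads that execute together on Nvidia GPUs


def avg_shaders_per_warp(shader_ids: list[int]) -> float:
    """Average count of distinct hit shaders across warp-sized groups."""
    warps = [shader_ids[i:i + WARP_SIZE]
             for i in range(0, len(shader_ids), WARP_SIZE)]
    return sum(len(set(w)) for w in warps) / len(warps)


rng = random.Random(0)
# Hypothetical workload: 1,024 rays, each hitting one of 8 materials.
hits = [rng.randrange(8) for _ in range(1024)]
reordered = sorted(hits)  # group rays that invoke the same shader

print(avg_shaders_per_warp(hits), avg_shaders_per_warp(reordered))
```

In the unsorted case most warps contain nearly every shader, so the warp must execute each shader's code path in turn. After regrouping, most warps run a single shader, which is the coherence that SER aims to restore.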
Ada Lovelace launched with DLSS 3, which extends Nvidia's Deep Learning Super Sampling upscaling with Frame Generation. Frame Generation uses a dedicated Optical Flow Accelerator together with the fourth-generation Tensor Cores to synthesize entirely new intermediate frames between rendered frames, increasing perceived frame rates. Because it depends on the Optical Flow Accelerator and the new Tensor Cores, full DLSS 3 Frame Generation is exclusive to GeForce RTX 40 series GPUs, while the upscaling and reconstruction components of DLSS remain available on earlier RTX generations.[7]
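The effect on frame cadence can be sketched with a simple timeline model, assuming one generated frame is placed midway between each pair of rendered frames. The model covers only the frame count and spacing; it does not attempt optical flow or image synthesis, and it ignores the latency of waiting for the next rendered frame. The 20 ms render interval is a hypothetical example.

```python
def displayed_timeline(rendered_ms: list[float]) -> list[tuple[float, str]]:
    """Insert one generated frame midway between each rendered pair."""
    timeline = []
    for current, nxt in zip(rendered_ms, rendered_ms[1:]):
        timeline.append((current, "rendered"))
        timeline.append(((current + nxt) / 2, "generated"))
    if rendered_ms:
        timeline.append((rendered_ms[-1], "rendered"))
    return timeline


# Hypothetical 50 fps render cadence: one rendered frame every 20 ms.
frames = displayed_timeline([0.0, 20.0, 40.0, 60.0])
print(len(frames), frames[:3])
```

Four rendered frames become seven displayed frames, so over a long sequence the displayed rate approaches twice the rendered rate under this assumption.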
Ada Lovelace incorporates an eighth-generation NVIDIA Encoder (NVENC) that adds hardware AV1 encoding, which Nvidia describes as roughly 40 percent more efficient than H.264. Higher-end Ada GPUs include dual NVENC encoders that can split a single encoding job to roughly halve export times or encode multiple streams in parallel, a capability that is also valuable for the video-processing workloads targeted by the data center parts.[7]
While the GeForce parts target gaming and content creation, Nvidia derived a separate line of professional and data center accelerators from the same architecture, all built on the AD102 or AD104 die and all using error-correcting (ECC) GDDR6 memory rather than the GDDR6X found on consumer flagships. These parts emphasize 24-hour duty cycles and AI inference and graphics throughput rather than the highest gaming clocks, and the server cards rely on passive cooling suited to server chassis airflow.
The L40S is the most prominent of these. Announced in 2023, it is positioned as a universal accelerator for both AI and graphics, combining strong FP8 inference throughput with the full Ada media and ray tracing feature set. The L40 emphasizes neural graphics, virtualization, and rendering, while the compact, low-power L4 targets high-density video and inference deployments. The RTX 6000 Ada Generation is the corresponding workstation card. Notably, none of the Ada data center parts support NVLink; they communicate over PCI Express Gen 4, which reinforces their positioning toward single-GPU inference and visualization rather than the tightly coupled multi-GPU training clusters served by Hopper.
| Product | Die | Memory | Memory bandwidth | FP32 (TFLOPS) | Power | Form factor / target |
|---|---|---|---|---|---|---|
| L40S | AD102 | 48 GB GDDR6 ECC | 864 GB/s | 91.6 | 350 W | Dual-slot; universal AI and graphics |
| L40 | AD102 | 48 GB GDDR6 ECC | 864 GB/s | ~90 | 300 W | Dual-slot; neural graphics, virtualization |
| L4 | AD104 | 24 GB GDDR6 | 300 GB/s | 30.3 | 72 W | Single-slot, low-profile; video and inference |
| RTX 6000 Ada | AD102 | 48 GB GDDR6 ECC | 960 GB/s | 91.1 | 300 W | Workstation graphics and AI |
The L40S provides 18,176 CUDA cores, 142 third-generation RT cores, and 568 fourth-generation Tensor Cores, with FP8 Tensor throughput of about 733 teraFLOPS (rising to roughly 1,466 teraFLOPS with sparsity).[4] Its NVENC and NVDEC media engines support AV1 encode and decode, suiting it to large-scale video pipelines as well as model inference. The L4, by contrast, draws only 72 W in a single-slot, low-profile, passively cooled card, making it well suited to dense server deployments for inference and video transcoding.[8] The RTX 6000 Ada Generation shares the L40-class configuration of 18,176 CUDA cores and 568 Tensor Cores with 48 GB of ECC GDDR6, but is packaged as an actively cooled workstation graphics card.[9]
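The table figures also allow a rough efficiency comparison. The sketch below divides peak FP32 throughput by rated board power for the parts with exact figures in the table (the L40 is omitted because its throughput is given only approximately), and it checks the sparsity doubling quoted for the L40S. Peak throughput per watt is a crude proxy and says little about sustained efficiency in a real workload.

```python
# (FP32 TFLOPS, board power in watts) copied from the table above.
ACCELERATORS = {
    "L40S": (91.6, 350),
    "L4": (30.3, 72),
    "RTX 6000 Ada": (91.1, 300),
}


def tflops_per_watt(name: str) -> float:
    """Peak FP32 TFLOPS divided by rated board power."""
    tflops, watts = ACCELERATORS[name]
    return tflops / watts


def with_sparsity(dense_tflops: float) -> float:
    """Structured sparsity doubles the quoted peak Tensor throughput."""
    return dense_tflops * 2


for name in ACCELERATORS:
    print(f"{name}: {tflops_per_watt(name):.2f} TFLOPS/W")

print(with_sparsity(733))  # L40S FP8 figure with sparsity
```

On this measure the 72 W L4 leads the group, which is consistent with its positioning for dense, power-constrained deployments.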
All Ada Lovelace products use GDDR6 or GDDR6X memory rather than HBM. Desktop GeForce flagships such as the RTX 4090 and RTX 4080 use the faster GDDR6X, while the data center and workstation parts use ECC-protected GDDR6.[1][4] The decision to forgo HBM, combined with the very large L2 cache, keeps board costs and power lower than HBM-based designs and aligns the architecture with graphics and inference rather than the memory-bandwidth-bound training workloads that Hopper addresses with HBM2e and HBM3.
The architecture is manufactured on TSMC's "4N" process, a 5 nm-class node customized for Nvidia. Despite the similar name, it is distinct from TSMC's standard "N4" process.[1][6] The combination of the advanced node and the architectural changes allowed Ada parts to reach substantially higher clock speeds than Ampere while increasing performance per watt.
Ada Lovelace occupies a specific niche in Nvidia's portfolio. On the consumer side it succeeded the GeForce RTX 30 series (Ampere) and the earlier RTX 20 series (Turing), advancing real-time ray tracing and AI-assisted rendering. On the professional and data center side it coexisted with Hopper, the H100-class architecture launched in the same period. The two architectures are complementary: Hopper, with its HBM memory, NVLink and NVSwitch interconnect, and emphasis on FP8 and FP16 throughput at cluster scale, is built for training and serving the largest AI models, whereas Ada Lovelace targets cost-effective and energy-efficient AI inference, neural graphics, virtual workstations, and video processing.
This division of labor made Ada-based accelerators, particularly the L40S and L4, popular choices for generative AI inference, recommendation systems, speech and conversational AI, and video analytics during 2023 and 2024, when demand for inference capacity grew sharply alongside the deployment of large language models. The architecture's successor for both consumer graphics and the inference-oriented professional segment is the Blackwell generation, which Nvidia first announced in 2024.[1]