Nvidia

AI Companies AI Hardware Artificial Intelligence

47 min read

Updated Jul 10, 2026

Suggest edit History Talk

RawGraph

Last edited

Jul 10, 2026

Fact-checked

Jul 10, 2026

Sources

63 citations

Revision

v11 · 9,315 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

NVIDIA
Type	Public company (Nasdaq: NVDA)
Industry	Semiconductors, AI computing
Founded	April 5, 1993
Founders	Jensen Huang, Chris Malachowsky, Curtis Priem
Headquarters	Santa Clara, California, US
Key people	Jensen Huang (CEO), Colette Kress (CFO)
Products	Data center GPUs (Hopper, Blackwell, Rubin), GeForce, DGX systems, CUDA, networking (Mellanox, Spectrum-X)
Revenue	$215.9 billion (FY2026, ended January 2026)
Net income	$120.1 billion (FY2026)
Market cap	Approx. $4.9 trillion (July 2026)
Employees	Approx. 42,000 (early 2026)
Website	nvidia.com

Nvidia Corporation is an American technology company that designs graphics processing units (GPUs) and is the world's dominant supplier of hardware for artificial intelligence, holding an estimated 80 to 90% of the AI accelerator market as of 2026. On October 29, 2025, Nvidia became the first company in history to reach a $5 trillion market capitalization, a milestone reached less than four months after it first crossed $4 trillion in July 2025.^[17]^[27] Founded on April 5, 1993, by Jensen Huang, Chris Malachowsky, and Curtis Priem, the company is headquartered in Santa Clara, California, and employed approximately 42,000 people as of early 2026.

Originally a graphics chip company serving the video game industry, Nvidia designs GPUs, systems on chips (SoCs), and related software for gaming, professional visualization, data centers, and automotive markets, and has become the dominant supplier of hardware for AI training and inference. Its rapid growth has been driven almost entirely by the explosive demand for AI computing infrastructure from hyperscale cloud providers, AI research labs, and enterprises adopting large language models and generative AI. For fiscal 2026 (ending January 2026), Nvidia reported revenue of $215.9 billion, up 65% year over year, of which a record $193.7 billion came from its Data Center segment.^[4] In May 2026 the stock became the first ever to reach a $5.5 trillion market value.^[28]^[58]

What does Nvidia make?

Nvidia's core product is the GPU, a processor containing thousands of small cores that perform the same operation on many data points at once. This massively parallel design makes GPUs far faster than CPUs for the matrix multiplications at the heart of deep learning. The company sells three broad categories of products: data center accelerators (the Hopper H100, Blackwell B200, and Rubin generations) used to train and run AI models; GeForce consumer GPUs for gaming; and full systems, software, and networking that turn individual chips into large-scale AI infrastructure. Around this hardware Nvidia has built the CUDA software platform, which has become the default programming environment for AI and a significant competitive moat.

Why is Nvidia important for AI?

Nvidia matters to AI because nearly every frontier model since 2012 has been trained on its GPUs, and because its CUDA software stack is the platform most AI researchers and engineers already know. A modern frontier model is typically trained on a cluster of thousands of Nvidia GPUs running for weeks; the resulting near-monopoly on AI training hardware (an estimated 80 to 90% share) is the reason Nvidia became the most valuable company in the world. Demand has so far outstripped supply that in the company's Q3 FY2026 earnings call (November 2025), CEO Jensen Huang stated that "Blackwell sales are off the charts, and cloud GPUs are sold out."^[29]

Company History

When was Nvidia founded?

Nvidia was founded on April 5, 1993, by Jensen Huang, Chris Malachowsky, and Curtis Priem. The three co-founders famously planned the company during a meeting at a Denny's restaurant on Berryessa Road in East San Jose, California, in late 1992. They began working out of Priem's townhouse in Fremont, California, with $40,000 in initial capital.

Jensen Huang, a Taiwanese-American electrical engineer, had previously worked as a microprocessor designer at AMD and as director of CoreWare at LSI Logic. He has served as president and CEO of Nvidia since the company's founding. Malachowsky came from Sun Microsystems, and Priem had worked at both Sun Microsystems and IBM.

The company's name is derived from the Latin word "invidia," meaning envy. In its first two years, Nvidia developed the NV1 multimedia accelerator, which was released in 1995. The chip was not commercially successful, but the lessons learned from its development shaped the company's future direction.

The GeForce Era and the Invention of the GPU

In 1999, Nvidia released the GeForce 256, which it marketed as "the world's first GPU" (graphics processing unit). While earlier graphics accelerators existed, the GeForce 256 was the first consumer chip to integrate transform and lighting calculations on the GPU itself, offloading these tasks from the CPU. This product established Nvidia as a leader in consumer graphics and defined the GPU as a product category.

Throughout the 2000s, Nvidia dominated the gaming GPU market alongside rival ATI (later acquired by AMD). The company expanded into professional visualization with its Quadro product line and into high-performance computing with the Tesla line of compute accelerators.

Going Public

Nvidia went public on January 22, 1999, listing on the Nasdaq stock exchange under the ticker symbol NVDA. The company's initial market capitalization was approximately $563 million.

GPU Computing Revolution

From Graphics to General-Purpose Computing

GPUs were originally designed for a single task: rendering pixels on a screen for video games and graphics applications. However, researchers in the early 2000s recognized that the massively parallel architecture of GPUs could be applied to other computationally intensive problems. A GPU contains thousands of small cores that can execute the same operation on many data points simultaneously, making it well suited for tasks like matrix multiplication, physics simulations, and scientific computing.

This approach, known as General-Purpose computing on Graphics Processing Units (GPGPU), initially required developers to "trick" the GPU by reformulating their computations as graphics rendering tasks. The process was cumbersome and error-prone, which limited adoption.

What is CUDA?

CUDA (Compute Unified Device Architecture) is the parallel computing platform and programming model Nvidia released in 2006 that lets developers write general-purpose programs for Nvidia GPUs using extensions to the C programming language. CUDA eliminated the need to express computations as graphics shaders and provided a straightforward way to harness the parallel processing power of GPUs. It is widely regarded as the foundation of Nvidia's competitive advantage in AI. Speaking at GTC 2025, Jensen Huang said, "Since 2006, six million developers in over 200 countries have used CUDA, and transformed computing."^[48]

CUDA's release was a turning point. For the first time, scientists, engineers, and researchers could write GPU-accelerated code without expertise in graphics programming. Nvidia invested heavily in developer tools, documentation, and university outreach programs to build the CUDA ecosystem.

The impact on deep learning became clear in 2012, when Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton trained AlexNet, a convolutional neural network that won the ImageNet competition by a large margin. AlexNet was trained on two Nvidia GeForce GTX 580 GPUs. This result demonstrated that GPUs were dramatically faster than CPUs for training neural networks, and it sparked the modern deep learning revolution.

As deep learning frameworks like TensorFlow (2015) and PyTorch (2016) emerged, Nvidia worked closely with framework developers to optimize performance on its hardware. The company built specialized libraries such as cuDNN (CUDA Deep Neural Network library) for accelerating neural network primitives and cuBLAS for linear algebra. These libraries became deeply integrated into every major AI framework, creating a powerful software moat that competitors have struggled to replicate.

By 2020, CUDA had become the default compute backend for virtually all serious AI research and production workloads. The combination of mature libraries, extensive documentation, a large developer community, and years of optimization meant that switching to an alternative platform involved significant friction, even when competitive hardware was available.

AI Hardware: GPU Architecture Generations

Nvidia has released a series of GPU architectures, each one bringing major improvements in AI training and inference performance. The company has maintained a roughly two-year cadence for new data center GPU architectures, but in 2024 it announced a shift to an annual cadence covering both GPUs and full platforms.

Tesla (2007)

The Tesla architecture, launched in 2007, was Nvidia's first GPU family designed specifically for general-purpose computing. The Tesla C870 and subsequent models were marketed toward high-performance computing (HPC) and scientific research rather than gaming. Tesla GPUs supported CUDA and offered double-precision floating point, making them suitable for computational physics and molecular dynamics.

Fermi (2010)

The Fermi architecture improved upon Tesla with better double-precision performance and support for error-correcting code (ECC) memory, which was important for scientific applications. Fermi also introduced a unified address space and support for C++ in CUDA programs.

Kepler (2012) and Maxwell (2014)

Kepler introduced dynamic parallelism and Hyper-Q technology, allowing the GPU to manage workloads more efficiently. Maxwell focused on energy efficiency and delivered a significant performance-per-watt improvement.

Pascal (2016)

The Pascal architecture, realized in the Tesla P100 accelerator, was Nvidia's first data center GPU built on the 16nm FinFET process. The P100 featured 3,584 CUDA cores, 16 GB of HBM2 memory, and up to 720 GB/s of memory bandwidth. Pascal also introduced NVLink, a high-speed interconnect for GPU-to-GPU communication that was faster than PCIe.

Volta (2017)

Volta was a landmark architecture for AI. The Tesla V100 introduced Tensor Cores, specialized hardware units designed to accelerate matrix multiply-and-accumulate operations that are central to deep learning. The V100 featured 5,120 CUDA cores, 640 Tensor Cores, 16 or 32 GB of HBM2 memory, 900 GB/s of memory bandwidth, and approximately 21.1 billion transistors fabricated on a 12nm process.

Tensor Cores enabled mixed-precision training, where computations are performed in FP16 (half precision) while maintaining FP32 (single precision) accuracy for accumulation. This approach roughly doubled training throughput compared to FP32-only execution, with minimal impact on model quality.

Turing (2018)

Although primarily a gaming architecture (GeForce RTX 20 series), Turing introduced second-generation Tensor Cores and RT cores for ray tracing. The data center variant, the T4, became widely used for inference workloads due to its low power consumption and INT8 support.

Ampere (2020)

The Ampere architecture, embodied in the A100 accelerator, brought third-generation Tensor Cores with support for additional data types including TF32 (TensorFloat-32), BF16 (bfloat16), and FP64 Tensor Core operations. The A100 was built on a 7nm process with 54 billion transistors, 6,912 CUDA cores, 432 Tensor Cores, and 40 or 80 GB of HBM2e memory providing up to 2 TB/s of bandwidth.

The A100 also introduced Multi-Instance GPU (MIG) technology, allowing a single GPU to be partitioned into up to seven independent instances for running multiple workloads concurrently. The A100's third-generation NVLink provided 600 GB/s of GPU-to-GPU bandwidth.

Hopper (2022)

The Hopper architecture, named after computer scientist Grace Hopper, produced the H100 accelerator. Built on a 4nm process with approximately 80 billion transistors, the H100 featured 16,896 CUDA cores, 528 Tensor Cores, and 80 GB of HBM3 memory delivering 3.35 TB/s of bandwidth.

The H100 introduced the Transformer Engine, a hardware feature that automatically manages mixed-precision computation between FP8 and FP16 formats on a layer-by-layer basis. This was specifically designed to accelerate transformer architectures, which underpin modern LLMs. The H100 SXM variant delivered approximately 67 TFLOPS of FP32 performance and up to 1,979 TFLOPS of FP16 Tensor performance.

The H100 became the most sought-after chip in the AI industry during 2023 and 2024. Wait times stretched to months, and the chip traded on secondary markets at significant premiums above list price.

Nvidia later released the H200, an updated version with 141 GB of HBM3e memory and 4.8 TB/s of bandwidth, offering substantially improved performance for large model inference due to the increased memory capacity and bandwidth.

Blackwell (2024-2025)

The Blackwell architecture, named after mathematician David Blackwell, represented another major leap. Blackwell GPUs use a novel dual-die design in which two GPU dies are connected by a high-bandwidth on-chip link and function as a single unified GPU.

The B200 accelerator features approximately 208 billion transistors (104 billion per die), 20,480 CUDA cores, 192 GB of HBM3e memory, and 8 TB/s of memory bandwidth. The B200 introduced fifth-generation Tensor Cores with native FP4 (4-bit floating point) support for inference and second-generation Transformer Engine support.

Nvidia also released the B300 (Blackwell Ultra) variant with 288 GB of HBM3e memory and enhanced compute capabilities, delivering up to 15,000 TFLOPS in FP4 Tensor operations.

The GB200 NVL72 is a rack-scale system that combines 72 Blackwell GPUs and 36 Grace CPUs connected via fifth-generation NVLink. This configuration delivers up to 1,440 petaflops of FP4 inference performance and is designed for training and running trillion-parameter models.

Blackwell GPUs began volume production in late 2024 and ramped aggressively through 2025, with every major cloud provider deploying them at scale. In Nvidia's Q3 FY2026 earnings call (November 2025), CEO Jensen Huang stated that "Blackwell sales are off the charts, and cloud GPUs are sold out."^[29] On the same call, CFO Colette Kress told analysts the company had "visibility to a half a trillion dollars in Blackwell and Rubin revenue from the start of this year through the end of calendar year 2026," underscoring the scale of the order book.^[49] At GTC 2026 in March 2026, Huang raised that visibility further, projecting roughly $1 trillion in cumulative Blackwell and Rubin purchase orders through 2027.^[63]

Rubin (2026)

At GTC 2026, held March 16 to 19 at the SAP Center in San Jose, Nvidia unveiled the Vera Rubin platform, the company's next-generation architecture. The platform pairs the new Rubin GPU with a custom Arm-based CPU called Vera, named after astronomer Vera Rubin.^[30]^[31]

The Rubin GPU uses two reticle-sized dies and is targeted at delivering up to 50 PFLOPS of NVFP4 (FP4) compute, 288 GB of HBM4 memory, and 22 TB/s of memory bandwidth. The Vera CPU contains 88 custom Arm v9-class cores with 2-way simultaneous multithreading (176 threads) and 1.8 TB/s of NVLink-C2C bandwidth to its companion GPU, approximately double the bandwidth of the prior Grace CPU.^[31]

A full Vera Rubin NVL144 rack integrates 144 Rubin GPU dies (in 72 packages) with 36 Vera CPUs and delivers up to 3.6 NVFP4 exaflops of inference performance and approximately 1.2 FP8 exaflops of training performance, according to Nvidia's specifications.^[31] Volume production of Vera Rubin is targeted for the second half of 2026. Rubin Ultra is planned for 2027, and a successor architecture, codenamed Feynman, has been previewed for 2028.

In September 2025, Nvidia separately introduced the Rubin CPX, described as a new class of GPU purpose-built for "massive-context" inference workloads such as million-token coding and generative video. Rubin CPX combines a monolithic die with 128 GB of GDDR7 memory and is rated at 30 NVFP4 petaflops, with integrated video encode and decode acceleration; it is intended to be paired with standard Rubin GPUs in a disaggregated inference architecture and is expected to ship at the end of 2026.^[32]

Across the Rubin launch Nvidia reframed the data center itself as an "AI factory." In his GTC 2026 keynote, Huang declared that "tokens are the new commodity," describing modern AI data centers as industrial systems built to convert electricity into the tokens that AI models generate.^[50]

Data Center GPU Comparison

The following table summarizes key specifications of Nvidia's major data center GPU accelerators used for AI workloads.

GPU	Architecture	Year	Process	CUDA Cores	Memory	Memory Type	Bandwidth (TB/s)	FP32 (TFLOPS)	FP16 Tensor (TFLOPS)	TDP (W)	Transistors (B)
Tesla P100	Pascal	2016	16nm	3,584	16 GB	HBM2	0.72	10.6	N/A	300	15.3
Tesla V100	Volta	2017	12nm	5,120	32 GB	HBM2	0.90	15.7	125	300	21.1
A100 (SXM)	Ampere	2020	7nm	6,912	80 GB	HBM2e	2.0	19.5	312	400	54.2
H100 (SXM)	Hopper	2022	4nm	16,896	80 GB	HBM3	3.35	67	1,979	700	80
H200	Hopper	2024	4nm	16,896	141 GB	HBM3e	4.8	67	1,979	700	80
B200	Blackwell	2025	4nm	20,480	192 GB	HBM3e	8.0	N/A	2,250 (FP16)	1,000	208
B300	Blackwell Ultra	2025	4nm	20,480	288 GB	HBM3e	8.0	N/A	~2,500 (FP16)	1,400	208
Rubin CPX	Rubin	2026	3nm	TBD	128 GB	GDDR7	TBD	TBD	30 PFLOPS (NVFP4)	TBD	TBD
Rubin (VR200)	Rubin	2026	3nm	TBD	288 GB	HBM4	22.0	TBD	50 PFLOPS (NVFP4)	TBD	TBD

Data Center GPUs vs. Consumer GPUs for AI

Nvidia sells GPUs through two distinct product lines: data center accelerators (A100, H100, H200, B200) designed for professional AI workloads, and GeForce consumer GPUs (RTX 4090, RTX 5090) designed primarily for gaming.

Although consumer GPUs can be used for AI tasks, there are important differences.

Feature	Data Center GPUs (e.g., H100)	Consumer GPUs (e.g., RTX 5090)
Memory capacity	80 to 288 GB (HBM)	24 to 32 GB (GDDR)
Memory bandwidth	3.35 to 8+ TB/s	1.0 to 1.8 TB/s
Tensor Core support	Full precision range (FP64, FP32, FP16, FP8, FP4)	Limited precision support
Multi-GPU interconnect	NVLink (up to 1.8 TB/s)	PCIe only (~128 GB/s)
ECC memory	Yes	No
MIG support	Yes (A100, H100)	No
Price	$25,000 to $40,000+	$1,800 to $2,600
Typical use case	Large-scale training, enterprise inference	Fine-tuning, small model inference, research

The limited memory capacity of consumer GPUs is the primary bottleneck for AI workloads. A single RTX 5090 with 32 GB of GDDR7 cannot hold the parameters of models larger than about 15 billion parameters at full precision, whereas an H200 with 141 GB of HBM3e can handle much larger models. The lack of NVLink on consumer GPUs also makes multi-GPU training significantly less efficient, as GPUs must communicate over the much slower PCIe bus.

That said, consumer GPUs offer strong price-to-performance ratios for smaller workloads. The RTX 5090 delivers roughly 30 to 45% better deep learning performance than the RTX 4090, and research teams on tight budgets sometimes build multi-GPU workstations with consumer cards for experimentation and fine-tuning.

DGX Systems and AI Supercomputers

Nvidia's DGX product line provides turnkey AI computing systems that bundle multiple GPUs with optimized networking, storage, and software.

DGX-1 (2016)

The DGX-1, announced in April 2016, was marketed as "the world's first deep learning supercomputer." It contained eight Tesla P100 GPUs connected via NVLink and was designed to deliver the computational equivalent of approximately 250 conventional servers. Jensen Huang personally delivered the first DGX-1 unit to the OpenAI research lab.

DGX-2 (2018)

The DGX-2 doubled the GPU count to 16 Tesla V100 GPUs connected through NVSwitch, delivering 2 petaflops of deep learning performance in a single system.

DGX A100 (2020)

The DGX A100 featured eight A100 GPUs, 15 TB of NVMe storage, 1 TB of system RAM, and eight 200 Gb/s InfiniBand ConnectX-6 network interfaces, providing 5 petaflops of AI performance (FP16 Tensor Core operations with structured sparsity).

DGX H100 (2022)

The DGX H100 contained eight H100 GPUs delivering 32 petaflops of FP8 AI compute, 640 GB of total HBM3 memory, and fourth-generation NVLink for GPU-to-GPU communication at 900 GB/s per GPU.

DGX B200 and DGX B300 (2024-2025)

The DGX B200 features eight Blackwell B200 GPUs delivering 72 petaflops of FP8 training performance and 144 petaflops of FP4 inference performance.^[62] Nvidia claims the DGX B200 provides 3x faster training and 15x faster inference on large Mixture-of-Experts models compared to the DGX H100. The DGX B300, released in 2025, is based on the Blackwell Ultra B300 GPU and is the flagship of the current generation.

DGX SuperPOD

The DGX SuperPOD is a large-scale cluster configuration that combines multiple DGX systems with high-bandwidth networking and shared storage. SuperPODs scale from dozens to thousands of GPUs and are designed for training frontier AI models. Multiple organizations, including Meta, Microsoft, and various national laboratories, have deployed DGX SuperPOD configurations.

Project DIGITS / DGX Spark (2025)

Announced as "Project DIGITS" at CES 2025 in January and rebranded as DGX Spark at GTC in March 2025, the system is a desktop-sized AI computer built around the new GB10 Grace Blackwell Superchip. The first generation provided 128 GB of unified memory and approximately 1,000 sparse FP4 TOPS of compute. It is sold through system builders including ASUS, Dell, HP, and Lenovo and is targeted at researchers and developers who want to prototype and fine-tune models locally before deploying to cloud-based infrastructure.^[33]

A larger sibling, DGX Station, was unveiled alongside Spark at GTC 2025; it is built on the Blackwell Ultra platform and packaged in a workstation form factor for solo developers and small teams.

Software Ecosystem

Nvidia's competitive advantage extends well beyond hardware. The company has built a comprehensive software ecosystem that spans the entire AI development pipeline, from data preparation through model training to production inference.

CUDA

CUDA is the foundational layer of Nvidia's software stack. Released in 2006, it provides a parallel computing platform and programming model that allows developers to use Nvidia GPUs for general-purpose computation. CUDA includes a compiler (nvcc), runtime libraries, debugging tools, and profiling utilities. As of 2025, Nvidia reported that more than 6 million developers in over 200 countries had used CUDA, and the company maintains over 900 GPU-accelerated libraries, SDKs, and applications built on the platform.^[48]

cuDNN

cuDNN (CUDA Deep Neural Network library) provides highly optimized implementations of common neural network operations such as convolution, pooling, normalization, and activation functions. Every major deep learning framework, including PyTorch, TensorFlow, and JAX, relies on cuDNN for GPU-accelerated training and inference.

TensorRT

TensorRT is a high-performance inference optimization SDK. It takes trained neural network models and applies graph optimizations, layer fusion, kernel auto-tuning, precision calibration (FP16, INT8, FP8), and other techniques to maximize inference throughput and minimize latency. TensorRT can speed up inference by up to 6x compared to running the same model in a standard framework. TensorRT-LLM is a specialized version designed for optimizing and serving large language models.

Triton Inference Server

Triton Inference Server is an open-source inference serving platform that supports models from multiple frameworks (PyTorch, TensorFlow, ONNX, TensorRT) and can run on both GPUs and CPUs. Triton handles model versioning, dynamic batching, ensemble pipelines, and provides HTTP/gRPC endpoints for serving predictions at scale. It has become widely adopted for production AI deployments.

NeMo

Nvidia NeMo is an end-to-end framework for building, customizing, and deploying large language models and conversational AI systems. NeMo provides tools for data curation, supervised fine-tuning, reinforcement learning from human feedback (RLHF), and model alignment. It integrates with Nvidia's hardware optimizations and supports distributed training across large GPU clusters.

NVIDIA AI Enterprise and NIM Microservices

NVIDIA AI Enterprise is a supported software platform that bundles CUDA, cuDNN, TensorRT, Triton, NeMo, and a suite of pre-built tools with enterprise-grade support contracts. A major component is NIM (NVIDIA Inference Microservices), introduced at GTC 2024 and expanded through 2025-2026: each NIM packages a model (LLM, vision, speech, or domain-specific) together with an optimized inference engine and OpenAI-compatible API endpoints, deployable as a container on any CUDA-capable infrastructure. NIM is sold per-GPU-per-year as part of the AI Enterprise subscription.

RAPIDS

RAPIDS is a suite of open-source GPU-accelerated libraries for data science and analytics. It includes cuDF (a GPU DataFrame library similar to pandas), cuML (GPU-accelerated machine learning algorithms), and cuGraph (graph analytics). RAPIDS allows data scientists to accelerate their existing workflows by moving computation from CPUs to GPUs with minimal code changes.

Additional Software Components

Software	Purpose
NCCL	Multi-GPU and multi-node collective communication library
cuBLAS	GPU-accelerated basic linear algebra
Nsight Systems	System-wide performance profiling
DALI	GPU-accelerated data loading and preprocessing pipeline
Magnum IO	Optimized I/O for data center workloads
Run:ai	GPU orchestration and workload management (acquired 2024)

Nvidia in the AI Training Pipeline

Training a modern large language model requires three primary resources: data, algorithms, and compute. Nvidia GPUs are central to the compute component. The training process involves repeatedly performing forward passes (computing predictions) and backward passes (computing gradients and updating model weights) over massive datasets. These operations are dominated by matrix multiplications, which map efficiently onto the parallel architecture of GPUs.

A typical large-scale training run for a frontier model uses thousands of GPUs working in parallel. For example, training a model with hundreds of billions of parameters might use a cluster of 8,000 to 32,000 H100 GPUs running continuously for several weeks. The GPUs communicate gradient updates using high-speed NVLink and InfiniBand networking.

Nvidia's hardware and software together address several bottlenecks in this pipeline.

Compute throughput: Tensor Cores accelerate the matrix operations that consume most of the training time. Mixed-precision training (using FP16 or FP8 instead of FP32) further increases throughput.
Memory capacity: HBM (High Bandwidth Memory) allows large model layers and activation tensors to reside on-chip, reducing the need for slow data transfers between GPU memory and system memory.
Interconnect bandwidth: NVLink and NVSwitch provide high-bandwidth, low-latency communication between GPUs within a node, while InfiniBand handles inter-node communication.
Software optimization: Libraries like cuDNN, NCCL, and TensorRT are continuously tuned for each new hardware generation, ensuring that training frameworks extract maximum performance from the hardware.

Networking: Spectrum-X, Quantum-X, and BlueField

Following the 2020 acquisition of Mellanox Technologies, Nvidia has built one of the largest networking businesses in the data center industry; networking revenue reached a record $11.0 billion in Q4 FY2026 alone, up 263% year over year.^[4]

Spectrum-X is Nvidia's Ethernet platform tuned for AI workloads, combining Spectrum-4 (and successor Spectrum-6) ASICs with BlueField-3 SuperNICs. At GTC 2025, Nvidia announced Spectrum-X Photonics and Quantum-X Photonics silicon-photonics switches that integrate the optical engine directly into the switch package. Configurations include up to 512 ports of 800 Gb/s Ethernet (400 Tb/s total throughput) and 144 ports of 800 Gb/s InfiniBand. Nvidia states the photonics designs use roughly 4x fewer lasers and deliver about 3.5x better power efficiency than equivalent pluggable-optics deployments.^[34]

Spectrum-XGS Ethernet, introduced in 2025, is a "scale-across" technology designed to link geographically distributed data centers into a unified, giga-scale AI super-factory. Meta and Oracle were named as early Spectrum-X Ethernet customers.^[35]

BlueField-3 DPUs continue to ship in volume; BlueField-4 STX storage processors and Spectrum-6 SPX Ethernet switches were announced at GTC 2026 as components of the Vera Rubin rack-scale platform.^[30]

Market Position

Nvidia holds a near-monopoly on the AI training hardware market. Estimates from industry analysts in 2026 placed Nvidia's share of the AI accelerator market at approximately 80 to 90%, with the remaining share split among AMD, Google, Intel, Amazon, and various startups.

Several factors contribute to this dominance.

Software ecosystem lock-in: The CUDA ecosystem has been developing for nearly two decades. Most AI researchers, framework developers, and MLOps engineers have deep expertise in CUDA-based tools. Switching to a different hardware platform means rewriting or adapting code, revalidating model behavior, and retraining operations teams.

Performance leadership: Each generation of Nvidia GPUs has delivered substantial performance improvements over the previous generation and over competing offerings. While competitors have occasionally matched Nvidia on paper specifications, real-world training performance (which depends heavily on software optimization) has consistently favored Nvidia.

Supply chain and manufacturing: Nvidia has secured priority access to advanced manufacturing capacity at TSMC and to HBM supply from Samsung and SK Hynix, giving it the ability to ship large volumes of cutting-edge chips.

Full-stack integration: By providing hardware, interconnects (NVLink, NVSwitch), networking (acquired Mellanox in 2020 for $6.9 billion), and software in a single optimized stack, Nvidia reduces the integration burden for customers.

Financial Growth

Nvidia's revenue growth over the past several years illustrates the scale of the AI computing boom.

Fiscal Year (ending January)	Total Revenue	Data Center Revenue	YoY Revenue Growth
FY2022	$26.9B	$10.6B	61%
FY2023	$27.0B	$15.0B	0.2%
FY2024	$60.9B	$47.5B	126%
FY2025	$130.5B	~$115B	114%
FY2026	$215.9B	$193.7B	65%

The data center segment has become the overwhelming driver of Nvidia's business, growing from about 40% of total revenue in FY2022 to approximately 90% in FY2026. Quarterly revenue accelerated throughout FY2026 and into FY2027:

Quarter	Revenue	Data Center	YoY
Q1 FY2026 (Apr 2025)	$44.1B	$39.1B	69%
Q2 FY2026 (Jul 2025)	$46.7B	$41.1B	56%
Q3 FY2026 (Oct 2025)	$57.0B	$51.2B	62%
Q4 FY2026 (Jan 2026)	$68.1B	$62.3B	73%
Q1 FY2027 (Apr 2026)	$81.6B	$75.2B	85%

Within Q4 FY2026, networking hardware alone contributed $10.98 billion (up 263% year-over-year) while compute GPUs contributed $51.3 billion.^[4] On the Q3 FY2026 earnings call, CFO Colette Kress quantified the forward demand directly, telling analysts the company had "visibility to a half a trillion dollars in Blackwell and Rubin revenue from the start of this year through the end of calendar year 2026."^[49] At GTC 2026 in March, Huang doubled that order-book figure, saying he expected roughly $1 trillion in combined Blackwell and Rubin purchase orders through 2027.^[63]

For Q1 FY2027 (ended April 26, 2026), Nvidia had guided to revenue of $78.0 billion plus or minus 2%, explicitly noting that the outlook assumed no Data Center compute revenue from China.^[4] The actual results, reported on May 20, 2026, comfortably beat that guide: record revenue of $81.6 billion, up 20% sequentially and 85% year over year, with record Data Center revenue of $75.2 billion, up 92% year over year.^[36]^[51] Alongside the results, Nvidia raised its quarterly cash dividend from $0.01 to $0.25 per share and authorized an additional $80 billion in share repurchases.^[51]

Nvidia's gross margins have remained exceptionally high for a semiconductor company. FY2026 non-GAAP gross margin was 71.3%, down from the FY2025 peak of 75.5% as Blackwell ramped through a more complex bring-up cycle and the H20 charge weighed on the first quarter; non-GAAP gross margins were 73.6% in Q3 FY2026 alone, recovered to 75.2% in Q4 FY2026, and held at 75.0% in Q1 FY2027.^[4]^[29]^[5]^[51]

Stock Price and Market Capitalization

When did Nvidia reach $1 trillion, and when did it hit $5 trillion?

Nvidia's stock price has experienced extraordinary growth, driven by the AI boom. It first crossed a $1 trillion market capitalization in May 2023, $2 trillion in February 2024, $3 trillion in June 2024, and $4 trillion in July 2025, before becoming the first company in history to close above $5 trillion on October 29, 2025.^[17]^[27] On May 13, 2026, it became the first company to reach a $5.5 trillion market capitalization; the stock subsequently pulled back, and Nvidia's market value stood at roughly $4.9 trillion in early July 2026.^[58]^[16]

Date	Milestone
January 1999	IPO on Nasdaq; market cap ~$563 million
May 2023	Market cap crosses $1 trillion
February 2024	Market cap crosses $2 trillion
June 2024	10-for-1 stock split; market cap crosses $3 trillion
January 2025	Single-day loss of ~$600 billion following the DeepSeek R1 announcement
July 2025	Market cap briefly touches $4 trillion
October 29, 2025	First company in history to close above $5 trillion market cap
May 13, 2026	First company to reach a $5.5 trillion market cap^[28]^[58]

Since its IPO, Nvidia's market capitalization has increased by approximately 1,000,000% in nominal terms. The stock (NVDA) underwent a 10-for-1 stock split in June 2024 (its sixth split overall).

Competition

Who are Nvidia's main competitors?

Although Nvidia dominates the AI accelerator market, several competitors are working to challenge its position, chiefly AMD's Instinct GPUs, Google's TPUs, Amazon's Trainium chips, and a wave of custom silicon from cloud providers.

AMD

AMD is Nvidia's most direct competitor in the GPU market. AMD's Instinct MI300X, launched in late 2023, offered 192 GB of HBM3 memory and competitive inference performance. The MI325X and MI355X followed in 2024-2025. At CES 2026, AMD unveiled the MI400 series, with 432 GB of HBM4 memory targeting Nvidia's Rubin generation.

AMD has secured deployment commitments from Microsoft, Meta, and OpenAI. However, AMD's ROCm software ecosystem, while improving, is still considered less mature than Nvidia's CUDA stack, and many AI workloads do not yet run as efficiently on AMD hardware.

Google TPUs

Google has developed its own Tensor Processing Units (TPUs) since 2015. TPUs are custom ASICs designed specifically for tensor operations. Google uses TPUs extensively for internal AI workloads and offers them to external customers through Google Cloud. The seventh-generation TPU, Ironwood, released in late 2025, delivers 4,614 TFLOPS per chip, which analysts have described as being on par with Blackwell in some workloads.

TPUs are tightly integrated with Google's JAX and TensorFlow frameworks. However, they are only available through Google Cloud, limiting their reach compared to Nvidia GPUs, which can be purchased outright or rented from any cloud provider.

Amazon Trainium

Amazon Web Services (AWS) developed the Trainium series of custom AI training chips. Trainium2, launched in 2024, is used by Anthropic to train its Claude models, with deployments reportedly exceeding 500,000 chips. AWS launched Trainium3 in December 2025 with 2.52 petaflops of FP8 compute and 144 GB of HBM3e memory. Amazon's approach is to offer Trainium as a lower-cost alternative to Nvidia GPUs within its cloud ecosystem.

Intel Gaudi

Intel entered the AI accelerator market through its acquisition of Habana Labs in 2019. The Gaudi line of AI accelerators was positioned as a cost-effective alternative to Nvidia's offerings. However, Intel confirmed plans to discontinue the Gaudi line when its next-generation GPU architecture launches in 2026 or 2027, signaling a strategic pivot.

Competitive Summary

Competitor	Product Line	Key Advantage	Key Limitation
AMD	Instinct MI series	High memory capacity; competitive pricing	ROCm ecosystem less mature than CUDA
Google	TPU (Ironwood)	Tight JAX/TF integration; strong for training	Only available on Google Cloud
Amazon	Trainium	Lower cost on AWS; good for inference	Limited to AWS; less community support
Intel	Gaudi (discontinued)	Budget-friendly	Being phased out
Custom silicon (various)	Microsoft Maia, Meta MTIA	Optimized for specific workloads	Not available to general market

Despite growing competition, Nvidia's combination of hardware performance, software maturity, and ecosystem breadth has maintained its dominant position. Custom ASIC shipments from cloud providers are projected to grow 44.6% in 2026, compared to 16.1% growth for GPU shipments, indicating that the competitive dynamics are slowly shifting.

Cloud Provider and Customer Partnerships

Nvidia GPUs are available through all major cloud platforms, and the company has established deep partnerships with each of the leading providers.

Amazon Web Services

Nvidia and AWS have a partnership spanning over 15 years. AWS offers Nvidia GPU instances across multiple GPU generations, including P4d (A100), P5 (H100), and P6 (Blackwell) instances. In 2025, Nvidia launched DGX Cloud on AWS, a fully managed AI training platform. AWS has committed to deploying more than one million Nvidia GPUs, including Blackwell and Rubin architectures, across global cloud regions starting in 2026.

Microsoft Azure

Microsoft Azure offers extensive Nvidia GPU availability, including NC, ND, and NV series virtual machines. Azure has deployed large-scale clusters using Nvidia GB300 NVL72 systems for training frontier AI models, including those built by OpenAI. Microsoft has also used Nvidia RTX PRO 6000 Blackwell GPUs for Azure workloads and has announced plans for further expansion.

Google Cloud Platform

While Google also develops its own TPUs, Google Cloud offers Nvidia GPU instances (A100, H100, and Blackwell-based) for customers who prefer or require Nvidia hardware. Nvidia DGX Cloud is also available on Google Cloud.

CoreWeave

CoreWeave, a GPU-focused cloud provider that went public on the Nasdaq on March 28, 2025 (raising $1.5 billion), has emerged as Nvidia's most strategically intertwined cloud customer. In January 2026, Nvidia invested an additional $2 billion in CoreWeave at $87.20 per share to support CoreWeave's planned buildout of more than 5 gigawatts of AI factory capacity by 2030. As of early 2026, Nvidia held approximately a 13% stake in CoreWeave, up from 7% at IPO.^[37] An expanded master services agreement signed in fall 2025 commits Nvidia to purchase up to $6.3 billion in unsold CoreWeave capacity through 2032.

Hyperscaler Concentration

These partnerships reflect a mutually dependent relationship. Cloud providers need Nvidia GPUs to attract AI workloads, while Nvidia benefits from the massive purchasing power of hyperscale data centers. Hyperscalers accounted for just over 50% of Nvidia's data center revenue in FY2026. The combined fiscal 2026 AI-related capital expenditure of Microsoft, Amazon, Meta, and Alphabet has been publicly disclosed at roughly $700 billion across all four companies.

OpenAI and Stargate Partnership

In September 2025, Nvidia and OpenAI announced a strategic partnership under which Nvidia intended to invest up to $100 billion in OpenAI progressively, in stages tied to the deployment of at least 10 gigawatts of NVIDIA AI systems for OpenAI training and inference workloads, with the first gigawatt scheduled for the second half of 2026 on the Vera Rubin platform.^[38] The arrangement never became a definitive agreement: in December 2025, Nvidia CFO Colette Kress disclosed that the deal remained at the letter-of-intent stage.^[39] It was superseded in early 2026, when Nvidia instead took a $30 billion direct equity stake in the record OpenAI funding round announced on February 27, 2026, which also included a $50 billion commitment from Amazon and $30 billion from SoftBank.^[52] In March 2026, Jensen Huang played down the chances of the remaining investment contemplated in the original letter of intent, saying the $30 billion stake "might be the last."^[52] The round closed on March 31, 2026 with $122 billion raised at an $852 billion post-money valuation.^[53]

The partnership sits alongside the Stargate Project, a joint venture announced on January 21, 2025 by OpenAI, SoftBank, Oracle, and Abu Dhabi-based MGX. Stargate plans to deploy $500 billion in AI infrastructure over four years in the United States, with $100 billion committed at launch. Stargate's flagship Abilene, Texas site opened in September 2025 running Oracle Cloud Infrastructure on racks of Nvidia GPUs; additional sites have been announced in New Mexico, Ohio, and elsewhere.^[40]

Anthropic Partnership

In November 2025, Nvidia and Microsoft jointly announced a $15 billion combined investment in Anthropic, with Nvidia contributing $10 billion and Microsoft contributing $5 billion at a valuation of approximately $350 billion. As part of the deal, Anthropic committed to purchase $30 billion of Microsoft Azure compute capacity, and to deploy up to 1 gigawatt of NVIDIA Grace Blackwell and Vera Rubin systems. Anthropic and Nvidia will also co-engineer future architectures, optimizing Claude models for Nvidia silicon and tailoring future Nvidia hardware to Anthropic workloads.^[41]

The deal made Claude the only frontier model commercially available on all three major hyperscale clouds (AWS, Azure, Google Cloud) and is widely viewed as Nvidia's hedge against single-customer concentration on OpenAI.

US Export Controls and China

The US government has imposed a series of export restrictions on advanced AI chips to China, significantly affecting Nvidia's business in one of its largest markets.

Timeline of Restrictions

October 2022: The Bureau of Industry and Security (BIS) introduced the first round of semiconductor export controls targeting China. The rules restricted the export of chips above certain performance thresholds, effectively banning sales of the A100 and H100 to Chinese customers.

2023: Nvidia designed and sold the A800 and H800, modified versions of the A100 and H100 with reduced interconnect bandwidth to comply with the initial export controls. In October 2023, BIS broadened the rules to close this loophole, and Nvidia was notified to immediately halt exports of the H800.

2024: Nvidia introduced the H20, a further downgraded chip designed to fall below the revised performance thresholds. Nvidia sold approximately one million H20 chips to Chinese customers in 2024.

January 2025: The outgoing Biden administration issued the "AI Diffusion Rule" (Framework for AI Diffusion), establishing global performance thresholds that blocked sales of flagship GPUs like the H100 and H200 to China while creating a tiered system of export permissions for different countries.

April 2025: The Trump administration tightened controls further, effectively halting all H20 exports to China. Nvidia initially warned of charges of up to approximately $5.5 billion; the actual charge recorded in Q1 FY2026 was $4.5 billion for H20 excess inventory and purchase obligations, alongside $2.5 billion of H20 revenue the company was unable to ship.^[42]^[59]

July 2025: After intensive lobbying by Jensen Huang, including meetings in both Washington and Beijing, the administration reversed course and allowed Nvidia to resume H20 shipments to China and AMD to restart MI308 sales.^[43]

August 2025: Nvidia and AMD agreed to pay the US government 15% of revenue from AI chip sales into China as a condition of the export licenses.^[60] Meanwhile, Chinese regulators and several large Chinese internet companies signaled reluctance to absorb H20 inventory, citing both performance concerns relative to domestic alternatives and political pressure to favor local suppliers.

December 2025: Further adjustments allowed export of the H200 to approved Chinese customers under specific licensing conditions.

For Q1 FY2027, Nvidia's guidance explicitly assumed zero Data Center compute revenue from China, reflecting ongoing uncertainty around the policy regime.^[4] The export controls have prompted Chinese companies to accelerate development of domestic alternatives, including Huawei's Ascend 910C, Cambricon's Siyuan 590, and MetaX's C500.

Nvidia's AI Research

Beyond building hardware and software platforms, Nvidia conducts significant AI research through Nvidia Research and applies AI to several application domains.

Nemotron Language Models

Nvidia has developed the Nemotron family of large language models for enterprise and agentic AI applications. The Nemotron models are released under permissive open licenses and are designed to be customized for specific business use cases through NeMo. The family includes Nemotron 3, Nemotron-CC (a CommonCrawl-based training corpus), the Jet-Nemotron efficient variant, Nemotron Speech (for speech recognition), and Nemotron RAG (for retrieval-augmented generation with embedding and reranking models).

The Artificial Analysis Open Index has rated the Nemotron family among the most open model releases in the AI ecosystem based on license permissibility, data transparency, and technical documentation availability.

Cosmos World Foundation Models

At CES 2025 on January 6, 2025, Nvidia introduced Cosmos, a platform of generative world foundation models (WFMs) for physical AI, including autonomous vehicles and robotics. Cosmos models generate physics-aware video from text, image, and sensor inputs. The first set of Cosmos Predict models was released under an open model license on Hugging Face and the NVIDIA NGC catalog, with early adopters including Agility Robotics, Figure AI, Foretellix, Skild AI, and Uber.^[44] Nvidia stated that the Cosmos data pipeline can process 20 million hours of video in roughly 14 days on Blackwell hardware (versus more than three years on CPU-only systems).

The Cosmos lineup expanded at GTC in March 2025 with the Cosmos Transfer and Cosmos Reason world foundation models, and a further major release in November 2025 updated these model families for synthetic data generation and reasoning-on-pixels workflows.^[61]

GR00T Robotics Platform

Isaac GR00T is Nvidia's platform for AI-powered humanoid robots. GR00T N1, announced at GTC on March 18, 2025, was described by Nvidia as "the world's first open, fully customizable foundation model for generalized humanoid reasoning and skills." It uses a dual-system vision-language-action (VLA) architecture: a language-vision module interprets the environment and instructions while a diffusion-transformer module generates motor actions. Early access partners included Agility Robotics, Boston Dynamics, Mentee Robotics, and NEURA Robotics.^[45]

Subsequent releases through 2025-2026 culminated in GR00T N1.6, a reasoning VLA that enables full-body control for humanoids. Nvidia has also released large open datasets, including hundreds of thousands of robotics trajectories generated in Isaac Sim and Omniverse.

DRIVE Autonomous Vehicles

Nvidia DRIVE is the company's platform for autonomous vehicle development. It includes the DRIVE Orin and DRIVE Thor system-on-chip processors for in-vehicle AI computing, as well as software tools for perception, mapping, and planning. The DRIVE AGX Thor superchip provides up to 2,000 FP8 TFLOPS for production vehicles.

At CES 2026, Nvidia introduced DRIVE Alpamayo-R1 (AR1), an open reasoning VLA model for autonomous-driving research that combines chain-of-thought reasoning with path planning. Mercedes-Benz announced that its new CLA model would be the first production passenger car to ship with the complete NVIDIA DRIVE AV software stack and Alpamayo capabilities; the system was demonstrated driving for 90% of test rides through San Francisco in early 2026. Hyundai Motor Group announced a separate AI factory partnership in November 2025 covering vehicle platforms, smart factories, and robotics. Other DRIVE customers include Toyota, BYD, Lucid, and several Chinese EV manufacturers.

Omniverse and Digital Twins

Nvidia Omniverse is a platform for building and simulating 3D virtual worlds, or "digital twins." It is used in manufacturing, architecture, robotics, and autonomous vehicle simulation. Omniverse is built on the Universal Scene Description (OpenUSD) framework and allows real-time collaboration and physics-accurate simulation. Industrial digital-twin deployments in 2025-2026 included partnerships with Foxconn (for manufacturing-floor optimization) and BMW.

Key Acquisitions

Nvidia has made several acquisitions that strengthened its position in AI and data center computing.

Year	Company	Price	Significance
2019	Cumulus Networks	~$100M	Data center networking software
2020	Mellanox Technologies	$6.9B	High-speed data center networking (InfiniBand)
2020	Arm Ltd. (attempted)	$40B	Failed acquisition of Arm; abandoned in February 2022 after regulatory pushback
2024	Run:ai (closed Dec 30, 2024)	~$700M	GPU orchestration and Kubernetes scheduler for AI workloads
2025	Groq (asset purchase and technology license)	~$20B	Largest deal in Nvidia history; LPU inference technology and key engineering team

The Mellanox acquisition was particularly strategic, as it gave Nvidia control over the InfiniBand networking technology that is used to connect GPUs in large-scale AI training clusters. By owning both the GPU and the network fabric, Nvidia can optimize the entire data path for distributed training workloads.

The attempted acquisition of Arm for $40 billion would have given Nvidia ownership of the CPU architecture used in most mobile devices and an increasing number of data center servers. However, the deal was blocked by regulators in multiple jurisdictions due to competition concerns and was officially terminated in February 2022.

The Run:ai acquisition closed on December 30, 2024 after unconditional approval from the European Commission. Nvidia subsequently committed to open-sourcing the Run:ai platform, which orchestrates GPU resources across Kubernetes clusters and can support non-Nvidia accelerators over time.^[46]

Nvidia's largest deal on record came on December 24, 2025, when it agreed to pay approximately $20 billion for the assets of AI inference chip startup Groq.^[54] The transaction was structured as an asset purchase combined with a non-exclusive licensing agreement covering Groq's language processing unit (LPU) technology rather than a conventional merger, and it brought Groq founder Jonathan Ross, one of the original engineers behind Google's TPU, and Groq president Sunny Madra to Nvidia along with key members of the engineering team.^[54]^[55] The deal produced its first product at GTC 2026, where Nvidia unveiled the Groq 3 LPU, an SRAM-based inference accelerator that attaches to the Vera Rubin platform as a dedicated co-processor for the decode phase of large-model inference; the chip is manufactured by Samsung on a 4nm process and is slated to ship in the third quarter of 2026.^[56]

Corporate Leadership

Who is the CEO of Nvidia?

Jensen Huang is the co-founder, president, and CEO of Nvidia, and has led the company since its founding in 1993, making him one of the longest-serving CEOs in the technology industry. He holds a bachelor's degree in electrical engineering from Oregon State University and a master's degree from Stanford University.

Huang's leadership style emphasizes long-term technical bets. The decision to invest in CUDA in 2006, years before deep learning became mainstream, is frequently cited as one of the most prescient strategic decisions in technology history. Under Huang's leadership, Nvidia pivoted from a gaming-focused GPU company to the dominant platform company for artificial intelligence.

Huang's personal net worth, derived almost entirely from his approximately 3.5% stake in Nvidia, was reported at $191.5 billion on Forbes' real-time billionaires list in mid-May 2026, making him among the world's ten wealthiest individuals.^[47]

Chief Financial Officer Colette Kress has served in that role since 2013. SVP of GPU engineering Jonah Alben is among the most senior technical executives, while Jay Puri, EVP of Worldwide Field Operations, oversees go-to-market. Manuvir Das, who led Nvidia's enterprise computing business and launched the NVIDIA AI Enterprise software suite, left the company in April 2025 to become an operating partner at infrastructure investor Stonepeak; Western Digital appointed him to its board of directors in May 2026.^[57]

References

"Nvidia Corporation." Wikipedia. https://en.wikipedia.org/wiki/nvidia
"Jensen Huang." NVIDIA Newsroom. https://nvidianews.nvidia.com/bios/jensen-huang
"Our History: Innovations Over the Years." NVIDIA. https://www.nvidia.com/en-us/about-nvidia/corporate-timeline/
"NVIDIA Announces Financial Results for Fourth Quarter and Fiscal 2026." NVIDIA Newsroom, February 2026. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-fourth-quarter-and-fiscal-2026 ↩
"NVIDIA Announces Financial Results for Fourth Quarter and Fiscal 2025." NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-fourth-quarter-and-fiscal-2025 ↩
"Evolution of NVIDIA Data Center GPUs: From Pascal to Grace Blackwell." ServerSimply. https://www.serversimply.com/blog/evolution-of-nvidia-data-center-gpus
"NVIDIA Data Center GPU Specs: A Complete Comparison Guide." IntuitionLabs. https://intuitionlabs.ai/articles/nvidia-data-center-gpu-specs
"Comparing Blackwell vs Hopper: A Deep Dive GPU Architecture Comparison." IntuitionLabs. https://intuitionlabs.ai/articles/blackwell-vs-hopper-gpu-architecture-comparison
"NVIDIA's CUDA Moat: How Developer Lock-In Built a Trillion-Dollar AI Empire." Medium, February 2026.
"How Nvidia dominated AI and plans to keep it that way as generative AI explodes." VentureBeat.
"Nvidia DGX." Wikipedia. https://en.wikipedia.org/wiki/Nvidia_DGX
"NVIDIA Announces DGX H100 Systems." NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-dgx-h100-systems-worlds-most-advanced-enterprise-ai-infrastructure
"DGX Platform: Built for Enterprise AI." NVIDIA. https://www.nvidia.com/en-us/data-center/dgx-platform/
"NVIDIA Rubin R100: Specs, Architecture, and GPU Cloud Availability." Spheron Blog. https://www.spheron.network/blog/nvidia-rubin-r100-guide/
"The Rubin Revolution: Nvidia Unveils Next-Generation 'Vera Rubin' AI Architecture at GTC 2026." FinancialContent, March 2026.
"NVIDIA (NVDA) Market Capitalization." CompaniesMarketCap. https://companiesmarketcap.com/nvidia/marketcap/ ↩
"Nvidia briefly touched $4 trillion market cap for first time." CNBC, July 2025. ↩
"AMD's MI350: The AI Accelerator That Could Challenge Nvidia's Dominance In 2026." Seeking Alpha.
"Nvidia sales are 'off the charts,' but Google, Amazon and others now make their own custom AI chips." CNBC, November 2025.
"U.S. Export Controls and China: Advanced Semiconductors." Congressional Research Service. https://www.congress.gov/crs-product/R48642
"Trump Lifted the AI Chip Ban on China, Clearing Nvidia and AMD to Resume Sales." Built In.
"NVIDIA Advances Open Model Development for Digital and Physical AI." NVIDIA Blog. https://blogs.nvidia.com/blog/neurips-open-source-digital-physical-ai/
"NVIDIA Rubin Platform, Open Models, Autonomous Driving: CES 2026." NVIDIA Blog. https://blogs.nvidia.com/blog/2026-ces-special-presentation/
"AWS and NVIDIA deepen strategic collaboration to accelerate AI." AWS Blog. https://aws.amazon.com/blogs/machine-learning/aws-and-nvidia-deepen-strategic-collaboration-to-accelerate-ai-from-pilot-to-production/
"NVIDIA GPU Market Share 2024-2026." Silicon Analysts. https://siliconanalysts.com/analysis/nvidia-ai-accelerator-market-share-2024-2026
"NVIDIA RTX 5090 vs. RTX 4090: Comparison, benchmarks for AI, LLM Workloads." BIZON. https://bizon-tech.com/blog/nvidia-rtx-5090-comparison-gpu-benchmarks-for-ai
"Nvidia becomes first public company worth $5 trillion." TechCrunch, October 29, 2025. https://techcrunch.com/2025/10/29/nvidia-becomes-first-public-company-worth-5-trillion/ ↩
"NVDA Becomes First Company To Hit $5.5 Trillion Market Cap." StockTwits / Yahoo Finance, May 2026. ↩
"NVIDIA Announces Financial Results for Third Quarter Fiscal 2026." NVIDIA Newsroom, November 2025. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-third-quarter-fiscal-2026 ↩
"NVIDIA Kicks Off the Next Generation of AI With Rubin: Six New Chips, One Incredible AI Supercomputer." NVIDIA Investor Relations, March 2026. https://investor.nvidia.com/news/press-release-details/2026/NVIDIA-Kicks-Off-the-Next-Generation-of-AI-With-Rubin--Six-New-Chips-One-Incredible-AI-Supercomputer/default.aspx ↩
"NVIDIA Vera Rubin Opens Agentic AI Frontier." NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-vera-rubin-platform ↩
"NVIDIA Unveils Rubin CPX: A New Class of GPU Designed for Massive-Context Inference." NVIDIA Newsroom, September 2025. https://nvidianews.nvidia.com/news/nvidia-unveils-rubin-cpx-a-new-class-of-gpu-designed-for-massive-context-inference ↩
"NVIDIA Announces DGX Spark and DGX Station Personal AI Computers." NVIDIA Newsroom, March 2025. https://nvidianews.nvidia.com/news/nvidia-announces-dgx-spark-and-dgx-station-personal-ai-computers ↩
"NVIDIA Announces Spectrum-X Photonics, Co-Packaged Optics Networking Switches to Scale AI Factories to Millions of GPUs." NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-spectrum-x-co-packaged-optics-networking-switches-ai-factories ↩
"NVIDIA Introduces Spectrum-XGS Ethernet to Connect Distributed Data Centers Into Giga-Scale AI Super-Factories." NVIDIA Investor Relations, 2025. https://investor.nvidia.com/news/press-release-details/2025/NVIDIA-Introduces-Spectrum-XGS-Ethernet-to-Connect-Distributed-Data-Centers-Into-Giga-Scale-AI-Super-Factories/default.aspx ↩
"NVIDIA 1st Quarter FY27 Financial Results." NVIDIA Investor Relations, May 2026. https://investor.nvidia.com/events-and-presentations/events-and-presentations/event-details/2026/NVIDIA-1st-Quarter-FY27-Financial-Results/default.aspx ↩
"NVIDIA and CoreWeave Strengthen Collaboration to Accelerate Buildout of AI Factories." NVIDIA Newsroom, January 2026. https://nvidianews.nvidia.com/news/nvidia-and-coreweave-strengthen-collaboration-to-accelerate-buildout-of-ai-factories ↩
"OpenAI and NVIDIA Announce Strategic Partnership to Deploy 10 Gigawatts of NVIDIA Systems." NVIDIA Newsroom, September 2025. https://nvidianews.nvidia.com/news/openai-and-nvidia-announce-strategic-partnership-to-deploy-10gw-of-nvidia-systems ↩
"Nvidia CFO admits the $100 billion OpenAI megadeal 'still' isn't 'definitive'." Fortune, December 2025. https://fortune.com/2025/12/02/nvidia-openai-deal-not-signed-yet-100-billion-rally-colette-kress/ ↩
"Announcing The Stargate Project." OpenAI, January 21, 2025. https://openai.com/index/announcing-the-stargate-project/ ↩
"Microsoft, NVIDIA and Anthropic Announce Strategic Partnerships." NVIDIA Blog, November 2025. https://blogs.nvidia.com/blog/microsoft-nvidia-anthropic-announce-partnership/ ↩
"Nvidia discloses that U.S. will limit sales of advanced chips to China after all." NPR, April 16, 2025. https://www.npr.org/2025/04/16/nx-s1-5366665/nvidia-china-h20-chips-exports ↩
"Nvidia says it will resume H20 AI chip sales to China 'soon,' following U.S. government assurances." CNBC, July 15, 2025. https://www.cnbc.com/2025/07/15/nvidia-says-us-government-will-allow-it-to-resume-h20-ai-chip-sales-to-china.html ↩
"NVIDIA Launches Cosmos World Foundation Model Platform to Accelerate Physical AI Development." NVIDIA Newsroom, January 6, 2025. https://nvidianews.nvidia.com/news/nvidia-launches-cosmos-world-foundation-model-platform-to-accelerate-physical-ai-development ↩
"NVIDIA Announces Isaac GR00T N1: the World's First Open Humanoid Robot Foundation Model." NVIDIA Newsroom, March 18, 2025. https://nvidianews.nvidia.com/news/nvidia-isaac-gr00t-n1-open-humanoid-robot-foundation-model-simulation-frameworks ↩
"Nvidia completes $700 million Run:ai acquisition after regulatory hurdles." Reuters / Investing.com, December 30, 2024. ↩
"Jensen Huang." Forbes Real-Time Billionaires. https://www.forbes.com/profile/jensen-huang/ ↩
"Transcript: NVIDIA CEO Jensen Huang's Keynote at GTC 2025." NVIDIA / Rev, March 2025. https://www.rev.com/transcripts/gtc-keynote-with-nvidia-ceo-jensen-huang ↩
"Nvidia (NVDA) Q3 2026 Earnings Call Transcript." The Motley Fool, November 19, 2025. https://www.fool.com/earnings/call-transcripts/2025/11/19/nvidia-nvda-q3-2026-earnings-call-transcript/ ↩
"Jensen Huang Maps the AI Factory Era at NVIDIA GTC 2026." Data Center Frontier, March 2026. https://www.datacenterfrontier.com/machine-learning/news/55364406/jensen-huang-maps-the-ai-factory-era-at-nvidia-gtc-2026 ↩
"NVIDIA Announces Financial Results for First Quarter Fiscal 2027." NVIDIA Investor Relations, May 20, 2026. https://investor.nvidia.com/news/press-release-details/2026/NVIDIA-Announces-Financial-Results-for-First-Quarter-Fiscal-2027/default.aspx ↩
"Nvidia CEO Huang says $30 billion OpenAI investment 'might be the last'." CNBC, March 4, 2026. https://www.cnbc.com/2026/03/04/nvidia-huang-openai-investment.html ↩
"OpenAI closes funding round at an $852 billion valuation." CNBC, March 31, 2026. https://www.cnbc.com/2026/03/31/openai-funding-round-ipo.html ↩
"Nvidia buying AI chip startup Groq's assets for about $20 billion in its largest deal on record." CNBC, December 24, 2025. https://www.cnbc.com/2025/12/24/nvidia-buying-ai-chip-startup-groq-for-about-20-billion-biggest-deal.html ↩
"Nvidia's $20 billion Groq IP deal bolsters AI market domination." Tom's Hardware, December 2025. https://www.tomshardware.com/tech-industry/semiconductors/nvidia-confirms-20-billion-groq-deal-to-bolster-ai-inference-dominance ↩
"How Nvidia's $20 billion Groq 3 LPU deal reshapes the Nvidia Vera Rubin Platform." Tom's Hardware, March 2026. https://www.tomshardware.com/tech-industry/semiconductors/nvidias-20-billion-groq-deal-produces-its-first-chip ↩
"WD Appoints Manuvir Das to Board of Directors." Western Digital, May 28, 2026. https://www.westerndigital.com/company/newsroom/press-releases/2026/2026-05-28-wd-appoints-manuvir-das-to-board-of-directors ↩
"Nvidia Hits Record $5.5 Trillion Value, First Company To Ever Reach Mark." Forbes, May 13, 2026. https://www.forbes.com/sites/antoniopequenoiv/2026/05/13/nvidia-hits-record-55-trillion-value-first-company-to-ever-reach-mark/ ↩
"NVIDIA Announces Financial Results for First Quarter Fiscal 2026." NVIDIA Investor Relations, May 28, 2025. https://investor.nvidia.com/news/press-release-details/2025/NVIDIA-Announces-Financial-Results-for-First-Quarter-Fiscal-2026/default.aspx ↩
"Nvidia and AMD will give US 15% of China sales. But Chinese state media warns about their chips." CNN, August 11, 2025. https://www.cnn.com/2025/08/11/china/us-china-trade-nvidia-chips-intl-hnk ↩
"NVIDIA Announces Major Release of Cosmos World Foundation Models and Physical AI Data Tools." NVIDIA Newsroom, March 18, 2025. https://nvidianews.nvidia.com/news/nvidia-announces-major-release-of-cosmos-world-foundation-models-and-physical-ai-data-tools ↩
"DGX B200: The Foundation for Your AI Factory." NVIDIA. https://www.nvidia.com/en-us/data-center/dgx-b200/ ↩
"Nvidia GTC 2026: CEO Jensen Huang sees $1 trillion in orders for Blackwell and Vera Rubin through '27." CNBC, March 16, 2026. https://www.cnbc.com/2026/03/16/nvidia-gtc-2026-ceo-jensen-huang-keynote-blackwell-vera-rubin.html ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

10 revisions by 1 contributors · full history

Suggest edit