NVIDIA DGX Cloud

Updated Jun 28, 2026

NVIDIA DGX Cloud is a managed AI-supercomputing-as-a-service offering from NVIDIA that gives enterprises rented access, over the internet, to multi-node clusters of NVIDIA DGX infrastructure and the NVIDIA AI software stack, on a monthly per-instance subscription rather than through an outright hardware purchase. Announced at the company's GTC conference on March 21, 2023, it was pitched as a way for "every enterprise" to reach an AI supercomputer "from a browser," removing the procurement, deployment and operational burden of building on-premises clusters. ^[1] DGX Cloud is hosted inside the data centers of third-party cloud and colocation partners (Oracle Cloud first, then Microsoft Azure, Google Cloud, and later AWS) as well as on NVIDIA's own infrastructure, but the offering, pricing and customer relationship are owned by NVIDIA. The service marked NVIDIA's most direct move into selling accelerated computing as a service, layering a software-and-services business on top of the GPU hardware it already sold to those same cloud providers. ^[2]

Over its first three years the offering broadened considerably, then shifted again. It expanded from training-focused clusters into a portfolio that included DGX Cloud Serverless Inference, for deploying models, and DGX Cloud Lepton, a global GPU marketplace launched in 2025 that grew out of NVIDIA's acquisition of the startup Lepton AI. By late 2025 NVIDIA had restructured the original first-party rental business, folding it into its core engineering organization and refocusing most of its capacity on internal research and development rather than competing head-on with the hyperscalers. ^[11]

What is NVIDIA DGX Cloud?

NVIDIA DGX Cloud is NVIDIA's branded, first-party AI-supercomputing service: dedicated clusters of NVIDIA DGX systems, paired with NVIDIA AI software, that an enterprise rents by the month and reaches over the network. Each launch instance was configured as an eight-GPU node bundled with NVIDIA's software for managing and developing AI workloads, and the service was positioned for the training of large, multi-node models, including the large language models and other generative AI systems that were driving demand for NVIDIA's chips at the time. ^[1] Crucially, NVIDIA owns the customer relationship and the pricing even when the underlying hardware sits inside a partner's cloud, which is what distinguishes DGX Cloud from simply renting GPUs from a hyperscaler. ^[2]

The GTC 2023 announcement

NVIDIA unveiled DGX Cloud during chief executive Jensen Huang's keynote at GTC on March 21, 2023, framing the moment in characteristically expansive terms. "We are at the iPhone moment of AI," Huang said. "Startups are racing to build disruptive products and business models, and incumbents are looking to respond. DGX Cloud gives customers instant access to NVIDIA AI supercomputing in global-scale clouds." ^[1]

The core proposition was that an enterprise could rent a dedicated cluster of NVIDIA DGX systems on a monthly basis and reach it through a web browser, sidestepping the long lead times, capital expenditure and engineering effort required to acquire and run scarce accelerated computing hardware in-house. ^[1]

How does DGX Cloud work? The rent-the-infrastructure model

DGX Cloud inverted NVIDIA's traditional relationship with the cloud. Rather than only selling GPUs to hyperscalers who then resold capacity, NVIDIA placed its own DGX hardware inside partner data centers, wrapped it in NVIDIA software, and sold the bundle directly to enterprises as a first-party service. The pricing model and the customer billing relationship belonged to NVIDIA even when the technology was accessed through a partner's cloud marketplace. ^[2]

Analysts read the strategy as a deliberate shift toward software and services revenue. NVIDIA was, as one assessment put it, "not rich enough, or dumb enough" to build a cloud to rival Amazon Web Services, Microsoft Azure or Google Cloud, so it used those clouds as a distribution channel while capturing recurring, higher-margin software income on top of hardware sales. The arrangement also let NVIDIA sell the same GPUs twice in economic terms: once to the cloud provider building the data center, and again, through DGX Cloud subscriptions, to the enterprise consuming the capacity. ^[2]

Which clouds host DGX Cloud?

At launch NVIDIA named a specific set of hosting partners rather than the full roster of major hyperscalers. Oracle Cloud Infrastructure was first to make DGX Cloud generally available, paired with a purpose-built remote direct memory access (RDMA) network, bare-metal compute and high-performance storage scaling to tens of thousands of GPUs. The colocation provider Equinix also offered the service from its data centers. NVIDIA said Microsoft Azure and Google Cloud would follow. ^[1]^[3]

Notably, Amazon Web Services was not among the original DGX Cloud hosts. AWS and NVIDIA announced a DGX Cloud collaboration later, on November 28, 2023, under which AWS would host the service. That deployment was described as the first DGX Cloud built on the GH200 NVL32 multi-node platform based on the NVIDIA GH200 Grace Hopper Superchip, with a single Amazon EC2 instance able to provide up to 20 terabytes of shared memory for terabyte-scale workloads. The same collaboration produced Project Ceiba, a supercomputer for NVIDIA's own research featuring 16,384 GH200 Superchips and rated at 65 exaflops of AI performance. ^[4] DGX Cloud subsequently became available to purchase through AWS Marketplace, a step covered at AWS re:Invent in 2024. ^[5]

Partner	Role	Status / timing
Oracle Cloud Infrastructure	Hosting cloud	First generally available, March 2023
Equinix	Colocation host	Available at launch, March 2023
Microsoft Azure	Hosting cloud	Announced as coming, 2023
Google Cloud	Hosting cloud	Announced as coming, 2023
Amazon Web Services	Hosting cloud	DGX Cloud collaboration announced November 2023; on AWS Marketplace by 2024
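
The Project Ceiba figures quoted above can be sanity-checked with simple division. The sketch below uses only the two numbers given in the text (65 exaflops of AI performance, 16,384 GH200 Superchips); the function name is illustrative, and the result is only the average implied by those headline figures, not a measured or officially stated per-chip rating.

```python
# Figures quoted in the article for Project Ceiba.
CEIBA_AI_EXAFLOPS = 65
CEIBA_SUPERCHIPS = 16_384

PETAFLOPS_PER_EXAFLOP = 1_000


def implied_petaflops_per_superchip(exaflops: float, superchips: int) -> float:
    """Average AI throughput per superchip implied by the headline totals."""
    if superchips <= 0:
        raise ValueError("superchips must be positive")
    return exaflops * PETAFLOPS_PER_EXAFLOP / superchips


if __name__ == "__main__":
    per_chip = implied_petaflops_per_superchip(CEIBA_AI_EXAFLOPS, CEIBA_SUPERCHIPS)
    print(f"Implied average: {per_chip:.2f} petaflops per superchip")
```

The quotient comes to roughly 3.97 petaflops per superchip, which indicates that the 65-exaflop figure is an aggregate of per-chip peak ratings rather than a separately benchmarked system result.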

What software does DGX Cloud include?

A defining feature of DGX Cloud was that the rented hardware shipped with NVIDIA's full enterprise AI software, so customers did not have to assemble their own stack. The bundle centered on the NVIDIA Base Command Platform, which handled cluster management, job scheduling and orchestration of large training runs, and NVIDIA AI Enterprise, the supported suite of more than a hundred frameworks, libraries and pretrained models, including the RAPIDS data-science libraries. ^[1]^[3]

Alongside the launch NVIDIA introduced cloud services that ran on this infrastructure, including NeMo for building and customizing large language models, Picasso for image, video and 3D generation, and BioNeMo for drug discovery and the language of proteins. These services let DGX Cloud customers fine-tune and deploy generative AI models without leaving NVIDIA's environment. ^[1]

Early adopters cited by NVIDIA spanned several industries: the biotechnology firm Amgen for drug discovery, CCC Intelligent Solutions for AI in insurance claims, and ServiceNow for AI research and enterprise software. ^[1]

How much does DGX Cloud cost?

NVIDIA set the entry price for DGX Cloud at 36,999 US dollars per instance per month. Each instance comprised eight NVIDIA H100 or A100 80GB Tensor Core GPUs, for a total of 640GB of GPU memory per node, plus the bundled Base Command Platform and AI Enterprise software. ^[1]^[3] Independent coverage confirmed the figure and noted the unusual commercial structure: customers paid NVIDIA directly even when they reached the service through a partner cloud's marketplace, which shielded NVIDIA from the margin compression it would have faced as a mere hardware supplier and preserved its pricing power. ^[2] The premium price later became a competitive pressure point, as hyperscalers cut their own H100 and A100 rental rates well below NVIDIA's first-party list price. ^[11]
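
To put the list price in terms comparable with hourly GPU rental rates, the arithmetic can be sketched as follows. The price, GPU count and per-GPU memory come from the figures above; the 730-hour month is an assumption introduced here for illustration, and the function names are hypothetical. The result is a rough equivalent only, since the subscription also bundled software and was not sold by the hour.

```python
# Figures quoted in the article (launch list price and instance shape).
PRICE_PER_INSTANCE_MONTH_USD = 36_999
GPUS_PER_INSTANCE = 8
MEMORY_PER_GPU_GB = 80

# Assumption, not from the article: an average month of 730 hours (8,760 / 12).
HOURS_PER_MONTH = 730


def instance_memory_gb(gpus: int = GPUS_PER_INSTANCE,
                       per_gpu_gb: int = MEMORY_PER_GPU_GB) -> int:
    """Total GPU memory across one instance."""
    return gpus * per_gpu_gb


def price_per_gpu_month(price: float = PRICE_PER_INSTANCE_MONTH_USD,
                        gpus: int = GPUS_PER_INSTANCE) -> float:
    """Monthly list price divided evenly across the GPUs in an instance."""
    return price / gpus


def price_per_gpu_hour(price: float = PRICE_PER_INSTANCE_MONTH_USD,
                       gpus: int = GPUS_PER_INSTANCE,
                       hours: float = HOURS_PER_MONTH) -> float:
    """Rough hourly equivalent per GPU under the assumed month length."""
    return price_per_gpu_month(price, gpus) / hours


if __name__ == "__main__":
    print(f"GPU memory per instance: {instance_memory_gb()} GB")
    print(f"Per GPU per month: ${price_per_gpu_month():,.2f}")
    print(f"Per GPU per hour (730 h/month): ${price_per_gpu_hour():.2f}")
```

Under these assumptions the list price works out to about 4,625 US dollars per GPU per month, or roughly 6.34 US dollars per GPU-hour.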

Why does DGX Cloud matter? Strategic significance

DGX Cloud crystallized a broader repositioning of NVIDIA from a chip vendor into a full-stack AI platform company that increasingly monetized software and services. By the time of the launch, analysts observed that NVIDIA had more employees working on software than on hardware, and DGX Cloud represented a "cloud-first" turn that let the company sell recurring access to its technology rather than only one-time hardware. ^[2]

The model was also strategically delicate. NVIDIA's hosting partners were simultaneously its largest customers for GPUs and, through their own AI services, its competitors for enterprise AI spending. By owning the DGX Cloud relationship and price while running on partner infrastructure, NVIDIA inserted itself into the value chain without building a hyperscale cloud of its own, capturing software revenue that was independent of who owned the underlying data center. ^[2] The approach foreshadowed NVIDIA's later expansion into adjacent services and acquisitions, including the cluster-management company Run:ai, as it built out the software layer around its accelerated computing.

What is DGX Cloud Lepton? Serverless Inference and the marketplace pivot

As the AI market shifted from a near-exclusive focus on training toward large-scale inference and agentic applications, DGX Cloud grew into a family of offerings.

At GTC in March 2025 NVIDIA introduced DGX Cloud Serverless Inference, built on NVIDIA Cloud Functions. Positioned as a horizontal aggregator, it abstracts the underlying infrastructure across AWS, Azure, Google Cloud, private clouds and on-premises data centers, and provides auto-scaling, global load balancing and multi-cloud deployment for production AI workloads. Developers can deploy models packaged as NVIDIA NIM microservices, custom containers or Helm charts, and the service was opened to independent software vendors and NVIDIA Cloud Partners. ^[6]
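
As a rough illustration of what "auto-scaling" means for a serverless inference service, the sketch below shows a generic target-tracking rule: run enough replicas that each handles at most a target number of in-flight requests, clamped to configured bounds. This is not NVIDIA's algorithm or the NVIDIA Cloud Functions API; the function name, parameters and default limits are all hypothetical.

```python
import math


def desired_replicas(inflight_requests: int,
                     target_per_replica: int,
                     min_replicas: int = 0,
                     max_replicas: int = 10) -> int:
    """Generic target-tracking rule (hypothetical, for illustration only).

    Returns the smallest replica count that keeps the per-replica load at or
    below the target, clamped to the range [min_replicas, max_replicas].
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if inflight_requests < 0:
        raise ValueError("inflight_requests must be non-negative")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    needed = math.ceil(inflight_requests / target_per_replica)
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    for load in (0, 3, 9, 1000):
        print(load, "in-flight requests ->", desired_replicas(load, 4), "replicas")
```

With a minimum of zero replicas, an idle deployment scales down to nothing, which is the property usually implied by the word "serverless"; a nonzero minimum trades idle cost for lower cold-start latency.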

The most significant addition was DGX Cloud Lepton, announced on May 18, 2025, at Computex. According to NVIDIA, "NVIDIA DGX Cloud Lepton connects the world's developers building agentic and physical AI applications with tens of thousands of GPUs, available from a global network of cloud providers." ^[7] Developers can either purchase capacity from participating providers or bring their own clusters, and the platform integrates the NVIDIA software stack, including NIM and NeMo microservices, NVIDIA Blueprints and NVIDIA Cloud Functions. For cloud providers, it adds management software with real-time GPU health diagnostics and automated root-cause analysis. ^[7]

DGX Cloud Lepton grew directly out of NVIDIA's acquisition of Lepton AI, a startup that rented NVIDIA GPU servers and built cloud management software. The deal, reported at several hundred million dollars and covering a team of roughly 20 people, closed in April 2025. Lepton AI's co-founders joined NVIDIA: Yangqing Jia, the creator of the Caffe deep-learning framework and a former vice president at Alibaba, and Junjie Bai. The acquired technology and team became the foundation for the rebranded marketplace, extending NVIDIA's cloud and software ambitions against the dominance of the major hyperscalers. ^[8]^[9]

The initial set of NVIDIA Cloud Partners contributing NVIDIA Blackwell and other-architecture GPUs to the DGX Cloud Lepton marketplace included CoreWeave, Crusoe, Firmus, Foxconn, GMI Cloud, Lambda, Nebius, Nscale, SoftBank Corp. and Yotta Data Services, with AWS and Microsoft Azure named as the first large-scale cloud providers to participate. ^[7] On June 11, 2025, NVIDIA expanded DGX Cloud Lepton in Europe, adding regional providers including Mistral AI, Nebius, Nscale, Firebird, Fluidstack, Hydra Host, Scaleway and Together AI, and partnering with venture firms such as Accel, Elaia, Partech and Sofinnova Partners to offer eligible startups up to 100,000 US dollars in GPU capacity credits. ^[10]

What changed in late 2025? Restructuring toward internal R&D

By the end of 2025, NVIDIA had stepped back from running DGX Cloud as a public cloud meant to compete directly with hyperscalers. Reporting by The Information, summarized in early January 2026, said NVIDIA had folded the DGX Cloud business into its core engineering organization under Dwight Diercks, the senior vice president who oversees software engineering, and was no longer offering the first-party rental platform to new customers. ^[11] Most of the capacity was redirected toward internal uses, including AI model development, software validation, and pre-silicon and post-silicon testing of new GPU platforms. ^[11]

The shift made strategic sense for the reasons that had always made DGX Cloud delicate: NVIDIA's biggest GPU customers were the same hyperscalers it would have had to undercut as a landlord of data center racks. Rather than be that landlord, NVIDIA leaned on DGX Cloud Lepton to route external workloads across a marketplace of cloud partners, while keeping the DGX Cloud brand alive internally as an engineering platform. ^[11]

Offering	Introduced	Purpose
DGX Cloud (clusters)	March 2023	Rented multi-node DGX clusters for large-scale training
DGX Cloud on AWS (GH200 NVL32)	November 2023	Grace Hopper based instances with large shared memory
DGX Cloud Serverless Inference	March 2025	Multi-cloud, auto-scaling inference via Cloud Functions
DGX Cloud Lepton	May 2025	Global GPU marketplace aggregating many cloud providers
DGX Cloud (restructured)	Late 2025	Refocused on internal R&D and chip testing; closed to new external customers

References

1. NVIDIA. "NVIDIA Launches DGX Cloud, Giving Every Enterprise Instant Access to AI Supercomputer From a Browser." NVIDIA Newsroom, March 21, 2023. https://nvidianews.nvidia.com/news/nvidia-launches-dgx-cloud-giving-every-enterprise-instant-access-to-ai-supercomputer-from-a-browser
2. Timothy Prickett Morgan. "Nvidia Bends The Clouds To Its Own Financial Will." The Next Platform, March 21, 2023. https://www.nextplatform.com/2023/03/21/nvidia-bends-the-clouds-to-its-own-financial-will/
3. "NVIDIA DGX Cloud." Oracle Cloud Marketplace listing, Oracle Corporation. https://cloudmarketplace.oracle.com/marketplace/en_US/listing/154619827
4. NVIDIA. "AWS and NVIDIA Announce Strategic Collaboration to Offer New Supercomputing Infrastructure, Software and Services for Generative AI." NVIDIA Newsroom, November 28, 2023. https://nvidianews.nvidia.com/news/aws-nvidia-strategic-collaboration-for-generative-ai
5. "Nvidia DGX Cloud now available via AWS." Data Center Dynamics, 2024. https://www.datacenterdynamics.com/en/news/nvidia-dgx-cloud-now-available-via-aws/
6. "NVIDIA Announces DGX Cloud Serverless Inference at GTC 2025." C# Corner, March 2025. https://www.c-sharpcorner.com/news/nvidia-announces-dgx-cloud-serverless-inference-at-gtc-20252
7. NVIDIA. "NVIDIA Announces DGX Cloud Lepton to Connect Developers to NVIDIA's Global Compute Ecosystem." NVIDIA Newsroom, May 18, 2025. https://nvidianews.nvidia.com/news/nvidia-announces-dgx-cloud-lepton-to-connect-developers-to-nvidias-global-compute-ecosystem
8. "NVIDIA acquires Chinese GPU cloud startup Lepton AI: report." TechNode, April 9, 2025. https://technode.com/2025/04/09/nvidia-acquires-chinese-gpu-cloud-startup-lepton-ai-report/
9. "Nvidia launches 'Lepton' AI platform connecting GPU cloud providers' resources." Data Center Dynamics, May 2025. https://www.datacenterdynamics.com/en/news/nvidia-launches-lepton-ai-platform-connecting-gpu-cloud-providers-resources/
10. NVIDIA. "NVIDIA DGX Cloud Lepton Connects Europe's Developers to Global NVIDIA Compute Ecosystem." NVIDIA Newsroom, June 11, 2025. https://nvidianews.nvidia.com/news/nvidia-dgx-cloud-lepton-connects-europes-developers-to-global-nvidia-compute-ecosystem
11. "Nvidia restructures DGX Cloud business, seemingly shifts focus to internal R&D." Data Center Dynamics, January 2, 2026 (reporting attributed to The Information). https://www.datacenterdynamics.com/en/news/nvidia-restructures-dgx-cloud-business-seemingly-shifts-focus-to-internal-rd/
