Lepton AI

Updated Jul 12, 2026

Lepton AI
Type	Subsidiary of NVIDIA (since 2025); formerly private company
Industry	Artificial intelligence, Cloud computing, AI inference
Founded	2023
Founders	Yangqing Jia (CEO), Junjie Bai
Headquarters	Cupertino / San Francisco Bay Area, California, United States
Key people	Yangqing Jia (Co-founder; Vice President of DGX Cloud at NVIDIA after acquisition), Junjie Bai (Co-founder)
Products	Lepton Cloud, Photon (Python SDK), Tuna LLM engine; (post-acquisition) NVIDIA DGX Cloud Lepton
Number of employees	Approximately 20 (at time of acquisition)
Parent	NVIDIA (April 2025 onward)
Website	lepton.ai

Lepton AI was an American AI cloud company, founded in 2023, that built a cloud-native inference platform for serving large language models, generative image models, and other AI workloads on NVIDIA GPUs.^[1] In April 2025 NVIDIA acquired the roughly 20-person startup for a reported sum of several hundred million dollars and used its team and technology to launch NVIDIA DGX Cloud Lepton, a multi-cloud GPU marketplace unveiled the following month at Computex 2025.^[7]^[8]^[9] The company was founded by Yangqing Jia, the creator of the Caffe deep learning framework and a former vice president of Alibaba Cloud, and Junjie Bai, a former technical lead at Facebook and Alibaba who was a primary contributor to the Open Neural Network Exchange (ONNX) project.^[2]^[3]

Lepton AI raised an $11 million seed round in May 2023, co-led by CRV and Fusion Fund.^[4] Over the next two years the company built a managed serving platform around its Python-centric "Photon" abstraction and an in-house inference engine called Tuna, growing to roughly twenty employees and a customer base that consisted largely of venture-capital-backed AI startups.^[1]^[5]

In late March 2025, reports emerged that NVIDIA was in advanced talks to acquire Lepton AI for a sum reported at several hundred million dollars.^[6]^[7] The deal closed in April 2025, with both co-founders and the engineering team moving to NVIDIA.^[8] On May 18, 2025, at the Computex trade show in Taipei, NVIDIA chief executive Jensen Huang announced NVIDIA DGX Cloud Lepton, a multi-cloud GPU marketplace platform built on the acquired Lepton team's technology.^[9]^[10] At the launch, Huang said the service "connects our network of global GPU cloud providers with AI developers," adding, "Together with our NCPs, we're building a planetary-scale AI factory."^[9] The launch marked one of the most visible product debuts to emerge from an NVIDIA acquisition and confirmed that the Lepton team had become a central part of NVIDIA's cloud services strategy. Yangqing Jia took on the role of Vice President of DGX Cloud at NVIDIA following the close of the acquisition.^[11]

Who founded Lepton AI?

Origins of the company

Lepton AI was incorporated in 2023 in the San Francisco Bay Area, with its initial business operations registered at addresses in Cupertino and Palo Alto, California.^[12] The company was founded shortly after Yangqing Jia departed Alibaba in March 2023, ending a four-year tenure during which he had risen to vice president of the company and president of its cloud computing platform group.^[13] Jia was joined by Junjie Bai, a former colleague from his Facebook AI Infrastructure team who had also worked on the AI platform group at Alibaba.^[2]

The pair set out to build what they described as a "cloud-native AI" platform, focused specifically on production inference rather than model training. By 2023, the boom in generative AI had created enormous demand for inexpensive, low-latency serving of transformer-based models. Existing cloud platforms offered raw virtual machines and container orchestration, but developers building AI products had to assemble their own stack of model servers, schedulers, autoscalers, observability tooling, and GPU management. Lepton AI's pitch was that this entire stack should be abstracted away behind a simple Python interface that turned any model into a service with a few lines of code.^[14]

Naming

The company name "Lepton" is borrowed from particle physics, where leptons are a family of elementary particles that includes the electron. The choice signaled the founders' goal of building lightweight, fundamental serving primitives for AI workloads. The branding extended to the company's open-source SDK, which used "Photon" as the name for its core abstraction, and to its inference engine "Tuna," which loosely follows the pattern of lightweight, fast objects.^[14]

Who are Yangqing Jia and Junjie Bai?

Yangqing Jia

Yangqing Jia (Chinese name Jia Yangqing) earned a Bachelor's degree in automation (summa cum laude) and a Master's degree in control science and engineering from Tsinghua University, then completed a Ph.D. in computer science at the University of California, Berkeley under the supervision of Trevor Darrell at the Berkeley Artificial Intelligence Research (BAIR) lab.^[15]

During his Berkeley doctorate, Jia created the open-source Caffe deep learning framework, which was released to the public in December 2013. Caffe (an acronym for Convolutional Architecture for Fast Feature Embedding) was one of the first widely adopted open-source deep learning libraries and became a standard tool for computer vision researchers in the mid-2010s.^[16] Written in C++ with a Python interface and released under a BSD license, Caffe predated TensorFlow and PyTorch and influenced their designs in significant ways.^[16]

After graduating from Berkeley in 2014, Jia joined Google as a research scientist at Google Brain, where he contributed to the early development of TensorFlow and co-authored the paper introducing the Inception architecture and the GoogLeNet model that won the ImageNet Large Scale Visual Recognition Challenge in 2014.^[15] In February 2016 he moved to Facebook (now Meta) as a research scientist, eventually becoming Director of AI Architecture. At Facebook, Jia led the development of Caffe2, served as one of the co-leads on PyTorch 1.0, and was a co-creator of ONNX.^[17]

Jia left Facebook in 2019 to join Alibaba Group as Vice President of Technology. At Alibaba, he led the company's platform-of-AI (PAI) team and eventually served as president of the Computing Platform group within Alibaba Cloud, where he oversaw a broad portfolio of data analytics, AI, and serverless products.^[13] He departed Alibaba in March 2023 and announced on social media that he would pursue a new venture in the AI infrastructure space. That venture became Lepton AI.^[13]

Junjie Bai

Junjie Bai holds a Bachelor of Science degree from the Technical University of Munich.^[18] Prior to co-founding Lepton AI he held engineering and platform leadership roles at the high-frequency trading firm KCG Holdings, then at Facebook, and later at Alibaba Group, where he served as Director of the AI Platform team.^[18]

Bai is best known for his work on ONNX, the Open Neural Network Exchange. ONNX was announced jointly by Facebook and Microsoft on September 7, 2017, as an open file format for representing neural networks that allowed models to move between frameworks such as PyTorch, Caffe2, and CNTK without rewriting code.^[19] Bai is listed as the first author on the foundational ONNX technical report and was one of the principal Facebook engineers responsible for the project's design and rollout.^[20] ONNX v1.0 was released in December 2017 with broader industry support, and the project later became a Linux Foundation AI graduated project supported by AWS, IBM, Intel, AMD, Arm, Qualcomm, and Huawei, among others.^[19]

Together, the founding pair brought a rare combination of credentials that spanned the entire modern deep learning toolchain: framework creation (Caffe), framework interoperability (ONNX), large-scale production AI infrastructure at Facebook and Alibaba, and direct involvement in flagship efforts such as PyTorch and TensorFlow. This pedigree made Lepton AI attractive to investors and customers despite the company's small size and early stage.

What was Lepton Cloud?

Platform architecture

Lepton AI's core product was a managed AI Platform as a Service (PaaS) that wrapped raw GPU infrastructure inside a simpler developer abstraction. The platform was built on top of Kubernetes and offered features that competed more directly with hosted inference services than with traditional infrastructure-as-a-service offerings.^[14]

At the center of the platform was a Python-first abstraction called Photon. A Photon was a Python class describing a model and its dependencies; with a few lines of code, a developer could convert research and modeling code into a deployable service complete with HTTP endpoints, request batching, autoscaling, and observability hooks. The SDK was published on GitHub under the leptonai/leptonai repository as open source, described there as "A Pythonic framework to simplify AI service building." Releasing the SDK openly lowered the barrier to trial use, and the repository accumulated roughly 2,800 stars.^[21]
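The "class becomes a service" idea can be illustrated with a small standard-library sketch. This is not the leptonai SDK and does not reproduce its API: the names `MiniPhoton`, `handler`, `routes`, and `dispatch` are hypothetical, and the HTTP layer is simulated with a plain function call that takes a JSON body and returns a JSON response.

```python
import json
from typing import Any, Callable, Dict


def handler(func: Callable) -> Callable:
    """Mark a method as a service endpoint (hypothetical decorator)."""
    func._is_handler = True
    return func


class MiniPhoton:
    """Toy base class that collects decorated methods into a route table."""

    def routes(self) -> Dict[str, Callable[..., Any]]:
        table: Dict[str, Callable[..., Any]] = {}
        for name in dir(self):
            if name.startswith("_"):
                continue
            attr = getattr(self, name)
            # Bound methods forward attribute lookups to the wrapped function.
            if callable(attr) and getattr(attr, "_is_handler", False):
                table["/" + name] = attr
        return table

    def dispatch(self, path: str, body: str) -> str:
        """Simulate one HTTP POST: JSON body in, JSON response out."""
        routes = self.routes()
        if path not in routes:
            return json.dumps({"error": "not found"})
        kwargs = json.loads(body)
        return json.dumps({"result": routes[path](**kwargs)})


class Echo(MiniPhoton):
    """Stand-in for a model: upper-cases its input instead of running inference."""

    @handler
    def echo(self, text: str) -> str:
        return text.upper()


service = Echo()
print(service.dispatch("/echo", '{"text": "hello"}'))  # {"result": "HELLO"}
```

A managed platform would add what this toy omits: packaging the class with its dependencies, serving it behind a real HTTP server, and scaling replicas with load.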

Photons could be deployed as long-running endpoints, as ephemeral batch jobs, or as scheduled background workers. Lepton Cloud provided prebuilt Photons for common open-source models including LLaMA, Stable Diffusion XL (SDXL), and OpenAI's Whisper speech model, so that customers could spin up a hosted endpoint for these models in seconds.^[14]

Tuna inference engine

For large language model serving specifically, Lepton AI built a proprietary inference engine called Tuna, which combined techniques such as dynamic batching, quantization, and custom CUDA kernels. Lepton publicly claimed that Tuna sustained more than 600 output tokens per second on common benchmark configurations and that a single Lepton deployment could serve in excess of 20 billion tokens and 1 million images per day at peak utilization.^[5] These numbers placed the engine in the same performance class as competing engines such as vLLM, TensorRT-LLM, and the proprietary engines used by Together AI and Fireworks AI.
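To put the quoted daily volumes in per-second terms, the following back-of-the-envelope calculation may help. It only converts the figures stated above; treating the 600 tokens-per-second figure as a per-stream rate is an assumption made here for illustration, not something the source specifies.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tokens_per_day = 20_000_000_000   # "in excess of 20 billion tokens" per day
images_per_day = 1_000_000        # "1 million images" per day
tokens_per_second_per_stream = 600  # assumption: quoted rate applies to one stream

avg_tokens_per_second = tokens_per_day / SECONDS_PER_DAY
avg_images_per_second = images_per_day / SECONDS_PER_DAY
# Number of streams running flat out that would add up to the daily token volume.
equivalent_streams = avg_tokens_per_second / tokens_per_second_per_stream

print(f"{avg_tokens_per_second:,.0f} tokens/s")   # 231,481 tokens/s
print(f"{avg_images_per_second:.1f} images/s")    # 11.6 images/s
print(f"{equivalent_streams:.0f} streams")        # 386 streams
```

Under that assumption, the daily figure corresponds to a few hundred fully utilized streams running around the clock, which is why batching many requests onto each GPU matters for this kind of engine.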

Enterprise features

Lepton AI marketed itself heavily to enterprise customers, attaining SOC 2 Type II and HIPAA compliance early in its life. The platform offered features such as private networking, customer-managed encryption keys, VPC peering, and dedicated GPU pools. For image generation workloads, the team integrated their own implementation of DistriFusion, a research technique that distributes diffusion sampling across multiple GPUs and was reported to deliver roughly six times faster high-resolution image generation than a single-GPU baseline.^[5]

Search demonstration

In January 2024 the company released search_with_lepton, an open-source demonstration that combined a hosted LLM on Lepton Cloud with a web search API to create a conversational search experience similar to Perplexity AI.^[22] The repository accumulated more than 8,000 stars and about 1,000 forks on GitHub and served as a marketing vehicle for the underlying Lepton Cloud; its README highlighted that a working conversational search engine could be built "using less than 500 lines of code," showcasing how a small team could deploy a production-grade AI application end to end.^[22]
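The general pattern behind such a demo (retrieve web results, then ask a language model to answer from them) can be sketched as follows. This is not code from the search_with_lepton repository: `build_prompt`, `answer`, `fake_search`, and `fake_llm` are hypothetical names, and the search API and hosted model are replaced by local stubs so the example runs offline.

```python
from typing import Callable, Dict, List

Snippet = Dict[str, str]


def build_prompt(question: str, snippets: List[Snippet]) -> str:
    """Number each search result so the model can cite it as [n]."""
    context = "\n".join(
        f"[{i}] {s['title']}: {s['text']}" for i, s in enumerate(snippets, start=1)
    )
    return (
        "Answer the question using only the numbered sources below, "
        "citing them as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer(
    question: str,
    search: Callable[[str], List[Snippet]],
    llm: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve, keep the top results, and pass them to the model as context."""
    snippets = search(question)[:top_k]
    return llm(build_prompt(question, snippets))


def fake_search(query: str) -> List[Snippet]:
    """Stub for a web search API (hypothetical results)."""
    return [
        {"title": "Doc A", "text": "Leptons include the electron."},
        {"title": "Doc B", "text": "Photons are massless bosons."},
        {"title": "Doc C", "text": "Tuna are fast swimmers."},
        {"title": "Doc D", "text": "An unrelated page."},
    ]


def fake_llm(prompt: str) -> str:
    """Stub for a hosted LLM: reports how many numbered sources it was given."""
    n = sum(1 for line in prompt.splitlines() if line.startswith("["))
    return f"stub answer citing {n} sources"


print(answer("What is a lepton?", fake_search, fake_llm))
```

Passing the search and model functions in as parameters keeps the pipeline independent of any particular provider, which is the property the demo was meant to advertise.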

How was Lepton AI funded?

Lepton AI had a lean fundraising history, with only a single publicly disclosed equity round before the NVIDIA acquisition.

Date	Round	Amount	Lead investors	Other participants
May 2023	Seed	$11 million	CRV, Fusion Fund	HSG (formerly Sequoia China), individual angels
April 2025	Acquisition	Estimated several hundred million USD	NVIDIA	N/A

The May 2023 seed round was announced shortly after the company was founded and was led by CRV and Fusion Fund, with additional participation from HSG and a handful of angel investors.^[4]^[23] The round closed at a relatively small size compared with peers raising in the same period, reflecting both the small team size and a deliberate strategy by the founders to keep equity dilution low while the platform matured.^[4]

Although early press accounts of the NVIDIA acquisition occasionally suggested that Andreessen Horowitz (a16z) had invested in Lepton, public funding records show that the seed round was led by CRV and Fusion Fund rather than a16z. The misattribution likely stemmed from confusion with other Yangqing Jia-affiliated efforts and with the broader infrastructure-startup funding landscape of 2023.^[4]^[23]

Lepton AI did not raise a publicly disclosed Series A. Instead, its next material financing event was the April 2025 acquisition by NVIDIA, which gave investors an unusually fast return on their capital. While neither party disclosed an exact purchase price, multiple outlets including The Information, TechCrunch, and Bloomberg reported the deal to be worth several hundred million US dollars.^[6]^[7]^[8]

Why did NVIDIA acquire Lepton AI?

Negotiation and announcement

Reports of advanced acquisition talks between NVIDIA and Lepton AI surfaced on March 26, 2025, when TechCrunch cited people familiar with the discussions to say that the chip maker was nearing a deal worth several hundred million US dollars.^[7] Coverage by SiliconANGLE, Reuters, and Bloomberg followed in the days afterward.^[6]^[24] On April 8, 2025, The Information reported that the transaction had closed and that the entire Lepton AI team, including both co-founders, had moved to NVIDIA.^[8] Industry coverage placed the headline value of the deal at "hundreds of millions of dollars," though neither company filed a precise purchase price publicly.^[8]

Strategic rationale

The Lepton acquisition fit into a broader pattern of acquisitions and tuck-ins that NVIDIA had been pursuing in early 2025 as it expanded from a chip vendor into a vertically integrated AI computing company. Just weeks before the Lepton talks were reported, NVIDIA had agreed to acquire the synthetic data startup Gretel.^[7] In the years leading up to 2025 the company had also acquired OctoAI (an inference startup founded by University of Washington faculty), Run:ai (a Kubernetes GPU scheduler), and Brev.dev (a developer-experience tool for GPU computing).^[25]

For NVIDIA, Lepton brought three things at once. First, a credible team of systems engineers with deep experience operating production AI infrastructure at hyperscale (Facebook, Alibaba, Google). Second, a working multi-tenant control plane for GPU workloads that could plausibly form the substrate of a multi-cloud GPU service. Third, the personal credibility of Yangqing Jia, whose long history in the deep learning community made him an attractive face for NVIDIA's emerging cloud strategy. Jensen Huang would later cite Lepton specifically when describing how NVIDIA planned to position itself in the "AI factory" market against pure cloud providers.^[9]

Reception and skepticism

Not all observers were sanguine about the deal. Short-seller Jim Chanos publicly described NVIDIA's reported Lepton acquisition as a "huge red flag" and argued that NVIDIA was buying its way into the demand side of its own market to obscure potential inventory risk. He framed the move alongside other channel-stuffing concerns that had been raised about the broader AI infrastructure boom.^[26] NVIDIA's leadership did not respond directly to those criticisms, instead emphasizing the strategic and product implications of the deal at Computex two months later.

What is NVIDIA DGX Cloud Lepton?

Computex 2025 announcement

On May 18, 2025, NVIDIA chief executive Jensen Huang delivered the keynote at Computex 2025 in Taipei. Among a long list of product announcements, Huang introduced NVIDIA DGX Cloud Lepton, describing it as an AI platform with a compute marketplace that connects developers building agentic and physical AI applications with tens of thousands of GPUs available from a global network of cloud providers.^[9]^[10]

DGX Cloud Lepton was positioned as a unified entry point to NVIDIA's broader compute ecosystem. Developers would interact with a single web console and API surface that abstracted away the underlying provider, even though the actual GPU capacity could be drawn from any of a growing list of NVIDIA Cloud Partners (NCPs). The platform aimed to ease problems that had become acute during 2023 to 2025, including geographic GPU shortages, lack of price transparency across providers, and the difficulty of building data-sovereignty-aware applications that need to keep workloads in specific countries or regions.^[9]
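The marketplace idea described above (one query across many providers, with price visibility and region constraints) can be illustrated with a short sketch. It does not use any DGX Cloud Lepton API: the `Offer` type, the `pick_offer` function, the provider names, and all prices and capacities are hypothetical values invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Offer:
    provider: str
    region: str
    gpu: str
    gpus_available: int
    usd_per_gpu_hour: float  # invented figure, for illustration only


def pick_offer(
    offers: Iterable[Offer], gpu: str, count: int, allowed_regions: Set[str]
) -> Optional[Offer]:
    """Return the cheapest offer that has enough GPUs in a permitted region."""
    candidates: List[Offer] = [
        o
        for o in offers
        if o.gpu == gpu and o.gpus_available >= count and o.region in allowed_regions
    ]
    return min(candidates, key=lambda o: o.usd_per_gpu_hour, default=None)


OFFERS = [
    Offer("provider-a", "us-west", "H100", 64, 2.90),
    Offer("provider-b", "eu-central", "H100", 16, 3.40),
    Offer("provider-c", "eu-west", "H100", 8, 2.10),   # cheapest, but too small
    Offer("provider-d", "eu-west", "H100", 32, 3.10),
]

# A workload that must stay in Europe and needs 16 GPUs.
choice = pick_offer(OFFERS, gpu="H100", count=16, allowed_regions={"eu-central", "eu-west"})
print(choice.provider if choice else "no capacity")  # provider-d
```

Here the region set acts as the sovereignty constraint: the cheaper US offer is never considered, and the smallest European offer is rejected for lack of capacity.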

Participating cloud providers

At launch, NVIDIA announced a roster of participating clouds that drew from the global NVIDIA Cloud Partner network. The initial set spanned established U.S. neoclouds, emerging providers in Asia and Europe, and several large industrial conglomerates that had recently entered the cloud GPU business.

Cloud provider	Headquarters region	Notes
CoreWeave	United States	Largest dedicated U.S. AI cloud at the time
Lambda (Lambda Labs)	United States	Long-standing GPU cloud and workstation vendor
Crusoe	United States	Built early infrastructure on stranded natural-gas power
Firmus	Australia	Sustainable AI factory operator
Foxconn	Taiwan	Industrial manufacturing partner of NVIDIA
GMI Cloud	United States	AI cloud focused on global GPU distribution
Nebius	Netherlands	Spun out of Yandex's post-divestiture entity
Nscale	United Kingdom	European AI compute provider
SoftBank Corp.	Japan	Telecom operator deploying NVIDIA Blackwell
Yotta Data Services	India	Largest GPU cloud in India at launch

AWS and Microsoft Azure were named as the first large-scale cloud providers to join the marketplace.^[29] Additional providers were added in the months after launch, including Hugging Face (via its Training Cluster as a Service), Together AI, Mistral AI Compute, Scaleway, Fluidstack, and Firebird, expanding the platform's reach into Europe and into the broader open-source AI ecosystem.^[27]^[28] On June 11, 2025, NVIDIA announced a European expansion, "NVIDIA DGX Cloud Lepton Connects Europe's Developers," adding regional partners, a focus on sovereign AI workloads, and up to $100,000 in GPU capacity credits per eligible startup offered through European venture firms Accel, Elaia, Partech, and Sofinnova Partners.^[29] By mid-2026 the marketplace had grown to more than 25 participating cloud providers.^[28]

Software stack and features

DGX Cloud Lepton integrates closely with NVIDIA's existing AI software products. The platform exposes NVIDIA NIM microservices (containerized inference packages for popular foundation models), the NeMo framework for model customization, Cloud Functions for serverless agent workloads, and NVIDIA Blueprints for reference applications.^[27]

Core user-facing features include:

Feature	Description
Dev Pods	Interactive environments with Jupyter notebooks, SSH, and Visual Studio Code for exploratory work
Batch Jobs	Multi-node training, fine-tuning, and data-preprocessing pipelines
Inference Endpoints	Auto-scaling deployment surfaces for production AI services
Marketplace	Discovery and purchase of GPU capacity in specific regions across providers
Health Monitoring	GPU diagnostics via NCCL tests, GPUd metrics, and burn-in validation
Auto Recovery	Provider-side root-cause analysis and customer-configurable workload restart logic
Sovereignty Controls	Region-pinned compute placement to comply with national data regulations
Bring Your Own Capacity	Customers can import their own GPU clusters into the unified control plane

A core engineering thesis of DGX Cloud Lepton, drawn directly from the original Lepton AI platform, is that the user experience of running AI workloads should not depend on the underlying GPU cloud. Customers should be able to discover available capacity, run a workload, monitor its health, and roll it over to a different provider without re-architecting their stack.^[27]
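The "roll it over to a different provider" behavior can be sketched as a simple failover loop. This is an illustration of the idea only, under stated assumptions: `ProviderError`, `run_with_failover`, and the two stub providers are hypothetical and do not correspond to any NVIDIA interface.

```python
from typing import Callable, List, Tuple


class ProviderError(RuntimeError):
    """Raised by a (hypothetical) provider client when a job cannot run."""


def run_with_failover(
    job: str,
    providers: List[Tuple[str, Callable[[str], str]]],
    attempts_per_provider: int = 2,
) -> Tuple[str, str]:
    """Try each provider in order, retrying a few times before moving on."""
    errors: List[str] = []
    for name, submit in providers:
        for attempt in range(1, attempts_per_provider + 1):
            try:
                return name, submit(job)
            except ProviderError as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky_submit(job: str) -> str:
    """Stub provider whose health check always fails."""
    raise ProviderError("GPU health check failed")


def healthy_submit(job: str) -> str:
    """Stub provider that accepts the job."""
    return f"{job} running"


result = run_with_failover(
    "fine-tune-job",
    [("provider-a", flaky_submit), ("provider-b", healthy_submit)],
)
print(result)  # ('provider-b', 'fine-tune-job running')
```

Because the caller only sees a job name and a result, the workload definition stays the same regardless of which provider ends up running it.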

The platform entered early access in June 2025 and progressively expanded through the second half of 2025 and into 2026, integrating new cloud partners and new geographic regions.^[27]

Strategic significance

DGX Cloud Lepton represented a noticeable evolution of NVIDIA's cloud strategy. NVIDIA's previous DGX Cloud offering had been a "managed AI training service" that ran on a small number of hyperscaler partners, including Microsoft Azure, Oracle Cloud, Google Cloud, and AWS. The new Lepton-based product was much broader: it took the same brand and extended it to a marketplace structure that included many smaller "neoclouds," industrial cloud players, and regional providers.^[9]^[30] In effect, NVIDIA was using Lepton's software to coordinate a federation of independent GPU clouds, each of which was already buying NVIDIA hardware in volume, and to offer that federation to developers as if it were a single product.

Commentators including The Wall Street Journal and several industry analysts observed that this strategy could put NVIDIA in a more direct competitive position with hyperscale clouds such as Microsoft Azure, AWS, and Google Cloud, while also strengthening NVIDIA's downstream relationships with the neoclouds that consume the bulk of its high-end Blackwell systems.^[30]^[31]

What is Yangqing Jia's role at NVIDIA?

Following the close of the acquisition, both co-founders moved to NVIDIA along with the rest of the Lepton AI team.^[8] Yangqing Jia was named Vice President of DGX Cloud at NVIDIA, where he leads the team responsible for the DGX Cloud Lepton product line and the broader cloud-services portfolio that wraps NVIDIA hardware.^[11] In public appearances and conference talks during 2025 and 2026, Jia continued to position himself as a builder of AI infrastructure for developers, often referencing his prior work on Caffe and PyTorch as motivation for the design choices in DGX Cloud Lepton.^[11]

Junjie Bai joined NVIDIA in a senior engineering role on the same team, where his ONNX background remained relevant to NVIDIA's broader interoperability strategy across PyTorch, JAX, TensorRT, and the company's expanding portfolio of inference software.^[8]

Who were Lepton AI's competitors?

Lepton AI competed in an unusually crowded subset of the AI infrastructure market: cloud-native inference platforms that abstract GPU operations behind serverless APIs. The category had attracted significant venture funding through 2023 and 2024, producing a long list of differentiated entrants. Even after the NVIDIA acquisition, many of these companies continued to operate independently, often alongside DGX Cloud Lepton as participating clouds or as direct alternatives.

Competitor	Founders / Year	Primary positioning
Together AI	Vipul Ved Prakash et al., 2022	Open-source model serving, fine-tuning, and RedPajama datasets
Fireworks AI	Lin Qiao et al., 2022	High-performance LLM inference platform spun out of Meta PyTorch
Anyscale	Ion Stoica et al., 2019	Commercial backer of the Ray distributed framework
Baseten	Tuhin Srivastava et al., 2019	Model serving with Truss SDK and dedicated deployments
Replicate	Ben Firshman, Andreas Jansson, 2019	Open model marketplace; Cog packaging tool
Modal	Erik Bernhardsson, 2021	Serverless Python and GPU compute platform
DeepInfra	Nikola Borisov et al., 2022	Low-cost open-source model API hosting
OctoAI	Luis Ceze et al., 2019	Inference platform; acquired by NVIDIA in 2024
SambaNova Systems	Rodrigo Liang et al., 2017	Custom RDU chips and on-prem AI appliances
Cerebras Systems	Andrew Feldman et al., 2016	Wafer-scale CS-3 systems and inference cloud

OctoAI deserves particular comparison, as it was a similarly positioned inference platform also acquired by NVIDIA (in 2024). The acquisitions of OctoAI and then Lepton AI within roughly one year of each other underscored NVIDIA's commitment to building a serving stack of its own rather than relying on independent partners alone. Within NVIDIA, elements of both teams contributed to NVIDIA NIM, DGX Cloud Lepton, and adjacent inference products.^[25]

Several of Lepton's listed competitors went on to become participating clouds in DGX Cloud Lepton. Together AI, for example, joined the platform after Lepton's acquisition, meaning that NVIDIA's new product effectively aggregated and routed customer demand to one of Lepton's former rivals. This dynamic was characteristic of the AI infrastructure market in the mid-2020s: rather than a winner-take-all outcome, the layer of "GPU clouds" remained fragmented and complementary, while orchestration and serving layers consolidated under a smaller number of vendors, of which NVIDIA was now a major one.^[28]

What is Lepton AI's legacy?

Lepton AI's independent life lasted under two years, but its imprint on the AI infrastructure landscape proved disproportionately large. The company's Photon SDK influenced subsequent designs for Python-native serving libraries. Its founders' work on Caffe, PyTorch, and ONNX continued to underpin much of the AI inference tooling in the market, including that used by Lepton's direct competitors. And through DGX Cloud Lepton, the platform's design ideas reached a far larger user base than the original Lepton Cloud ever did.

For NVIDIA, the acquisition served as a template for how a hardware company could climb the software stack: identify an opinionated team with a working product, acquire it for a fraction of the cost of building from scratch, and rebrand the resulting platform under the parent company's marketing umbrella. The DGX Cloud Lepton launch at Computex 2025 was, by several reporters' accounts, one of the most successful debuts of an acquired AI startup in recent memory.^[9]^[30]

References

1. SiliconANGLE: "Report: Nvidia close to acquiring AI cloud provider Lepton AI in nine-figure deal." Retrieved from https://siliconangle.com/2025/03/27/report-nvidia-close-acquiring-ai-cloud-provider-lepton-ai-nine-figure-deal/. Accessed 2026-05-20.
2. TechNode: "NVIDIA acquires Chinese GPU cloud startup Lepton AI: report." Retrieved from https://technode.com/2025/04/09/nvidia-acquires-chinese-gpu-cloud-startup-lepton-ai-report/. Accessed 2026-05-20.
3. Yangqing Jia personal website (daggerfs.com) and LinkedIn profile. Retrieved from https://www.linkedin.com/in/yangqing-jia/. Accessed 2026-05-20.
4. Lepton AI announcement of $11 million seed round (May 2023). Retrieved from https://www.leadsontrees.com/news/lepton-ai-secures-11m-seed-round-to-fuel-a-next-gen-ai-application-platform-revolution. Accessed 2026-05-20.
5. Lepton AI product documentation and platform overview. Retrieved from https://www.lepton.ai/docs/advanced/prebuilt_photons and https://eliteai.tools/tool/lepton-ai. Accessed 2026-05-20.
6. SiliconANGLE: NVIDIA-Lepton acquisition reporting. Retrieved from https://siliconangle.com/2025/03/27/report-nvidia-close-acquiring-ai-cloud-provider-lepton-ai-nine-figure-deal/. Accessed 2026-05-20.
7. TechCrunch: "Nvidia is reportedly in talks to acquire Lepton AI" (26 March 2025). Retrieved from https://techcrunch.com/2025/03/26/nvidia-is-reportedly-in-talks-to-acquire-lepton-ai/. Accessed 2026-05-20.
8. AIBase: "Nvidia Completes Acquisition of Lepton AI; Former Alibaba VP Yangqing Jia Joins with His Team" (April 2025). Retrieved from https://www.aibase.com/news/16924. Accessed 2026-05-20.
9. NVIDIA Newsroom: "NVIDIA Announces DGX Cloud Lepton to Connect Developers to NVIDIA's Global Compute Ecosystem" (18 May 2025, Computex). Retrieved from https://nvidianews.nvidia.com/news/nvidia-announces-dgx-cloud-lepton-to-connect-developers-to-nvidias-global-compute-ecosystem. Accessed 2026-05-20.
10. insideAI News: "NVIDIA Announces DGX Cloud Lepton for GPU Access across Multi-Cloud Platforms" (19 May 2025). Retrieved from https://insideainews.com/2025/05/19/nvidia-announces-dgx-cloud-lepton-for-gpu-access-across-multi-cloud-platforms/. Accessed 2026-05-20.
11. Yangqing Jia LinkedIn profile, listing title as "VP of DGX Cloud at NVIDIA / Co-founder & CEO of Lepton AI (now part of NVIDIA)." Retrieved from https://www.linkedin.com/in/yangqing-jia/. Accessed 2026-05-20.
12. Lepton AI corporate profile records. Retrieved from https://www.crunchbase.com/organization/lepton-ai and https://pitchbook.com/profiles/company/528456-25. Accessed 2026-05-20.
13. Synced Review and ChinaTechNews coverage of Yangqing Jia's career transitions. Retrieved from https://syncedreview.com/2019/03/05/caffe-pioneer-ai-infrastructure-director-leaves-facebook/ and https://www.chinatechnews.com/2019/03/20/26223-silicon-valley-scientist-jia-yangqing-joined-alibaba. Accessed 2026-05-20.
14. Lepton AI blog: "Build AI the Easy Way." Retrieved from https://blog.lepton.ai/build-ai-the-easy-way-2a8b68c63723. Accessed 2026-05-20.
15. Yangqing Jia biographical materials. Retrieved from https://www.allamericanspeakers.com/celebritytalentbios/Yangqing+Jia/453830 and https://baike.baidu.com/en/item/Yangqing%20Jia/45509. Accessed 2026-05-20.
16. Caffe (software) Wikipedia article and Berkeley Vision project page. Retrieved from https://en.wikipedia.org/wiki/Caffe_(software) and https://caffe.berkeleyvision.org/. Accessed 2026-05-20.
17. Synced Review: "Caffe Pioneer & AI Infrastructure Director Leaves Facebook" (5 March 2019). Retrieved from https://syncedreview.com/2019/03/05/caffe-pioneer-ai-infrastructure-director-leaves-facebook/. Accessed 2026-05-20.
18. Junjie Bai LinkedIn and corporate-data profiles. Retrieved from https://www.linkedin.com/in/junjiebai/ and https://rocketreach.co/junjie-bai-email_43357565. Accessed 2026-05-20.
19. ONNX (Open Neural Network Exchange) Wikipedia article. Retrieved from https://en.wikipedia.org/wiki/Open_Neural_Network_Exchange. Accessed 2026-05-20.
20. Meta Research blog: "ONNX V1 released" (December 2017). Retrieved from https://research.facebook.com/blog/2017/12/onnx-v1-released/. Accessed 2026-05-20.
21. leptonai/leptonai GitHub repository. Retrieved from https://github.com/leptonai/leptonai. Accessed 2026-07-12.
22. leptonai/search_with_lepton GitHub repository. Retrieved from https://github.com/leptonai/search_with_lepton. Accessed 2026-07-12.
23. PitchBook profile for Lepton AI. Retrieved from https://pitchbook.com/profiles/company/528456-25. Accessed 2026-05-20.
24. Data Center Dynamics: "Nvidia in talks to acquire server rental company Lepton AI." Retrieved from https://www.datacenterdynamics.com/en/news/nvidia-in-talks-to-acquire-server-rental-company-lepton-ai-report/. Accessed 2026-05-20.
25. Michael Parekh: "AI: Nvidia moves into AI Cloud Services market." Retrieved from https://michaelparekh.substack.com/p/ai-nvidia-moves-into-ai-cloud-services. Accessed 2026-05-20.
26. Yahoo Finance: "Nvidia's Reported Lepton AI Buyout A 'Huge Red Flag,' Says Short Seller Jim Chanos." Retrieved from https://finance.yahoo.com/news/nvidias-reported-lepton-ai-buyout-180039987.html. Accessed 2026-05-20.
27. NVIDIA Technical Blog: "Introducing NVIDIA DGX Cloud Lepton: A Unified AI Platform Built for Developers." Retrieved from https://developer.nvidia.com/blog/introducing-nvidia-dgx-cloud-lepton-a-unified-ai-platform-built-for-developers/. Accessed 2026-05-20.
28. NVIDIA DGX Cloud Lepton product page. Retrieved from https://www.nvidia.com/en-us/data-center/dgx-cloud-lepton/. Accessed 2026-07-12.
29. NVIDIA Newsroom: "NVIDIA DGX Cloud Lepton Connects Europe's Developers to Global NVIDIA Compute Ecosystem" (11 June 2025). Retrieved from https://nvidianews.nvidia.com/news/nvidia-dgx-cloud-lepton-connects-europes-developers-to-global-nvidia-compute-ecosystem. Accessed 2026-07-12.
30. Windows Forum / Illustrated AI: "Nvidia pivots DGX Cloud to Lepton marketplace, reshaping AI compute strategy." Retrieved from https://windowsforum.com/threads/nvidia-pivots-dgx-cloud-to-lepton-marketplace-reshaping-ai-compute-strategy.380743/ and https://illustrated-ai.com/nvidia-launches-dgx-cloud-lepton-a-gpu-marketplace-linking-developers-to-multiple-ai-cloud-providers/. Accessed 2026-05-20.
31. efficientlyconnected.com: "NVIDIA Unveils DGX Cloud Lepton." Retrieved from https://www.efficientlyconnected.com/nvidia-launches-dgx-cloud-lepton-to-power-global-ai-compute-marketplace/. Accessed 2026-05-20.
