Lambda Labs
Last reviewed
May 17, 2026
Sources
30 citations
Review status
Source-backed
Revision
v2 ยท 6,543 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
May 17, 2026
Sources
30 citations
Review status
Source-backed
Revision
v2 ยท 6,543 words
Add missing citations, update stale details, or suggest a clearer explanation.
Lambda Labs (operating as Lambda, Inc.) is an American AI infrastructure company that provides GPU cloud computing, on-premises GPU hardware, and deep learning software for artificial intelligence research and development. Headquartered in San Francisco, California, Lambda operates what it calls a "Superintelligence Cloud," offering on-demand GPU instances, large-scale multi-node clusters, and an inference API to academic institutions, AI startups, and large enterprises. The company was originally founded in 2012 as a computer vision and facial recognition software provider before pivoting to GPU cloud infrastructure around 2017 to 2018. By early 2026, Lambda served more than 150,000 cloud users and reported an annualized revenue run rate of approximately $760 million, growing at roughly 79 percent year over year. Lambda is widely regarded as one of the largest pure-play GPU neocloud providers, often grouped with CoreWeave, Crusoe, and a handful of other specialists that own and operate their own NVIDIA accelerator fleets rather than reselling hyperscaler capacity.
Lambda was founded in March 2012 by twin brothers Stephen Balaban and Michael Balaban in San Jose, California. Stephen Balaban studied computer science and economics at the University of Michigan and had previously been an early engineer at Perceptio, a startup focused on on-device neural networks for facial recognition that Apple acquired in 2015 (after Stephen had already left to start Lambda). Michael Balaban completed a double major at the University of Michigan in discrete mathematics and computer science, graduated a year after his brother, and joined Nextdoor as a software engineer on the infrastructure team before coming on full time at Lambda in March 2015. Michael Balaban became the company's CTO while Stephen served as CEO, a division of roles the brothers maintained into the mid-2020s.
The company's initial product focus was facial recognition software. An early trigger for their first commercial offering came in June 2012 when Facebook acquired Face.com and shut down its popular facial recognition API, stranding roughly 45,000 developers who had built applications on top of it. Lambda moved quickly to launch the Lambda Face API as an alternative, attracting more than 1,000 active developers within a year and processing over 5 million API calls per month. During this period the company also built a Google Glass face-identification application that gained notable traction. From roughly 2012 to 2016, facial recognition APIs and computer vision tooling were Lambda's primary revenue source.
The Balaban brothers have described this period as foundational rather than glamorous. Lambda ran lean, taking on consulting projects to keep the lights on while shipping a steady cadence of computer vision products. The company also experimented with hardware adjacent to its software, including wearable face-recognition prototypes. Although the early business never scaled to the level of consumer giants, it gave the founders deep familiarity with GPU-accelerated deep learning workloads, which would later become central to Lambda's pivot.
As deep learning workloads became more GPU-intensive, the Balaban brothers found that running their own machine learning models on AWS was prohibitively expensive and operationally cumbersome. They built an internal GPU cluster to support their own workloads, then recognized that other AI teams faced the same friction. Between 2017 and 2019, Lambda pivoted away from computer vision software and into AI hardware and GPU infrastructure. Stephen Balaban has often described the pivot as a serendipitous discovery: while the founders were trying to lower their own training costs, they realized the deep learning community at large would pay a premium for properly configured GPU systems with curated software stacks.
The company introduced its first dedicated deep learning workstations and GPU servers during this period, including the Lambda Quad workstation and Lambda Blade server. It also launched the Lambda Echelon, a turn-key on-premises GPU cluster that shipped pre-configured with compute, storage, and networking in rack form. Lambda opened its GPU cloud service publicly in 2018, making it one of the first cloud platforms dedicated specifically to deep learning workloads. The Lambda Stack, a curated suite of pre-installed deep learning software, launched alongside the cloud service to reduce setup time for practitioners.
Lambda's early customer base in this period included academic groups at universities such as Stanford, MIT, and Caltech, as well as a growing list of research labs at large technology companies. The company's positioning emphasized that researchers could rent the same kind of multi-GPU hardware they would otherwise have to purchase outright, and that the curated Lambda Stack removed many days of CUDA, driver, and framework debugging from a typical setup.
Through 2020 to 2023, Lambda expanded its cloud GPU catalog to cover a broader range of NVIDIA architectures, including A100 and H100 GPUs, and built out its 1-Click Clusters product to serve teams that needed multi-node distributed training at short notice. The company reported growing adoption by universities, government research labs, and frontier AI companies during this period. Lambda's cloud revenue overtook its on-premises hardware business during this stretch, although hardware continued to be a meaningful contributor through 2024.
In July 2024, Lambda launched what it described as the first self-serve, on-demand NVIDIA HGX H100 clusters with NVIDIA Quantum-2 InfiniBand networking, lowering the barrier for teams to run distributed training without negotiating long-term contracts or managing InfiniBand configuration themselves. Industry observers noted that on-demand InfiniBand-class clusters were rare in the market at the time, with most other providers requiring multi-month contracts and white-glove onboarding.
Despite strong growth, Lambda faced repeated criticism in 2024 for GPU availability shortfalls. Users reported that H100 and A100 instances were frequently listed as "temporarily unavailable," with some documenting success rates for same-day A100 provisioning as low as 64 percent over a six-month window. The shortages reflected industry-wide constraints on advanced NVIDIA supply during the 2023 to 2024 generative AI boom, but they also strained Lambda's reputation among small developers who could not get reserved capacity.
On March 25, 2025, Lambda announced that its primary web domain was migrating from lambdalabs.com to lambda.ai, reflecting the company's legal name, Lambda, Inc., and its positioning as a pure AI infrastructure provider. The former domain continued to redirect traffic during the transition period.
In August 2025, Lambda formally ended sales and support for its on-premises hardware product lines, including the Vector, Vector One, and Vector Pro workstations as well as the Scalar and Hyperplane servers. Customers with existing hardware remained covered by their original warranty terms. The decision concentrated the company's resources on cloud services and reflected a strategic shift toward operating its own data center capacity rather than shipping boxes to customer premises.
In September 2025, NVIDIA signed a $1.5 billion agreement to lease back 18,000 GPUs from Lambda over four years, making NVIDIA Lambda's largest single customer. The deal, reported by Data Center Dynamics and confirmed by multiple outlets, was structured as approximately $1.3 billion for 10,000 higher-end GPUs and $200 million for an additional 8,000 units. The agreement illustrated ongoing GPU scarcity at the industry level: even NVIDIA found it practical to rent compute from a cloud provider rather than operate its own data center capacity for all internal workloads. Industry analysts described the deal as a powerful signal of demand: NVIDIA, the supplier, was renting its own accelerators back from a customer who had received priority allocations.
Also in September 2025, Lambda and ECL brought the first hydrogen-powered NVIDIA GB300 NVL72 systems online at ECL's Mountain View campus, a zero-water and zero-emissions modular facility powered entirely by hydrogen fuel cells. The deployment was notable both for the chip generation, Blackwell Ultra, and for the unusual energy source. The Supermicro-built GB300 NVL72 racks consumed roughly 142 kilowatts each and used direct-to-chip liquid cooling.
In October 2025, Lambda announced plans to establish a major AI factory in Kansas City, Missouri. The facility is expected to launch in early 2026 with 24 megawatts of capacity and the potential to scale beyond 100 megawatts over time. The Kansas City site, a renovated former bank data center originally built in 2009, will house more than 10,000 Blackwell Ultra GPUs at launch and is operated as a single-tenant supercomputer for a multi-year Lambda customer. The investment is reported to exceed $500 million.
In November 2025, Lambda announced a multibillion-dollar, multi-year agreement with Microsoft to deploy AI infrastructure powered by tens of thousands of NVIDIA GPUs, including GB300 NVL72 systems. The deal positioned Lambda as a wholesale capacity provider supplying Azure with specialized GPU infrastructure rather than competing with the hyperscaler directly. Lambda and Microsoft had been collaborating in some form for more than eight years before the November 2025 announcement, but the new contract represented a step change in scale. Initial infrastructure deployment was expected to begin in 2026.
Later in November 2025, Lambda announced a Series E funding round of more than $1.5 billion led by TWG Global, with participation from US Innovative Technology Fund and other investors. The round valued Lambda at approximately $5.9 billion post-money. Reports indicated Lambda was also in discussions to raise an additional pre-IPO bridge round of roughly $350 million, with Mubadala Capital in talks to lead at a roughly 20 percent discount to the eventual IPO price.
By early 2026, Lambda had announced plans for gigawatt-scale AI factories and added support for NVIDIA GB300 NVL72 and Vera Rubin NVL72 Superclusters. At NVIDIA GTC 2026, Lambda announced bare-metal instances on NVIDIA Vera Rubin NVL72, a production-scale GB300 NVL72 Supercluster using NVIDIA Quantum-X Photonics co-packaged optics, and one of the largest planned deployments of Quantum-X InfiniBand switches to date, in an AI factory containing more than 10,000 GB300 GPUs.
In May 2026, the company announced a major leadership reorganization. Michel Combes, formerly president of SoftBank Group International and CEO of Sprint, was named Lambda's chief executive officer. Co-founder Stephen Balaban transitioned to chief technology officer, where he would lead technology strategy and product full time. Co-founder Michael Balaban became chief product officer. John Donovan, the former CEO of AT&T Communications, was named chairman of the board. Charles Fisher, previously CFO at Turo and a senior finance executive at Charter Communications, joined as CFO. Leonard Speiser was named chief operating officer, Jerry Hunter (a former COO of Snap and early AWS infrastructure leader) joined as vice chairman of compute delivery, and David Connolly, formerly general counsel at Altice, was appointed chief legal officer. Lambda said the new team was assembled to scale its AI infrastructure footprint to roughly 3 gigawatts by 2030.
Following the May 2026 reorganization, Lambda's executive team and board leadership stood as follows.
| Role | Name | Background |
|---|---|---|
| Chief executive officer | Michel Combes | Former president of SoftBank Group International; former CEO of Sprint and Altice |
| Chief technology officer | Stephen Balaban | Co-founder; former CEO; ex-Perceptio engineer |
| Chief product officer | Michael Balaban | Co-founder; former CTO; former Nextdoor infrastructure engineer |
| Chairman of the board | John Donovan | Former CEO of AT&T Communications |
| Vice chairman of compute delivery | Jerry Hunter | Former COO of Snap; early AWS infrastructure leader |
| Chief operating officer | Leonard Speiser | Operating executive across multiple growth-stage companies |
| Chief financial officer | Charles Fisher | Former CFO of Turo; senior finance at Charter Communications |
| Chief legal officer | David Connolly | Former general counsel at Altice |
The restructuring split co-founder roles between technology and product (Stephen and Michael Balaban) and brought in operating leaders from large telecommunications and consumer technology businesses to manage the scale-out from a single-digit gigawatt fleet toward a multi-gigawatt platform. Several commentators noted that Lambda's executive hiring profile resembled that of a near-IPO telecommunications or data center operator more than that of a traditional cloud startup.
Lambda has raised approximately $2.36 billion in equity funding across five named rounds, plus additional debt facilities and a planned pre-IPO bridge.
| Round | Date | Amount | Valuation | Key investors |
|---|---|---|---|---|
| Series A | July 2021 | Undisclosed | ~$87.5M | Gradient Ventures, others |
| Series B | March 2023 | $44M | Undisclosed | Mercato Partners, Greg Brockman, Garry Tan |
| Series C | February 2024 | $320M | $1.5B | US Innovative Technology Fund (Thomas Tull), B Capital, SK Telecom, Mercato Partners |
| Series D | February 2025 | $480M | $2.5B | Andra Capital, SGW, NVIDIA, ARK Invest |
| Series E | November 2025 | $1.5B+ | $5.9B | TWG Global, US Innovative Technology Fund |
| Pre-IPO bridge (reported) | Q1 2026 (in discussion) | ~$350M | TBD | Mubadala Capital (reported lead) |
In addition to equity rounds, Lambda secured a $500 million debt facility from Macquarie Group in April 2024 and a $275 million credit facility from JPMorgan in August 2025, providing capital to purchase and finance GPU inventory ahead of customer contracts. Debt-financed GPU acquisition has been a common pattern across the neocloud category, with CoreWeave using similar structures to fund its inventory before customer cash flow materialized.
NVIDIA participated directly in the Series D, establishing a formal investor relationship that complemented the GPU supply agreements Lambda held. The Series E was led by TWG Global, an investment firm founded by Thomas Tull and Mark Walter (the Guggenheim Partners CEO and Los Angeles Lakers co-owner) and backed in part by Abu Dhabi's Mubadala Capital, which anchors TWG's roughly $15 billion AI-focused fund.
By late 2025, Lambda was in discussions with investment banks including Morgan Stanley, JPMorgan, and Citi about a potential initial public offering. Sacra Research estimated Lambda's valuation at the time of the Series E at roughly 7.8 times its trailing annual revenue, compared to CoreWeave's 23.4 times multiple following its 2025 IPO. Several analysts read the Series E as a bridge into a public offering targeted for the second half of 2026.
Lambda Cloud is the company's primary revenue-generating product, providing on-demand access to NVIDIA GPU instances with hourly (or per-minute) billing, no egress fees, and pre-installed Lambda Stack software. The platform supports SSH access, persistent filesystems, and integration with standard orchestration tools including Kubernetes, Slurm, and dstack.
Instance types range from older NVIDIA Quadro RTX 6000 and Tesla V100 cards through A6000, A10, A100, GH200, H100, and B200 configurations. All instances include persistent storage and come pre-configured with the Lambda Stack deep learning environment.
Lambda Cloud is built around an asset-heavy model: the company owns the GPUs it rents out rather than reselling hyperscaler capacity. The company claims GPU availability in 97 percent of US universities and more than 50,000 machine learning teams globally using its stack. By early 2026, Sacra Research estimated Lambda Cloud accounted for roughly 80 percent of company revenue, with the remainder split between residual hardware contracts, the inference API, and private cloud reservations.
1-Click Clusters is Lambda's managed multi-node cluster product, designed for distributed AI training and large-scale inference. The product was introduced in its current form in mid-2024, following earlier iterations that required more manual configuration.
Clusters are built on NVIDIA HGX B200 SXM6 or HGX H100 nodes interconnected with NVIDIA Quantum-2 InfiniBand networking and SHARP (Scalable Hierarchical Aggregation and Reduction Protocol) acceleration, providing 3,200 Gbps of aggregate bandwidth per node. Cluster sizes range from 16 GPUs up to 2,000 or more, with self-service provisioning available through the Lambda dashboard. Commitment terms run from two weeks to one year, with per-GPU hourly pricing declining at larger scales and longer terms.
Lambda markets the product with managed Kubernetes or Slurm orchestration, S3-compatible storage integration, and SOC 2 Type II security certification. The company positions 1-Click Clusters as a faster alternative to negotiated enterprise GPU contracts, emphasizing that clusters can be provisioned in minutes rather than weeks. Several frontier AI labs have used 1-Click Clusters for experimental training runs that did not justify the latency of negotiated reserved capacity.
Superclusters is Lambda's highest-tier product, introduced in 2025 for frontier AI training at scale. Superclusters use NVIDIA GB300 NVL72 and, announced for later in 2026, NVIDIA Vera Rubin NVL72 systems. Each GB300 NVL72 rack integrates 72 NVIDIA Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs, with 37 TB of fast memory and 130 TB/s of NVLink Switch bandwidth per rack.
The product targets frontier model training workloads that require deterministic network behavior across tens of thousands of GPUs. Lambda and ECL brought the first hydrogen-powered GB300 NVL72 systems online in September 2025. The Microsoft infrastructure agreement announced in November 2025 included GB300 NVL72 deployments at multiple Lambda data centers.
At NVIDIA GTC 2026 in March, Lambda announced that it would begin deploying NVIDIA Vera Rubin NVL72 systems as bare-metal instances in the second half of 2026. The Vera Rubin platform features 72 Rubin GPUs and 36 Vera CPUs per rack, with NVIDIA citing up to 5 times greater inference performance and 10 times lower cost per token than NVIDIA Blackwell for comparable workloads. Lambda also disclosed plans to incorporate NVIDIA Quantum-X Photonics co-packaged optics switches into a production-scale Supercluster, in what NVIDIA described as one of the largest deployments of the new optical interconnect to date.
For enterprises with compliance, data sovereignty, or air-gap requirements, Lambda offers dedicated single-tenant GPU clusters that are physically isolated from shared infrastructure. Private cloud deployments typically start at 1,000 or more GPUs and are priced through direct sales engagement. Lambda targets regulated industries including finance, healthcare, pharmaceutical research, aerospace, and defense, as well as U.S. government agencies. The Kansas City AI factory announced in late 2025 is the largest known single-tenant private cloud deployment in Lambda's history, with more than 10,000 Blackwell Ultra GPUs dedicated to a single customer.
Lambda launched a serverless LLM inference API offering OpenAI-compatible endpoints for popular open-weight models. The service was positioned on low per-token pricing and claimed to be among the lowest-cost inference options in the market at launch. Supported models included the Meta Llama 3.3 and Llama 4 series, Alibaba Qwen3, and DeepSeek models, among others. Pricing was structured around token consumption, with costs starting as low as $0.02 per million tokens for smaller models and reaching up to roughly $0.90 per million tokens for larger ones.
By early 2026, Lambda began transitioning away from the standalone Inference API product, directing customers toward deploying models on Lambda GPU instances instead. The inference endpoint product served as an entry point for developers before they scaled onto dedicated or cluster compute. Internally, Lambda's leadership described the decision as a focusing move: the company's core competitive advantage lies in dedicated GPU access, and the inference API's per-token economics required margins that were difficult to defend against together.ai, Fireworks AI, and other inference specialists.
Lambda Stack is a curated set of deep learning software packages that Lambda maintains and tests for compatibility across its hardware. The stack is pre-installed on all Lambda cloud instances and on-premises systems. It includes:
Lambda tests each component for interoperability across its GPU catalog, including the latest HGX B200 and H200 SXM configurations. Lambda Stack can also be installed on non-Lambda Ubuntu systems via a publicly available install script, making it usable by researchers who own their own GPU hardware. Updates are delivered through standard Ubuntu package management.
The following table summarizes the on-demand GPU instance types Lambda offered as of early 2026, based on pricing data from lambda.ai/pricing.
| GPU | VRAM | Price (on-demand) | Notes |
|---|---|---|---|
| NVIDIA B200 SXM6 | 180 GB | $6.69 to $6.99/hr | Latest Blackwell architecture |
| NVIDIA H100 SXM | 80 GB | $3.99 to $4.29/hr | Hopper generation, SXM form factor |
| NVIDIA H100 PCIe | 80 GB | $3.29/hr | PCIe form factor |
| NVIDIA GH200 | 96 GB | $2.29/hr | Grace Hopper Superchip |
| NVIDIA A100 SXM | 40 to 80 GB | $1.99 to $2.79/hr | Ampere generation |
| NVIDIA A100 PCIe | 40 GB | $1.99/hr | Ampere, PCIe |
| NVIDIA A6000 | 48 GB | $1.09/hr | Professional GPU |
| NVIDIA A10 | 24 GB | $1.29/hr | Entry-level data center |
| NVIDIA Tesla V100 | 16 GB | $0.79/hr | Volta generation |
| NVIDIA Quadro RTX 6000 | 24 GB | $0.69/hr | Turing architecture |
1-Click Cluster pricing (per GPU per hour, as of early 2026) is as follows:
| GPU type | 16 GPUs | 64 GPUs | 256+ GPUs |
|---|---|---|---|
| HGX B200 | $9.86/hr | $9.36/hr | $8.87/hr |
| HGX H100 | $6.16/hr | $5.85/hr | $5.54/hr |
Reserved commitments of one year or more are available at custom pricing through Lambda's sales team. Supercluster bare-metal pricing for GB300 NVL72 and Vera Rubin NVL72 systems is custom and contracted directly with Lambda's enterprise team. Industry estimates put Supercluster rates in the low double-digit dollars per GPU-hour for reserved multi-year contracts, although Lambda has not published a public rate card.
Lambda's data center footprint has grown rapidly since 2024. As of mid-2026, Lambda either operated or contracted capacity in the following metropolitan regions:
| Region | Status | Notes |
|---|---|---|
| San Francisco Bay Area, California | Live | Headquarters region and original cloud capacity |
| Mountain View, California (ECL) | Live | First hydrogen-powered GB300 NVL72 systems |
| Dallas / Fort Worth, Texas | Live and expanding | High-density Blackwell clusters |
| Columbus, Ohio | Live | Reserved capacity contracts with partners |
| Chicago, Illinois | Live | Multi-tenant cluster capacity |
| Atlanta, Georgia | Live and expanding | Edge-of-network deployments |
| Kansas City, Missouri | Launching early 2026 | 24 MW launch capacity, scalable past 100 MW |
Lambda uses a mix of self-developed sites and partnerships with data center operators such as Aligned, Cologix, ECL, and EdgeConneX. The company has publicly committed to scaling toward roughly 3 gigawatts of contracted power capacity by 2030, with gigawatt-scale individual sites planned for the late 2020s. The Microsoft agreement and the NVIDIA leaseback together account for a substantial share of forward-contracted capacity, although Lambda has not disclosed exact allocations.
Lambda has emphasized direct-to-chip liquid cooling for its Blackwell generation deployments. The Supermicro-built GB300 NVL72 racks at Lambda sites typically pull about 142 kilowatts each, which is well above the densities supported by traditional air-cooled facilities. The company has also positioned itself as an early adopter of advanced networking, including NVIDIA Quantum-2 InfiniBand for the H100 and B200 generation and Quantum-X Photonics for upcoming Vera Rubin deployments.
Lambda competes in the GPU cloud and AI infrastructure market against hyperscalers (AWS, Google Cloud Platform, Microsoft Azure), other GPU-native cloud providers, and on-premises GPU server vendors. Independent rating systems such as SemiAnalysis's ClusterMAX placed Lambda in the upper tier of GPU-native providers but consistently behind CoreWeave, which has been rated the only "Platinum" tier provider in several 2025 and 2026 assessments.
The table below compares Lambda with three major GPU-native cloud competitors as of early 2026.
| Attribute | Lambda | CoreWeave | RunPod | Crusoe |
|---|---|---|---|---|
| Founded | 2012 | 2017 | 2022 | 2018 |
| GPU ownership model | Owns GPUs | Owns GPUs | Mix of owned and marketplace | Owns GPUs |
| Primary GPU offerings | B200, H100, A100 | H100, H200, A100 | H100, A100, consumer GPUs | H100, A100 |
| H100 on-demand price (approx.) | ~$3.99 to $4.29/hr | ~$4.76/hr | ~$1.99 to $2.49/hr | ~$1.71/hr |
| Multi-node clusters | Yes (1-Click Clusters) | Yes (enterprise focus) | Limited | Limited |
| InfiniBand networking | Yes (Quantum-2) | Yes | No | No |
| Serverless inference API | Winding down (2026) | No | Yes (serverless GPU) | No |
| SOC 2 Type II | Yes | Yes | No | Yes |
| Egress fees | None | None (intra-network) | Yes | No |
| Public listing | Private (IPO H2 2026) | Public (NASDAQ, since 2025) | Private | Private |
| Primary differentiator | No-egress pricing, ease of use, NVIDIA partnership | Enterprise SLAs, Kubernetes-native | Low cost, spot instances | Sustainable energy sourcing |
CoreWeave is Lambda's closest direct competitor at scale. CoreWeave went public in March 2025 at a valuation approaching $65 billion and has aggressively secured NVIDIA GPU supply through similar leaseback and partnership arrangements. CoreWeave targets enterprise customers and hyperscalers with strong SLA guarantees and Kubernetes-native orchestration but carries higher list prices than Lambda for comparable GPU configurations. CoreWeave is the largest single neocloud by revenue and has been estimated to hold roughly 18 percent of the dedicated AI training and high-performance computing GPU market.
RunPod competes primarily on price, offering spot instances and a marketplace model that can produce significantly lower per-hour costs for tolerant workloads. RunPod's flexibility (spot pricing, consumer-grade GPUs, serverless functions) attracts cost-sensitive developers and smaller teams, but the platform offers fewer enterprise features and no InfiniBand-class multi-node networking.
Crusoe positions itself around sustainable infrastructure, using stranded natural gas and other waste energy sources to power GPU clusters. Crusoe's pricing is among the lowest in the market for H100 access, at around $1.71 per GPU-hour, but its geographic footprint and cluster scale are smaller than Lambda's.
The hyperscaler comparison is more complex: AWS, Google, and Azure all offer GPU compute, but with higher list prices, more complex pricing structures, and egress fees that can significantly increase total cost for large data-transfer workloads. Lambda's flat pricing with no egress fees is a frequently cited reason developers choose the platform over cloud giants for training and fine-tuning workloads. The Microsoft agreement signed in November 2025 partially blurs this distinction: Microsoft is simultaneously a competitor (via Azure's own GPU instances) and a wholesale customer of Lambda's capacity, an arrangement that mirrors CoreWeave's relationship with the same hyperscaler.
The major cloud providers all offer GPU compute, but they package it differently from neoclouds like Lambda. Hyperscalers bundle compute with proprietary managed services (storage, networking, identity, machine learning platforms) and charge premium prices for GPU instances. They also typically include egress fees, which can add substantial cost to data-intensive AI training pipelines.
| Attribute | Lambda | AWS | Google Cloud | Microsoft Azure |
|---|---|---|---|---|
| H100 on-demand price (approx.) | ~$3.99 to $4.29/hr | ~$12.29/hr (p5) | ~$11.06/hr (a3-highgpu-8g) | ~$98.32 per 8-GPU/hr (ND H100 v5) |
| Egress fees | None | Yes (tiered) | Yes (tiered) | Yes (tiered) |
| Multi-node InfiniBand | Yes | Yes (EFA, not InfiniBand) | Yes (TPU-specific) | Yes |
| Reserved-vs-on-demand pricing spread | Modest | Large | Large | Large |
| Primary distinction | GPU-only specialization | Managed services breadth | TPU access | Azure AI services + GPU |
Hyperscalers retain advantages in geographic coverage, managed services integration (databases, identity, observability), and procurement processes that align with how large enterprises buy cloud. Lambda's pitch to those enterprises is that for the GPU compute layer specifically, a neocloud provides better unit economics, faster provisioning, and access to the latest accelerators sooner.
Lambda's disclosed customer base spans research universities, AI startups, and large enterprises. Publicly confirmed customers or users have included Apple, MIT, Stanford University, Harvard University, Caltech, Kaiser Permanente, Tencent, the U.S. Department of Defense, and Microsoft (under the 2025 infrastructure agreement). Lambda has also stated that its infrastructure has supported workloads from OpenAI, xAI, Anthropic, Amazon, and Google, though the specific nature of those relationships varies.
The September 2025 NVIDIA leaseback deal made NVIDIA Lambda's largest single customer, with NVIDIA using Lambda's infrastructure to run its own GPU-intensive workloads. By late 2025, multiple frontier AI labs had used Lambda clusters for some portion of their training or inference, although none of these labs ran the bulk of their flagship training runs on Lambda. Most frontier labs rely on a portfolio of providers, with Lambda providing surge or experiment capacity rather than primary production capacity.
Common use cases for Lambda's cloud include:
Lambda's Lambda Stack has been adopted by more than 50,000 machine learning teams globally, including research groups that install it on their own hardware rather than using Lambda Cloud.
Lambda runs an asset-heavy business model: it raises debt and equity, purchases NVIDIA accelerators in volume, signs long-term power and colocation contracts, and rents the resulting capacity to AI customers. The economics of the model resemble a data center operator more than a software company. Several characteristics define Lambda's economic profile.
Sacra Research estimated $425 million in 2024 revenue, growing to a $760 million annualized run rate by Q3 2025, up 79 percent year over year, and roughly $500 million annualized in May 2025 as an interim data point. Sacra valued Lambda at approximately 5.9 times trailing revenue in October 2025, compared to CoreWeave's 23.4 times multiple following its IPO, suggesting the private market assessed Lambda at a significant discount to its publicly traded competitor despite similar GPU access and business models.
Lambda has received broadly positive reviews from the developer community for pricing transparency, the absence of egress fees, and the quality of the Lambda Stack pre-configuration. GPUCloudList awarded the platform 8.5 out of 10 in a 2026 review, citing competitive H100 pricing and "zero setup friction" as major strengths. SemiAnalysis's ClusterMAX 2.0 rating placed Lambda in the Silver tier as of late 2025, behind CoreWeave (Platinum) and Crusoe and Fluidstack (Gold) but ahead of many smaller specialists.
The primary recurring criticism in 2024 and into 2025 was GPU availability. Users documented frequent "temporarily unavailable" status for popular instance types, with some multi-GPU H100 configurations impossible to provision on short notice during peak demand periods. One developer writing on Medium described a 26-hour wait for a four-GPU H100 configuration after successfully using two GPUs the previous day. Lambda addressed availability issues partly through the expansion of its cluster inventory and longer-commitment reservation products, which improved reported availability for dedicated cluster customers even as on-demand availability remained unpredictable.
A secondary criticism involved performance consistency: some users reported that long-running jobs encountered unexpected slowdowns requiring active monitoring and checkpoint management. Lambda's response has been to emphasize its 1-Click Clusters and dedicated reservation paths, where single-tenant hardware reduces the variability associated with shared on-demand pools.
On the business side, Lambda's revenue growth attracted analyst attention. The 79 percent year-over-year run-rate growth reported in late 2025 outpaced most large cloud providers, although Lambda's absolute revenue remained much smaller than CoreWeave's. Analysts at PM Insights and Forge Global noted that secondary share prices for Lambda climbed in the months leading up to and following the Series E, reflecting investor enthusiasm for the IPO narrative.
Several structural limitations were noted by analysts and users as of 2025 and early 2026:
GPU availability on on-demand instances remains constrained during periods of peak demand. Unlike hyperscalers with vast reserved capacity across dozens of regions, Lambda's footprint is smaller, and popular configurations can sell out quickly. Teams with hard deadlines or time-sensitive training runs often prefer reserved cluster contracts over on-demand access.
Lambda does not offer spot instances (preemptible compute) as of early 2026, a feature that RunPod and AWS both provide and that can reduce costs significantly for fault-tolerant workloads.
Geographic diversity is limited compared to hyperscalers. Lambda operates data centers in the United States and, as of early 2026, had not published a multi-region deployment map comparable to AWS or Google Cloud. This limits latency optimization for users outside North America and creates data residency constraints for international enterprise customers.
Lambda's heavy reliance on NVIDIA creates supply-chain risk. If NVIDIA were to reduce allocations, raise prices, or change partnership terms, Lambda's cost structure and GPU availability would be directly affected. CoreWeave faces the same dependency, but with a larger and more diversified inventory. Lambda has not publicly announced support for accelerators from AMD, Intel, or other alternative chip vendors as of mid-2026, although the company has stated that it evaluates the broader silicon landscape.
Customer concentration is also a risk. The September 2025 NVIDIA leaseback agreement made NVIDIA Lambda's largest single customer, and the November 2025 Microsoft deal added another large concentrated relationship. While these contracts provide revenue visibility, they expose Lambda to risk if either counterparty alters scope or timing.
The wind-down of the standalone Inference API product in early 2026 removed a low-friction entry point for developers who wanted to pay only for tokens rather than manage full GPU instances. While Lambda directs these users to its instance marketplace, the shift increases the minimum cost threshold for small-scale inference workloads.
Finally, Lambda's economics depend on continued capital availability for asset-heavy data center buildouts. If the broader market reassesses neocloud unit economics, valuations and debt terms could tighten quickly, as happened to several smaller GPU clouds during 2024.
Lambda's near-term outlook is shaped by three converging dynamics. First, the Series E financing, the Microsoft contract, the NVIDIA leaseback, and the planned Kansas City AI factory all suggest the company is moving from a developer-focused on-demand GPU cloud into a wholesale infrastructure operator that supplies a small number of very large customers alongside its long tail of researchers. Second, Lambda's stated goal of roughly 3 gigawatts of contracted power by 2030 implies a multi-year capital expenditure program comparable in scope to mid-sized data center REITs. Third, the leadership reorganization in May 2026, which brought in operating executives from telecommunications and large-scale technology, signals that Lambda's board is preparing the company for public-market scrutiny and operational scale.
The public market reception of CoreWeave's 2025 IPO, alongside subsequent volatility in neocloud valuations, will likely shape Lambda's IPO timing and pricing. Analysts at Forge Global and PM Insights have argued that a successful Lambda IPO could push the company's market capitalization well above the Series E valuation of $5.9 billion, although the comparison is sensitive to whether public investors continue to value GPU clouds on revenue multiples close to CoreWeave's.
Longer term, Lambda's strategic position depends on whether AI compute demand continues to outpace supply, whether NVIDIA maintains its dominance over the accelerator market, and whether power availability remains the binding constraint on data center growth. If any of these dynamics shift, Lambda will have to adapt its asset-heavy model to changes in either the chip stack or the energy stack underneath it.