Lightning AI
Last reviewed
May 25, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 3,369 words
Lightning AI is a New York based artificial intelligence platform company founded by William Falcon, the creator of the PyTorch Lightning deep learning framework.[1][2] The company began in 2019 as Grid.ai and rebranded to Lightning AI in June 2022 alongside a $40 million Series B round led by Coatue Management.[3] Lightning AI develops a commercial cloud platform, Lightning Studios, plus a suite of open source libraries that includes PyTorch Lightning, Lightning Fabric, LitGPT, LitServe, and LitData.[4][5] The company is distinct from the PyTorch Lightning framework itself: Lightning AI sells managed cloud compute and enterprise tooling, while PyTorch Lightning is the open source training framework that Falcon released in 2019 during his PhD at New York University.[6][7]
By late 2024 the company reported more than 230,000 individual developer users across roughly 3,200 organizations, with $103 million in total venture funding.[8] In January 2026 Lightning AI completed a merger with GPU infrastructure provider Voltage Park, creating a combined entity valued above $2.5 billion that owns and operates a fleet of more than 35,000 H100, B200, and GB300 accelerators.[9][10]
William Falcon and Luis Capelo incorporated Grid.ai in 2019.[11] Falcon was midway through a PhD at NYU at the time, and Capelo had previously led machine learning at the cosmetics company Glossier and data products at Forbes.[12][11] The first product was a cloud training service that let researchers run PyTorch Lightning jobs across hundreds of GPUs or TPUs without writing infrastructure code, positioning Grid as the commercial complement to the PyTorch Lightning framework that Falcon had open sourced in July 2019.[13][7]
In October 2020 Grid emerged from stealth with an $18.6 million Series A led by Index Ventures, with participation from Bain Capital Ventures and firstminute capital.[14] At the time the company had roughly 25 employees and was operating in private beta.[14] Bryan Offutt of Index Ventures joined the board.[3]
On 16 June 2022 Grid.ai announced that it had rebranded as Lightning AI and raised a $40 million Series B led by Coatue, with participation from Index Ventures, Bain Capital Ventures, the Chainsmokers backed Mantis VC, and firstminute capital.[3] Coatue General Partner Caryn Marooney joined the board.[3] The rebrand reflected a broadening of scope: rather than only selling cloud training jobs, the company began describing its product as an "operating system for AI" that would unify development, training, and deployment.[15] By that point PyTorch Lightning had been downloaded around 20 million times and was reportedly used in production by around 10,000 companies.[3] The company had grown to roughly 60 employees.[3]
The rebrand also coincided with the public launch of the open source Lightning App framework, which let developers stitch together multi-step AI pipelines on the Lightning cloud platform.[15]
On 13 December 2023 Lightning AI announced the general availability of Lightning AI Studios, an enterprise grade cloud development environment.[16] The company described Studios as the culmination of three years of work on what it called a next generation paradigm for building AI products.[16] Studios provided persistent virtual machines with attached cloud GPUs that users could access through a browser based VS Code style interface or a local connection.[17] Falcon compared the product approach to the iPhone in that it bundled previously separate developer tools (notebooks, training, fine tuning, serving, dataset preparation) into a single environment.[18]
In February 2024 Lightning AI signed a strategic collaboration agreement with Amazon Web Services, and in May 2024 the Studio platform launched on AWS Marketplace.[19][20]
On 21 November 2024 Lightning AI announced a $50 million round from Cisco Investments, J.P. Morgan, K5 Global, and NVIDIA, bringing total funding to roughly $103 million.[8] At the time the company reported 240,000 individual users across about 2,000 organizations and stated that PyTorch Lightning had passed 160 million cumulative downloads.[8] In a separate filing summary, the company said it had 230,000 developers and 3,200 organizations using the platform.[21] Falcon told TechCrunch that the firm was on track to reach $10 million to $20 million in annual recurring revenue by the end of 2025.[21]
On 21 January 2026 Lightning AI and Voltage Park, a large scale GPU as a service provider, announced the completion of a merger under the Lightning AI brand.[9] The combined entity was reported at a valuation above $2.5 billion with annual recurring revenue surpassing $500 million.[10][22] The merged company owns and operates a fleet of more than 35,000 H100, B200, and GB300 GPUs across multiple data centres, accessible through the Lightning software stack.[9] Voltage Park itself had acquired the GPU marketplace TensorDock in April 2025, and that capacity was folded into the combined fleet.[23]
The combined company reported a user base of more than 400,000 individual developers, with existing contracts and deployments preserved under the merger.[9]
William Falcon is the founder and chief executive of Lightning AI. He was born in Venezuela, emigrated to the United States as a child, served in the United States Navy where he undertook SEAL training, and then studied at Columbia University.[12][11] He graduated from Columbia magna cum laude in May 2018 with a B.A. in computer science and statistics and a mathematics minor.[24] As a Columbia undergraduate he worked in the Center for Theoretical Neuroscience under Liam Paninski on neural decoding from the brain and retina.[24][12] He also co founded the financial literacy startup NextGenVest, which was acquired by CommonBond in December 2018.[25]
Falcon began doctoral studies at the NYU Center for Data Science in September 2018, focusing on deep learning and self supervised learning under the advisement of Yann LeCun and Kyunghyun Cho at the NYU CILVR Lab (Computational Intelligence, Learning, Vision, and Robotics).[24][26] His PhD was funded by Google DeepMind and the National Science Foundation.[24]
Falcon began work on what became PyTorch Lightning in 2015 while at Columbia, then open sourced the project in July 2019 during his joint affiliation with NYU and Facebook AI Research (later Meta AI).[24][11] He has cited the desire to remove engineering boilerplate from research as the core motivation for the framework.[27] After founding Grid.ai in 2019 with Luis Capelo and securing an $18.6 million Series A in 2020, Falcon assumed the chief executive role that he continues to hold under the Lightning AI brand.[11][14]
Lightning Studios is the company's flagship commercial cloud product. Each Studio is a persistent virtual machine with attached GPU resources that a user can open through a web browser or attach to a local VS Code session.[17] The product is designed so that environment state (installed packages, datasets, model checkpoints) persists across sessions, and users can swap the attached GPU type without losing work, including hot swapping from a CPU only instance up to multi GPU configurations.[17] Studios also exposes specialised "apps" for multi node training, distributed data preparation, model serving, and hosting AI web applications.[16] The platform launched in December 2023 after what the company described as three years of development on the underlying environment.[16] A free tier is available without a credit card; paid tiers add larger GPUs and longer running jobs.[17]
Studios are also used as a distribution mechanism for templates: a user can publish a Studio with a preconfigured environment, dataset, and notebook that other users can clone in one click and run on their own attached GPU.[17] This template model is a primary mechanism through which the company distributes reproductions of open source models such as Llama and Mistral, since the same Studio template can be opened and modified by any user without local installation.[4][17] In February 2024 Lightning AI signed a strategic collaboration agreement with AWS, and in May 2024 the Studio platform was added to AWS Marketplace, allowing enterprise customers to draw on AWS commit credits to pay for Lightning usage.[19][20]
PyTorch Lightning is the open source training framework that gives the company its name. It provides a high level Trainer abstraction over PyTorch that handles distributed strategies such as Fully Sharded Data Parallel (FSDP) and DeepSpeed, along with mixed precision, checkpointing, logging, and callbacks.[28][7] Users subclass a LightningModule to define training, validation, and test steps, and the Trainer object then takes responsibility for the distributed, hardware, and precision logic.[7] Falcon released the framework in July 2019 and continues to lead its development.[7][27] Cumulative downloads stood at around 20 million at the 2022 Series B and exceeded 160 million at the time of the 2024 funding announcement, roughly an eightfold increase in about two and a half years.[3][8] The detailed history and design of the framework are covered in a separate article (PyTorch Lightning).
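The division of labour between a user-defined module and a framework-owned trainer can be illustrated with a small standard-library sketch. This is not the PyTorch Lightning API: `ToyModule`, `ToyTrainer`, and their methods are hypothetical names invented for illustration, and the "model" is a single weight fitted by plain gradient descent. The point is only that the module states what one training step computes, while the trainer owns the loop, the update rule, and the logging.

```python
"""Toy illustration of a module/trainer split; NOT the PyTorch Lightning API."""


class ToyModule:
    """User-owned code: the model parameter and the per-batch step."""

    def __init__(self):
        self.w = 0.0  # single weight for the model y = w * x

    def training_step(self, batch):
        # Return the mean squared error and its gradient with respect to w.
        n = len(batch)
        loss = sum((self.w * x - y) ** 2 for x, y in batch) / n
        grad = sum(2 * (self.w * x - y) * x for x, y in batch) / n
        return loss, grad


class ToyTrainer:
    """Framework-owned code: the epoch loop, the update rule, and logging."""

    def __init__(self, max_epochs, lr):
        self.max_epochs = max_epochs
        self.lr = lr
        self.history = []  # mean training loss recorded once per epoch

    def fit(self, module, batches):
        for _ in range(self.max_epochs):
            epoch_losses = []
            for batch in batches:
                loss, grad = module.training_step(batch)
                module.w -= self.lr * grad  # plain gradient descent update
                epoch_losses.append(loss)
            self.history.append(sum(epoch_losses) / len(epoch_losses))
        return module


# Hypothetical data drawn from y = 3x, split into two batches.
batches = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
trainer = ToyTrainer(max_epochs=50, lr=0.01)
model = trainer.fit(ToyModule(), batches)
```

In this sketch, swapping the update rule or adding checkpointing would only touch `ToyTrainer`, which is the kind of separation the Trainer abstraction is described as providing at much larger scale.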
Lightning Fabric is a lighter weight library that exposes the same distributed training primitives used inside the PyTorch Lightning Trainer (DDP, FSDP, DeepSpeed, mixed precision) as standalone functions that can be added to existing PyTorch code with minimal modification.[29] The official documentation positions Fabric as the path for users who already have a working training loop and want to scale it without adopting the full Trainer abstraction.[29] According to the documentation, conversion typically requires changing about five lines of code.[29] Fabric ships in the same Python package as PyTorch Lightning.[28]
LitGPT is an Apache 2.0 licensed collection of from scratch implementations of more than 20 open large language models with reference recipes for pretraining, fine tuning, and deployment.[4] Supported families include Llama (3, 3.1, 3.2, and 3.3), Mistral 7B and Mixtral, Gemma, Microsoft's Phi, Alibaba's Qwen, Code Llama, and TinyLlama.[4] The repository emphasises single file, no abstraction implementations powered by Lightning Fabric so that users can read and modify the model code directly.[4] LitGPT is the successor to Lit LLaMA, an earlier 2023 reimplementation of Meta's LLaMA based on Andrej Karpathy's nanoGPT codebase.[30]
LitServe is an Apache 2.0 licensed Python serving framework for arbitrary AI models, released in 2024.[31] It builds on FastAPI but adds AI specific features including dynamic batching, GPU autoscaling, streaming, and asynchronous concurrency, and the project claims a roughly 2x speedup over vanilla FastAPI for typical inference workloads.[31] LitServe is designed to support any model type (LLMs, vision, audio, classical machine learning, and RAG pipelines) and can be self hosted or deployed through Lightning Studios with managed autoscaling.[31] OpenAI compatible endpoints are provided for chat completion style services so that existing OpenAI client code can be redirected to a LitServe endpoint without modification.[31]
The serving model is request driven but multi worker by default: each worker process can hold a model resident on a GPU, and a router fans incoming requests out to workers with optional micro batching to amortise GPU launch overhead.[31] The framework deliberately exposes the inference loop as a Python class rather than hiding it behind a configuration file, so that users retain control over preprocessing, postprocessing, and any custom routing logic.[31] As of December 2025 the public repository had passed 41 releases, with v0.2.17 published on 23 December 2025.[31]
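The routing and micro batching idea described above can be sketched with the standard library alone. This is a toy model of the pattern, not LitServe code: `ToyWorker`, `route`, and the hook names `preprocess`, `predict_batch`, and `postprocess` are hypothetical, the "model" simply squares its input, and the routing is synchronous round robin rather than anything the real framework is known to do. It shows how grouping requests reduces the number of model invocations while the inference loop stays an ordinary Python class.

```python
"""Toy sketch of class-based serving with micro batching; NOT the LitServe API."""


class ToyWorker:
    """Stand-in for a worker process holding one model; hook names are hypothetical."""

    def __init__(self):
        self.model_calls = 0  # number of batched model invocations so far

    def preprocess(self, request):
        return float(request["input"])

    def predict_batch(self, inputs):
        # One call handles the whole batch, standing in for one GPU launch.
        self.model_calls += 1
        return [x * x for x in inputs]

    def postprocess(self, output):
        return {"output": output}

    def handle_batch(self, requests):
        inputs = [self.preprocess(r) for r in requests]
        outputs = self.predict_batch(inputs)
        return [self.postprocess(o) for o in outputs]


def route(requests, workers, max_batch_size):
    """Fan micro batches out to workers round robin, preserving request order."""
    responses = []
    starts = range(0, len(requests), max_batch_size)
    for i, start in enumerate(starts):
        worker = workers[i % len(workers)]
        responses.extend(worker.handle_batch(requests[start:start + max_batch_size]))
    return responses


# Ten pending requests, two workers, at most four requests per micro batch.
pending = [{"input": i} for i in range(10)]
workers = [ToyWorker(), ToyWorker()]
responses = route(pending, workers, max_batch_size=4)
total_model_calls = sum(w.model_calls for w in workers)
```

Here ten requests are served with three batched model calls instead of ten single calls. A production server would also need a timeout so that a partially filled batch is not held back indefinitely, which this sketch omits.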
LitData is an Apache 2.0 licensed library for accelerating data loading in distributed training. It transforms raw datasets into an optimised streaming binary format that can be read from S3, Google Cloud Storage, or Azure Blob, and provides a StreamingDataset class that distributes shards correctly across ranks for multi GPU and multi node training.[32] The official documentation claims throughput improvements of up to 20x for cloud streamed datasets compared with naive object store reads, mainly through binary chunking and prefetching that hides storage latency.[32] The project also provides a distributed map operator for preprocessing tasks such as image resizing, embedding generation, and web scraping across a fleet of machines, with the goal of compressing weeks long preprocessing jobs into hours or minutes when scaled across many workers.[32] LitData was first published in 2023 and is integrated into PyTorch Lightning, Lightning Fabric, and pure PyTorch workflows.[32] Pause and resume support allows long preprocessing pipelines to recover from machine failures without restarting from scratch.[32]
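The two ideas in this description, packing samples into binary chunks and giving each rank a disjoint set of chunks, can be sketched as follows. This is a standard-library illustration under stated assumptions, not LitData's on-disk format or its actual assignment scheme: the functions `encode_chunk`, `decode_chunk`, `make_chunks`, and `chunks_for_rank` are hypothetical, samples are plain integers, and chunks are assigned round robin.

```python
"""Toy sketch of binary chunking and per-rank chunk assignment; NOT the LitData API."""
import struct


def encode_chunk(samples):
    # Layout (an assumption for this sketch): uint32 count, then int64 samples.
    return struct.pack(f"<I{len(samples)}q", len(samples), *samples)


def decode_chunk(blob):
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}q", blob, 4))


def make_chunks(samples, chunk_size):
    """Split the dataset into encoded chunks of at most chunk_size samples."""
    return [
        encode_chunk(samples[i:i + chunk_size])
        for i in range(0, len(samples), chunk_size)
    ]


def chunks_for_rank(num_chunks, world_size, rank):
    """Round robin assignment: rank r reads chunks r, r + world_size, ..."""
    return list(range(rank, num_chunks, world_size))


samples = list(range(20))
chunks = make_chunks(samples, chunk_size=3)  # seven chunks, the last one short
world_size = 2
per_rank = [
    [s for c in chunks_for_rank(len(chunks), world_size, rank)
     for s in decode_chunk(chunks[c])]
    for rank in range(world_size)
]
```

Every sample is read by exactly one rank, but the ranks end up with different sample counts (11 and 9 here). A real streaming dataset has to handle that imbalance as well as shuffling and prefetching, none of which this sketch attempts.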
Beyond its commercial product, Lightning AI maintains a portfolio of open source libraries and reproductions of widely used models.
Lit LLaMA, released in 2023 under Apache 2.0, was a single file reimplementation of Meta's LLaMA architecture based on nanoGPT.[30] The project supported FlashAttention, Int8 and 4 bit GPTQ quantisation, LoRA fine tuning, and LLaMA Adapter style methods.[30] Lit LLaMA is no longer actively maintained; its successor is LitGPT.[30]
LitGPT extends the same single file design philosophy to more than 20 model families. The repository ships reproductions and fine tuning recipes for Llama 3 through 3.3, Mistral, Mixtral, Gemma, Phi, Qwen, Code Llama, and TinyLlama among others, all under Apache 2.0.[4] The implementations are powered by Lightning Fabric rather than the full Trainer abstraction so that the model code remains compact and hackable.[4]
PyTorch Lightning, Lightning Fabric, LitServe, and LitData are all distributed under the Apache 2.0 licence and are maintained on the Lightning AI GitHub organisation.[28][29][31][32] In December 2023 Lightning AI joined the AI Alliance, an industry consortium organised by IBM and Meta to promote open AI development.[33]
Lightning AI has disclosed three priced equity rounds plus the 2026 stock for stock merger with Voltage Park.
| Date | Round | Amount | Lead | Other investors |
|---|---|---|---|---|
| October 2020 | Series A | $18.6M[14] | Index Ventures[14] | Bain Capital Ventures, firstminute capital[14] |
| June 2022 | Series B | $40M[3] | Coatue[3] | Index Ventures, Bain Capital Ventures, Mantis VC, firstminute capital[3] |
| November 2024 | $50M round (described as Series C class)[8] | $50M[8] | (multi investor)[8] | Cisco Investments, J.P. Morgan, K5 Global, NVIDIA[8] |
| January 2026 | Merger with Voltage Park[9] | (stock for stock)[9] | n/a | n/a |
Total disclosed cash funding before the Voltage Park merger was approximately $103 million.[8] The 2026 merger produced a combined enterprise valuation above $2.5 billion and an annual recurring revenue base above $500 million.[10][22]
Board representation has tracked the rounds. Bryan Offutt of Index Ventures joined at Series A in 2020; Caryn Marooney of Coatue joined at Series B in 2022.[3]
Lightning AI competes in two overlapping markets: managed AI development environments and managed inference serving. The most direct comparisons are Modal, Replicate, and Anyscale.
Modal (Modal Labs) provides a serverless Python function abstraction in which users decorate functions and Modal handles container packaging, GPU allocation, and scale to zero. The model is request driven rather than session driven, which differs from Lightning Studios where the unit of work is a persistent virtual machine that the user opens, modifies, and closes. Modal targets backend Python services and inference; Lightning targets the full development lifecycle including notebook style iteration on GPUs.[34] For inference specifically, Modal's primary primitive is a function with autoscaling, while Lightning's primary primitive is a LitServe deployment running inside a Studio or as a managed endpoint.[31][34]
Replicate focuses on inference for prebuilt open source models. Replicate users typically pick a community published model, send HTTP requests, and pay per inference second, with a Cog tool for packaging custom models. Lightning Studios and LitServe target users who write and fine tune their own models rather than primarily consuming third party ones; LitServe is a serving framework rather than a hosted model marketplace.[31] The published models on Replicate are typically wrapped in Cog containers, while Lightning users typically push code into a Studio template that other users can clone.[17]
Anyscale is the commercial company behind Ray and offers managed Ray clusters for distributed training, hyperparameter search, and serving. Anyscale's primary integration point is the Ray API, while Lightning's primary integration points are PyTorch Lightning, Fabric, LitServe, and LitData. Both companies position themselves as full lifecycle platforms, but Lightning is closer to the PyTorch ecosystem and Anyscale is closer to the Ray ecosystem.[35] Lightning's distributed training abstractions wrap PyTorch ecosystem backends such as FSDP and DeepSpeed; Anyscale's wrap Ray Train and Ray Serve.[29][35]
Lightning also overlaps with Hugging Face Text Generation Inference (TGI) and the broader Hugging Face PEFT toolchain on the model side, although Lightning's products span training, data preparation, and serving rather than focusing on a single stage.[31][4] After the January 2026 merger with Voltage Park, Lightning's offering also overlaps with raw GPU bare metal providers such as CoreWeave and Lambda Labs at the infrastructure layer, since the combined company now owns and operates GPU capacity in addition to selling software.[9][23]