Modal is a serverless cloud computing platform designed for running compute-intensive applications in artificial intelligence, machine learning, and data processing. The platform allows developers to write standard Python code and execute it in the cloud with automatic containerization, scaling, and GPU provisioning, eliminating the need to manage servers, Kubernetes, or Docker configurations. Modal is developed by Modal Labs, Inc., headquartered in New York City with offices in San Francisco and Stockholm.
Founded in January 2021 by Erik Bernhardsson, who was joined by co-founder Akshat Bubna later that year, Modal has raised over $110 million in venture capital funding and reached a $1.1 billion valuation (unicorn status) with its September 2025 Series B round. As of February 2026, the company was reported to be in talks for a new funding round at a valuation of approximately $2.5 billion. The platform serves thousands of customers, including Meta, Scale, Substack, Ramp, and Lovable, and generates approximately $50 million in annualized revenue.
Modal was founded in January 2021 by Erik Bernhardsson, who serves as CEO. Akshat Bubna joined as co-founder and CTO in August 2021. The company is legally incorporated as Modal Labs, Inc.
Bernhardsson is a Swedish software engineer who spent seven years at Spotify (2008 to 2015), where he built the core of the music recommendation system and led a team of 20 engineers. During his time at Spotify, he created two widely used open-source projects: Luigi, a Python workflow scheduler for building complex pipelines of batch jobs (over 17,000 stars on GitHub), and Annoy (Approximate Nearest Neighbors Oh Yeah), a C++/Python library for fast approximate nearest neighbor search in high-dimensional spaces. Annoy remains in use at Spotify to power billions of music recommendations.
After Spotify, Bernhardsson served as CTO of Better.com from 2015 to 2020, where he scaled the engineering team from 1 to 300 people and built AI/ML systems for mortgage processing. He began developing Modal during the COVID-19 pandemic, motivated by his repeated frustration with the gap between local development and cloud execution in data and ML workflows.
Bubna won a gold medal at the International Olympiad in Informatics (IOI) in 2014, becoming the first Indian student to achieve this distinction. Before joining Modal, he worked as an early engineer at Scale AI. Bubna leads Modal's engineering team, which includes multiple International Olympiad in Informatics gold medalists.
Bernhardsson's core insight was that data and ML teams have fundamentally different infrastructure needs from traditional backend engineering teams. They require frequent infrastructure changes, bursty compute demands, and access to specialized hardware like GPUs and high-memory systems. Existing tools forced a slow cycle of building containers, pushing them to a registry, triggering jobs, and downloading logs, which destroyed the fast feedback loop that productive development requires.
Rather than building on top of existing systems like Docker and Kubernetes, Bernhardsson decided to build Modal's infrastructure from scratch. The first two years of development focused on foundational components: a custom container runtime, a proprietary filesystem, a scheduler, and an image builder, all written in Rust for performance. The company operated in a closed beta during this period.
Modal's first significant traction came when Stable Diffusion launched in August 2022, as developers needed a fast way to run GPU-intensive image generation workloads without managing infrastructure. This validated the platform's serverless GPU model and attracted its initial wave of users.
Modal has raised over $110 million in total venture capital funding across three rounds, reaching unicorn status in September 2025.
| Round | Date | Amount | Lead Investor | Total Raised | Valuation |
|---|---|---|---|---|---|
| Seed | Early 2022 | $7 million | Amplify Partners | $7 million | Not disclosed |
| Series A | October 2023 | $16 million | Redpoint Ventures | $23 million | ~$154 million (post-money) |
| Series B | September 2025 | $87 million | Lux Capital | ~$111 million | $1.1 billion (post-money) |
Other investors across these rounds include Definition Capital, Creandum, Essence VC, and angel investors Elad Gil and Neha Narkhede. The Series A coincided with Modal's general availability launch in October 2023.
In February 2026, TechCrunch reported that Modal was in talks with General Catalyst to lead a new funding round at a valuation of approximately $2.5 billion, which would more than double the company's valuation from less than five months prior. CEO Erik Bernhardsson characterized the discussions as general conversations rather than active fundraising. At the time, Modal's annualized revenue run rate was reported to be approximately $50 million.
The company had approximately 14 employees at the time of its Series A in late 2023, growing to around 79 employees by 2025.
Modal's platform is built on custom infrastructure written primarily in Rust, designed from the ground up to minimize container startup times and maximize GPU utilization. The architecture differs substantially from standard container orchestration platforms in several ways.
Custom Container Runtime: Modal uses Google's gVisor as its container runtime instead of standard Docker containers. gVisor is a sandboxed container runtime that intercepts application system calls and acts as a guest kernel, providing stronger isolation than traditional containers that share the host system's kernel. Modal was the first company to run gVisor with GPUs at scale; at the time, GPU support in gVisor was highly experimental, and Modal's engineering team has contributed upstream improvements to make gVisor GPU support production-ready.
FUSE-Based Filesystem: Modal built a custom filesystem using FUSE (Filesystem in Userspace) with content-addressed storage. Files are hashed and stored by their hash value, which eliminates duplication across container images. When a container starts, it does not need to pull an entire image; instead, it lazily fetches only the files that are actually accessed. Modal's data shows that applications typically access only a small fraction of their container image contents. An in-memory index tracks file hashes, and an aggressive local SSD cache (with approximately 100 microsecond latency) stores frequently used files, compared to approximately 2 millisecond latency for network storage. The filesystem was initially prototyped in Python, then rewritten in Rust for production performance.
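The content-addressing idea behind this deduplication can be illustrated with a minimal sketch. This is a hypothetical toy, not Modal's implementation (which is written in Rust and operates at the filesystem level); it only shows why hashing file contents makes identical files across different images collapse into a single stored blob:

```python
import hashlib

class ContentAddressedStore:
    """Toy content-addressed blob store: blobs are keyed by the hash of
    their contents, so identical files from different container images
    are stored only once."""

    def __init__(self):
        self.blobs = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # dedupe: no-op if already stored
        return digest

    def get(self, digest: str) -> bytes:
        return self.blobs[digest]

# Two "images" that ship the same library file store it only once.
store = ContentAddressedStore()
h1 = store.put(b"libfoo contents")
h2 = store.put(b"libfoo contents")  # same bytes -> same hash, no new blob
print(h1 == h2, len(store.blobs))   # True 1
```

Because lookups are by content hash, a container that requests a file already present in the local SSD cache never touches the network, which is what makes lazy per-file fetching fast in the common case.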
VolumeFS: A second custom filesystem called VolumeFS provides persistent storage for Modal functions across any region. It uses eventual consistency rather than strict immediate consistency, which enables scalable distributed training and large dataset operations without sacrificing throughput.
Automated Resource Solver: Modal implements an automated GPU resource allocation system using Google's OR-Tools Linear Optimization Package (GLOP). This solver runs every few seconds, analyzing all active workloads and available GPU capacity across AWS, Google Cloud Platform, and Oracle Cloud Infrastructure. It factors in regional constraints, pricing, and capacity to perform both comprehensive and incremental optimization passes, allowing the platform to elastically scale to hundreds of GPUs without users managing any cluster infrastructure.
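The shape of the allocation problem can be sketched as follows. This toy greedy allocator is an illustration only: Modal's actual solver formulates the problem as a linear program with GLOP, and the pool names, capacities, and prices below are hypothetical:

```python
# Illustrative sketch of cost-aware GPU placement. Modal's production
# solver uses linear programming (Google's GLOP); this toy version just
# greedily assigns each workload to the cheapest pool that still has room.
def allocate(workloads, pools):
    """workloads: list of (name, gpus_needed) tuples.
    pools: dict of pool name -> {"capacity": free GPUs, "price": $/GPU-hour}."""
    placement = {}
    for name, needed in sorted(workloads, key=lambda w: -w[1]):
        candidates = [p for p, v in pools.items() if v["capacity"] >= needed]
        if not candidates:
            placement[name] = None  # no pool has enough free capacity
            continue
        best = min(candidates, key=lambda p: pools[p]["price"])
        pools[best]["capacity"] -= needed
        placement[name] = best
    return placement

pools = {
    "oci-us": {"capacity": 8, "price": 2.50},
    "aws-eu": {"capacity": 4, "price": 3.10},
}
placement = allocate([("train", 8), ("infer", 2)], pools)
print(placement)  # {'train': 'oci-us', 'infer': 'aws-eu'}
```

A real linear-programming formulation additionally handles fractional relaxations, regional affinity constraints, and incremental re-solves as capacity and demand shift, which is why a solver rather than a greedy pass is used in production.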
Modal's stated goal is sub-second container startup. In practice, cold start times for GPU-enabled containers typically range from 1 to 4 seconds, significantly faster than the minutes-long startup times common with Docker-based deployments. The platform achieves this through the techniques described above: lazy per-file fetching instead of full image pulls, content-addressed deduplication, and aggressive local SSD caching.
Modal reports that its infrastructure can build a 100 GB container image and then boot up 100 instances of that container within seconds.
Modal's primary interface is a Python SDK that uses decorators to define cloud infrastructure. Developers annotate their Python functions with Modal decorators, and the platform handles containerization, environment setup, and remote execution automatically. There are no YAML files, Dockerfiles, or Kubernetes manifests required.
A basic Modal function looks like this:
```python
import modal

app = modal.App("example")

@app.function()
def square(x):
    return x ** 2
```
Environment configuration is expressed in Python rather than configuration files:
```python
@app.function(
    image=modal.Image.debian_slim().pip_install(["numpy", "pandas"]),
    gpu="A100",
    timeout=300,
)
def train_model(data):
    # This runs on an A100 GPU in the cloud
    ...
```
Key SDK features include:
| Feature | Description |
|---|---|
| @app.function() | Converts a Python function to a cloud-executed function with configurable resources |
| image= | Defines the container environment using Python code (pip packages, apt packages, custom Docker images) |
| gpu= | Specifies GPU type and count (e.g., "A100", "H100:2", modal.gpu.A100(count=4)) |
| schedule= | Sets up cron-like scheduled execution (e.g., modal.Period(hours=1)) |
| .map() | Distributes work across multiple containers in parallel |
| @app.cls() | Defines a class with lifecycle hooks and shared state across function calls |
| @modal.web_endpoint() | Deploys a function as an HTTPS endpoint |
| modal.Volume() | Attaches persistent storage to functions |
| modal.Secret() | Securely manages and injects environment variables and credentials |
As of late 2025, Modal has also released alpha SDKs for JavaScript/TypeScript and Go, signaling plans for broader language support beyond Python.
Modal enables developers to deploy and scale LLM inference, audio models, image generation models, and other AI models as serverless endpoints. The platform provides instant autoscaling from zero to hundreds of GPUs and back down to zero, with per-second billing. Users bring their own models and frameworks (such as vLLM, Hugging Face Transformers, or custom serving code), and Modal handles the underlying infrastructure.
The platform supports model training and fine-tuning on single-node or multi-node GPU clusters. Users can provision A100, H100, or H200 GPUs instantly without reservation or quota management. Modal's per-second billing means users pay only for the time their training job is actively running.
Modal Sandboxes provide isolated execution environments for running untrusted code from LLMs or users. Built on gVisor's security isolation, sandboxes allow developers to programmatically spin up ephemeral containers with controlled resource limits and filesystem access. Meta has used Modal Sandboxes to run thousands of concurrent sandboxed environments for reinforcement learning workflows.
Modal supports large-scale batch processing workloads that can scale to thousands of containers on demand. The .map() and .starmap() operations distribute work across containers in parallel, with automatic retry logic and error handling. Common use cases include data preprocessing, web scraping at scale, and ETL pipelines.
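The semantics of parallel fan-out with per-item retries can be approximated locally with Python's standard library. This is an analogy only, not Modal's implementation: Modal runs each item in a remote container rather than a local thread:

```python
from concurrent.futures import ThreadPoolExecutor

def map_with_retries(fn, items, max_retries=3, workers=8):
    """Run fn over items in parallel, retrying each failed item up to
    max_retries times -- a local stand-in for the fan-out/retry pattern
    that Modal's .map() applies across remote containers."""
    def run_one(item):
        for attempt in range(max_retries):
            try:
                return fn(item)
            except Exception:
                if attempt == max_retries - 1:
                    raise  # out of retries: surface the failure
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(run_one, items))

results = map_with_retries(lambda x: x * x, range(5))
print(results)  # [0, 1, 4, 9, 16]
```

The key property mirrored here is that failures are retried per item rather than per batch, so one flaky input does not force the whole job to rerun.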
Any Modal function can be deployed as a serverless HTTPS endpoint with a single decorator. These endpoints support standard HTTP methods and can serve as REST APIs, webhook handlers, or frontend-facing services. Modal handles TLS termination, load balancing, and autoscaling for these endpoints.
Modal supports cron-like scheduling through the schedule= parameter in function decorators. Scheduled functions can run at fixed intervals or on custom cron expressions, enabling periodic data processing, model retraining, or monitoring tasks.
Modal offers real-time collaborative notebooks that run on cloud GPUs. These notebooks support sharing and provide an interactive development environment for data exploration, prototyping, and experimentation.
Modal's storage primitives include modal.Volume(), which attaches persistent storage to functions across containers and regions.
Modal provides access to a range of NVIDIA GPUs through its multi-cloud infrastructure spanning AWS, Google Cloud Platform, and Oracle Cloud Infrastructure.
| GPU | Memory | Approximate Hourly Cost |
|---|---|---|
| NVIDIA T4 | 16 GB | $0.59 |
| NVIDIA L4 | 24 GB | $0.80 |
| NVIDIA A10 | 24 GB | $1.10 |
| NVIDIA L40S | 48 GB | $1.95 |
| NVIDIA A100 40 GB | 40 GB | $2.10 |
| NVIDIA A100 80 GB | 80 GB | $2.50 |
| NVIDIA RTX PRO 6000 | 48 GB | $3.03 |
| NVIDIA H100 | 80 GB | $3.95 |
| NVIDIA H200 | 141 GB | $4.54 |
| NVIDIA B200 | 192 GB | $6.25 |
All GPU pricing is billed per second with no minimum commitment. Users can request multiple GPUs per function (e.g., 4x H100 or 8x A100) for multi-GPU training or inference workloads.
Modal uses pure usage-based pricing with per-second billing. There are no reserved instances or long-term commitments.
| Resource | Price |
|---|---|
| CPU | $0.0000131 per physical core per second |
| Memory | $0.00000222 per GiB per second |
| Sandbox/Notebook CPU | $0.00003942 per core per second |
| Sandbox/Notebook Memory | $0.00000672 per GiB per second |
GPU pricing varies by GPU type (see GPU Availability table above).
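Under per-second billing, a job's cost is simply rate × duration summed over the resources it uses. A quick arithmetic sketch using the published rates above (the job shape is chosen purely for illustration):

```python
# Cost of a 10-minute job using 2 physical CPU cores, 8 GiB of memory,
# and one A100 80 GB GPU, at the per-second rates listed above.
CPU_RATE = 0.0000131          # $ per physical core per second
MEM_RATE = 0.00000222         # $ per GiB per second
A100_80GB_RATE = 2.50 / 3600  # $2.50/hour expressed per second

seconds = 10 * 60
cost = seconds * (2 * CPU_RATE + 8 * MEM_RATE + A100_80GB_RATE)
print(f"${cost:.2f}")  # roughly $0.44
```

Note that the GPU dominates the bill: the CPU and memory components together contribute only a few cents here.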
| Feature | Starter (Free) | Team ($250/month) | Enterprise (Custom) |
|---|---|---|---|
| Monthly compute credits | $30 | $100 | Custom |
| Workspace seats | 3 | Unlimited | Unlimited |
| Maximum containers | 100 | 1,000 | Custom |
| GPU concurrency | 10 | 50 | Custom |
| Scheduled functions (crons) | 5 | Unlimited | Unlimited |
| Web endpoints | 8 | Unlimited | Unlimited |
| Log retention | 1 day | 30 days | Custom |
Modal offers compute credit programs for qualifying organizations:
| Program | Credits Available |
|---|---|
| Startup credits | Up to $25,000 |
| Academic credits | Up to $10,000 |
Modal operates across multiple cloud providers, including AWS, Google Cloud Platform, and Oracle Cloud Infrastructure. In September 2024, Modal announced that it had selected Oracle Cloud Infrastructure (OCI) as its primary cloud infrastructure provider. By leveraging OCI's bare metal compute instances, Modal reported improvements in performance for inference, fine-tuning, and batch processing workloads.
The platform's automated resource solver handles multi-cloud scheduling, selecting the most cost-effective GPU capacity across providers based on availability, regional constraints, and pricing. Customers interact only with Modal's unified interface and do not need to manage cloud provider accounts or configurations.
Modal is also available through the AWS Marketplace, allowing enterprise customers to apply existing AWS committed spend toward Modal usage.
Modal has achieved SOC 2 Type II certification and supports HIPAA-compliant deployments for healthcare and regulated industry use cases. The platform's use of gVisor as a container runtime provides an additional layer of security isolation compared to standard containers, since gVisor intercepts system calls rather than sharing the host kernel directly.
Modal serves thousands of customers across a range of industries and use cases. Approximately 90% of workloads on the platform are related to AI and ML applications.
| Customer | Use Case |
|---|---|
| Meta | Thousands of concurrent sandboxed environments for reinforcement learning |
| Scale AI | Evaluations, RL environments, and MCP servers with massive volume spikes |
| Substack | Deploying new ML models in hours instead of weeks |
| Ramp | AI-powered financial automation |
| Lovable | AI development platform |
| Quora | AI applications |
| Suno | Music generation |
| OpenPipe | LLM fine-tuning |
| Cognition AI | AI coding agent (Devin) |
| Mistral AI | AI inference |
| Harvey | Legal AI |
| Cartesia AI | Speech AI |
| Allen Institute for AI | Research computing |
Common use case categories include AI model inference, model training and fine-tuning, sandboxed code execution for AI agents, large-scale batch processing, web endpoints and APIs, scheduled jobs, and interactive notebooks.
Modal competes in the AI infrastructure and serverless compute market alongside several categories of providers.
| Competitor | Category | Primary Differentiator |
|---|---|---|
| AWS Lambda / SageMaker | Major cloud provider | Comprehensive MLOps platform; deep AWS ecosystem integration; suited for large enterprises |
| Replicate | Serverless inference | One-click model deployment and sharing; strong for demos and prototyping; weaker cold starts (16 to 60+ seconds) |
| Baseten | Serverless inference | Open-source Truss framework for model packaging; good for early-stage teams; longer cold starts than Modal |
| Fireworks AI | Inference optimization | Custom inference engine with FireAttention; focused on production throughput and latency optimization |
| Together AI | Full-stack AI cloud | Broad model catalog (200+ models); training and inference; strong fine-tuning support |
| RunPod | GPU cloud | Raw GPU access with serverless and pod-based options; competitive pricing |
| Google Cloud Vertex AI | Major cloud provider | Integrated ML platform within Google Cloud ecosystem |
| CoreWeave | GPU cloud | Large-scale GPU clusters; focused on raw infrastructure rather than developer experience |
Modal differentiates itself through three primary advantages: fast cold starts (typically 1 to 4 seconds for GPU-enabled containers), a pure-Python developer experience that requires no Dockerfiles, YAML, or Kubernetes manifests, and elastic per-second-billed autoscaling from zero to hundreds of GPUs.
While Modal itself is a proprietary platform, the company and its founders have contributed to the open-source ecosystem. Erik Bernhardsson created and maintains several open-source projects:
| Project | Description | GitHub Stars |
|---|---|---|
| Luigi | Python workflow scheduler for batch job pipelines (created at Spotify) | 17,000+ |
| Annoy | C++/Python library for approximate nearest neighbor search (created at Spotify) | 13,000+ |
| ANN-Benchmarks | Benchmarking framework for approximate nearest neighbor algorithms | 4,000+ |
Modal's engineering team has also made upstream contributions to gVisor, specifically improving GPU support in the sandboxed container runtime.
Modal operates from three offices: Soho in New York City, Norrmalm in Stockholm, and the Mission District in San Francisco. The engineering team is notable for its concentration of competitive programming talent, including multiple International Olympiad in Informatics gold medalists. The team also includes creators of popular open-source projects (Seaborn, Luigi) and experienced engineering leaders from companies like Spotify, Scale AI, and Better.com.
As of 2025, the company had approximately 79 employees and was actively hiring across engineering, go-to-market, finance, and operations roles.