Cerebras Systems is an American artificial intelligence hardware company that designs and manufactures wafer-scale processors for AI training and inference. Founded in 2016 and headquartered in Sunnyvale, California, Cerebras is best known for producing the Wafer-Scale Engine (WSE), the largest chip ever made, which integrates an entire silicon wafer into a single processor. The company has positioned itself as a leading challenger to NVIDIA in the AI accelerator market, with a particular focus on high-speed inference for large language models.
Cerebras Systems was founded in 2016 by Andrew Feldman, Gary Lauterbach, Michael James, Sean Lie, and Jean-Philippe Fricker. The founding team had deep experience working together; Feldman and Lauterbach had previously co-founded SeaMicro, a server company that designed energy-efficient microservers and was acquired by AMD in 2012 for $334 million. The core insight behind Cerebras was that AI workloads, particularly deep learning training, could benefit from a processor built at a radically larger scale than conventional chips.
Rather than designing small chips and connecting many of them together (the approach taken by GPU-based systems), Cerebras chose to build a single chip spanning an entire 300mm silicon wafer. This approach eliminates the inter-chip communication bottlenecks that slow down distributed AI systems and allows an entire model to reside on-chip during computation.
The company operated in stealth mode for its first three years before unveiling the original Wafer-Scale Engine (WSE) at the Hot Chips conference in August 2019.
The defining innovation of Cerebras is the Wafer-Scale Engine, a processor that occupies an entire silicon wafer rather than being cut into individual dies like a traditional chip. This approach required Cerebras to solve several engineering challenges that had prevented wafer-scale integration for decades, including managing defects, distributing power evenly, and handling thermal dissipation across such a large area.
Building a chip the size of an entire wafer presented five major technical challenges that Cerebras had to overcome:
Defect tolerance. In traditional chip manufacturing, defective dies are simply discarded. On a wafer-scale chip, defects cannot be avoided but must be managed. Cerebras designed the WSE with approximately 100x the defect tolerance of a conventional GPU. Each AI core on the WSE-3 occupies roughly 0.05 mm2, about 1% the size of an NVIDIA H100 streaming multiprocessor (SM) core. When a defect hits a WSE core, it disables only 0.05 mm2 of silicon, whereas the same defect on an H100 would disable approximately 6 mm2. The WSE-3 contains 970,000 physical cores, with 900,000 active in the shipping product, meaning Cerebras achieves 93% silicon utilization, which is higher than leading GPUs despite building the world's largest chip [10].
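As a quick sanity check, the arithmetic below reproduces the defect-area ratio and core-utilization figures directly from the numbers quoted above:

```python
# Sanity check of the defect-tolerance figures quoted above.
wse3_core_area_mm2 = 0.05   # approximate area of one WSE-3 AI core
h100_sm_area_mm2 = 6.0      # approximate area disabled by a defect on an H100 SM

physical_cores = 970_000
active_cores = 900_000

# Silicon disabled by a single defect: WSE-3 core vs. H100 SM
print(h100_sm_area_mm2 / wse3_core_area_mm2)  # 120.0 -> on the order of 100x better defect tolerance

# Fraction of physical cores shipped as active
print(active_cores / physical_cores)          # 0.9278... -> ~93% utilization
```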
Cerebras also developed a sophisticated routing architecture that allows dynamic reconfiguration of connections between cores. When a defect is detected during manufacturing test, the system automatically routes around the disabled core using redundant communication pathways, maintaining the fabric's full bandwidth and connectivity.
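Cerebras has not published the details of this routing logic; the sketch below is only a generic illustration of the idea, detouring around a disabled node in a 2D mesh with a breadth-first search, and is not the WSE's actual algorithm.

```python
from collections import deque

def route_around_defects(width, height, disabled, src, dst):
    """Shortest path between two mesh nodes that avoids disabled cores.

    Generic 2D-mesh illustration only; not Cerebras's actual routing logic.
    Nodes are (x, y) tuples; `disabled` is the set of cores mapped out at test time.
    """
    frontier = deque([(src, [src])])
    seen = {src}
    while frontier:
        node, path = frontier.popleft()
        if node == dst:
            return path
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in disabled and nxt not in seen):
                seen.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None  # unreachable only if defects fully sever the fabric

# Detour around a single defective core at (1, 1)
print(route_around_defects(4, 4, {(1, 1)}, (0, 1), (2, 1)))
```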
Power delivery. Delivering tens of kilowatts of power uniformly across a wafer-sized chip cannot be accomplished with traditional edge-of-die power connections. Cerebras designed a custom "engine block" that delivers power directly into the face of the wafer, achieving the power density required for hundreds of thousands of active cores. The custom connector and PCB design ensures uniform voltage across the entire wafer surface [11].
Thermal management. With overall power delivery in the mid-teen kilowatt range, the WSE generates substantial heat that must be removed uniformly. Traditional heat sink attachment techniques could not be used because of the thermal expansion mismatch between silicon and copper. Cerebras invented a new material and connector design that allows the wafer to expand and contract while remaining in thermal contact with a copper heat exchanger. Water flows through micro-fins on the backside of the heat exchanger, and the wafer slides against the polished front surface, maintaining thermal coupling despite differing coefficients of thermal expansion [11].
Cross-reticle connectivity. A standard photolithographic reticle can expose only a portion of a wafer at a time. To create a chip that spans the entire wafer, Cerebras had to connect circuits across reticle boundaries with high bandwidth and low latency, a problem unique to wafer-scale integration.
Die-to-die communication. The fabric interconnect that links all 900,000 cores must provide enormous aggregate bandwidth while consuming minimal power and area. The WSE-3's on-chip fabric provides 214 Pbit/s of bandwidth, enabling data to flow between any two cores on the wafer with predictable, low latency.
The first-generation Wafer-Scale Engine was announced in August 2019. It featured 400,000 AI-optimized processing cores, 1.2 trillion transistors, and 18 gigabytes of on-chip SRAM. The CS-1 system, which housed the WSE-1, included twelve 100 Gigabit Ethernet connections for data transfer. At the time, it was by far the largest chip ever fabricated, with a die area of 46,225 square millimeters, roughly 56 times larger than the biggest GPU available.
In April 2021, Cerebras announced the second-generation Wafer-Scale Engine (WSE-2), manufactured using TSMC's 7nm process. The WSE-2 represented a major leap in specifications:
| Specification | WSE-1 | WSE-2 |
|---|---|---|
| Transistors | 1.2 trillion | 2.6 trillion |
| AI cores | 400,000 | 850,000 |
| On-chip SRAM | 18 GB | 40 GB |
| Memory bandwidth | 9.6 PB/s | 20 PB/s |
| Fabric bandwidth | 100 Pb/s | 220 Pb/s |
| Process node | 16nm | 7nm |
The CS-2 system built around the WSE-2 became the primary commercial product for Cerebras, used by research institutions and enterprises for AI training. Notably, the 40 GB of on-chip SRAM eliminated the need for external high-bandwidth memory (HBM), allowing the entire working set of many AI models to reside directly on the processor.
The third-generation Wafer-Scale Engine (WSE-3) was announced in March 2024 at a dedicated launch event and later presented in detail at the Hot Chips 2024 conference. Manufactured on TSMC's 5nm process, the WSE-3 is the most powerful AI chip built to date, containing approximately 4 trillion transistors across its 46,225 mm2 die area.
| Specification | WSE-2 | WSE-3 |
|---|---|---|
| Transistors | 2.6 trillion | 4 trillion |
| AI cores (active) | 850,000 | 900,000 |
| Physical cores | ~900,000 | 970,000 |
| On-chip SRAM | 40 GB | 44 GB |
| Memory bandwidth | 20 PB/s | 21 PB/s |
| Fabric bandwidth | 220 Pb/s | 214 Pb/s |
| Peak AI performance | N/A | 125 PFLOPS |
| Process node | 7nm | 5nm |
| Manufacturer | TSMC | TSMC |
The WSE-3 delivers 125 petaflops of peak AI performance, which Cerebras claims is roughly double the performance of the WSE-2 at the same power consumption and cost. The memory bandwidth of 21 PB/s is approximately 7,000 times greater than the NVIDIA H100's off-chip HBM bandwidth, which is the fundamental advantage that enables Cerebras's inference speed records. Because the WSE stores model weights in on-chip SRAM distributed alongside the compute cores, data travels only fractions of a millimeter from memory to compute, rather than crossing package boundaries as it does in GPU architectures [3].
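A rough check of that bandwidth ratio; the H100 figure used below (~3.35 TB/s of HBM bandwidth for the SXM part) is an outside assumption rather than a number from this article:

```python
# Rough check of the ~7,000x memory-bandwidth claim. The H100 HBM figure
# (~3.35 TB/s for the SXM variant) is an assumption, not a number from this article.
wse3_sram_bw = 21e15        # 21 PB/s on-chip SRAM bandwidth
h100_hbm_bw = 3.35e12       # ~3.35 TB/s off-chip HBM bandwidth (assumed)

print(wse3_sram_bw / h100_hbm_bw)  # ~6,300x; rounding HBM down to ~3 TB/s gives the ~7,000x figure
```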
The Cerebras CS-2 and CS-3 are complete computing systems designed to house the WSE-2 and WSE-3 chips, respectively. Each system integrates the wafer-scale processor with all necessary power delivery, cooling, and I/O connectivity in a single rack-mountable unit. The systems are designed to be straightforward to deploy, requiring only standard datacenter power and cooling.
The CS-3, powered by the WSE-3, delivers 125 petaflops of AI compute while consuming 23 kW of power per system. Key system-level specifications include:
| Specification | CS-3 |
|---|---|
| Processor | WSE-3 (4T transistors, 900K cores) |
| Peak AI compute | 125 PFLOPS |
| On-chip memory | 44 GB SRAM |
| System memory (with MemoryX) | Up to 1.2 PB |
| Power consumption | 23 kW |
| Cooling | Direct liquid cooling |
| Max systems in cluster | 2,048 (via SwarmX) |
| Max cluster compute | ~0.25 zettaFLOPS |
One of the key advantages of the CS systems is that they eliminate the need for complex multi-node distributed computing setups. A single CS-3, for instance, can train models that would otherwise require clusters of hundreds of GPUs, dramatically simplifying the software stack and reducing the engineering effort required to scale AI training.
Cerebras developed two complementary technologies to extend the CS systems beyond single-chip limitations:
MemoryX is an external memory system that extends the available memory for the WSE beyond the on-chip SRAM. With up to 1.2 petabytes of capacity, MemoryX enables the CS-3 to handle models with trillions of parameters by streaming weight data to the processor at high bandwidth. This is large enough to store models with 24 trillion parameters in a single logical memory space without partitioning or refactoring, enabling training of next-generation frontier models 10x larger than GPT-4 and Gemini [3].
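A rough capacity check of the 24-trillion-parameter claim; the bytes-per-parameter values below are generic assumptions (FP16 weights plus FP32 Adam-style optimizer state), not Cerebras-published figures:

```python
# Rough capacity check of the 24-trillion-parameter claim. Bytes-per-parameter
# values are generic assumptions (FP16 weights, FP32 Adam-style optimizer state),
# not Cerebras-published figures.
params = 24e12
weights_bytes = params * 2            # FP16 weights only: ~48 TB
training_bytes = params * 16          # weights + gradients + two optimizer moments: ~384 TB

memoryx_bytes = 1.2e15                # 1.2 PB of MemoryX capacity
print(weights_bytes / 1e12, "TB of weights")
print(training_bytes / 1e12, "TB including optimizer state")
print(training_bytes < memoryx_bytes) # True: fits within MemoryX with headroom
```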
SwarmX is a high-bandwidth fabric that connects multiple CS systems together. Up to 2,048 CS-3 systems can be linked via SwarmX to build hyperscale AI supercomputers delivering up to a quarter of a zettaFLOP. SwarmX maintains near-linear scaling efficiency, meaning that doubling the number of CS-3 systems nearly doubles aggregate performance.
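The quarter-zettaFLOP figure follows directly from the per-system numbers, assuming the near-linear scaling Cerebras describes:

```python
# Cluster-scale arithmetic behind the quarter-zettaFLOP figure,
# assuming the near-linear scaling described above.
per_system_pflops = 125
max_systems = 2_048

peak_zettaflops = per_system_pflops * max_systems / 1e6  # 1 zettaFLOP = 1e6 petaFLOPS
print(peak_zettaflops)  # 0.256 -> roughly a quarter of a zettaFLOP
```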
Cerebras Inference launched as a cloud-based inference service that leverages the unique architecture of the WSE to deliver what the company describes as the fastest LLM inference available. Because the WSE keeps the entire model on-chip (or streams it efficiently via MemoryX), it avoids the memory bandwidth bottleneck that limits GPU-based inference.
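Assuming an OpenAI-compatible endpoint, which Cerebras advertises for this service, a minimal call with the standard openai Python client might look like the sketch below; the base URL, model identifier, and environment-variable name are illustrative assumptions to verify against current Cerebras documentation:

```python
# Minimal sketch of calling Cerebras Inference through an OpenAI-compatible client.
# The base URL, model identifier, and environment-variable name are assumptions
# for illustration; consult the current Cerebras documentation for exact values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",      # assumed endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],     # assumed environment variable
)

response = client.chat.completions.create(
    model="llama3.1-70b",                       # assumed model identifier
    messages=[{"role": "user", "content": "Explain wafer-scale integration in two sentences."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```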
The CS-3 has repeatedly set tokens-per-second inference records across a range of models:
| Model | Tokens/second | Notes |
|---|---|---|
| Llama 3.1 8B | 1,800 | Single-user latency |
| Llama 3.1 70B | 2,100 | Per-user, roughly 8x faster than H200 |
| Llama 3.1 405B | 969 | Largest open-source model at launch |
| Llama 4 Scout | 2,600+ | Announced at launch |
| gpt-oss-120B | 2,700+ | Via Core42 partnership |
| K2 Think | 2,000 | Reasoning-optimized model |
In head-to-head comparisons, Cerebras claims the CS-3 achieves over 21x faster inference than NVIDIA's flagship Blackwell B200 GPU on the Llama 3 70B model in a reasoning scenario with 1,024 input tokens and 4,096 output tokens. On the gpt-oss-120B model, the CS-3 delivered 2,700+ tokens/second compared to approximately 900 tokens/second on Blackwell B200 [4].
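Put in concrete terms, the per-user throughput figures quoted in this section translate directly into response times for the 4,096-token output of that reasoning scenario:

```python
# Response-time arithmetic for the 4,096-token reasoning output described above,
# using only the per-user throughput figures quoted in this section.
output_tokens = 4_096

cs3_tok_per_s = 2_100    # Llama 3.1 70B on CS-3, per user
dgx_tok_per_s = 250      # per-user figure from the comparison table below

print(output_tokens / cs3_tok_per_s)  # ~2 seconds on the CS-3
print(output_tokens / dgx_tok_per_s)  # ~16 seconds on the GPU system
```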
The service supports popular open-source models including Llama 3.1 (8B, 70B, and 405B variants), Llama 4 Scout, Mistral models, and others. In partnership with Mistral, Cerebras powered the Le Chat assistant, which claimed records for inference speed. Cerebras and Core42 (a subsidiary of G42) also launched global access to OpenAI's gpt-oss-120B model, serving it at approximately 3,000 tokens per second.
The speed advantage of Cerebras Inference has made it particularly attractive for real-time applications, agentic AI workflows, and scenarios where low latency directly impacts user experience. Cerebras planned to increase its inference capacity from 2 million to over 40 million tokens per second by Q4 2025, distributed across eight data centers.
Cerebras has also contributed open-source models and research to the AI community, most notably the Cerebras-GPT family of openly licensed GPT-style models.
One of Cerebras's most significant strategic partnerships has been with G42, the Abu Dhabi-based AI technology holding company. Together, they built the Condor Galaxy constellation of AI supercomputers:
| System | Location | Compute | AI Cores | WSE Generation |
|---|---|---|---|---|
| CG-1 | Santa Clara, CA | 4 exaFLOPS | 54 million | WSE-2 |
| CG-2 | Undisclosed | 4 exaFLOPS | 54 million | WSE-2 |
| CG-3 | Dallas, TX | 8 exaFLOPS | 58 million | WSE-3 |
| Full constellation (planned) | Global | 36 exaFLOPS | - | Mixed |
The full Condor Galaxy constellation of nine interconnected AI supercomputers is designed to deliver 36 exaFLOPS of AI compute by the end of 2026, making it one of the largest collections of interconnected AI supercomputers in the world.
The G42 partnership provided Cerebras with both a major customer and a development platform for demonstrating the capabilities of its wafer-scale technology at datacenter scale. However, the relationship also created complications for Cerebras's IPO plans due to U.S. regulatory scrutiny of G42's ties to China and the UAE. By early 2026, Cerebras had restructured its investor base so that G42 no longer ranked among its primary stakeholders, satisfying U.S. regulators.
In March 2026, Amazon Web Services (AWS) signed a multiyear deal with Cerebras to make the WSE-3 wafer-scale chip available to cloud customers through Amazon Bedrock. AWS became the first major cloud provider to offer Cerebras's disaggregated inference solution, with plans to launch Cerebras hardware on Amazon Bedrock in the coming months and add open-source LLM support later in 2026. This partnership represented a significant validation of Cerebras's technology by one of the world's largest cloud providers.
In January 2026, Cerebras agreed to provide 750 megawatts of computing power to OpenAI through 2028, in a deal estimated to be worth more than $10 billion. This agreement underscored the growing demand for alternatives to NVIDIA GPUs for large-scale AI workloads and positioned Cerebras as a key infrastructure provider for one of the world's leading AI research organizations.
The architectural differences between Cerebras's wafer-scale approach and traditional GPU clusters lead to fundamentally different performance profiles:
| Metric | Cerebras CS-3 | NVIDIA DGX B200 (8x B200) | Advantage |
|---|---|---|---|
| Per-user inference speed (Llama 3.1 70B) | ~2,100 tok/s | ~250 tok/s | CS-3 ~8x faster |
| Memory bandwidth (on-chip SRAM vs. HBM) | 21 PB/s | ~64 TB/s (8 GPUs combined) | CS-3 ~300x higher |
| Programming model | Single device | Distributed (tensor/pipeline parallelism) | CS-3 simpler |
| Power consumption | 23 kW (system) | ~14 kW (full system) | DGX lower total power |
| Training flexibility | Optimized via MemoryX/SwarmX | Industry-standard frameworks | GPUs more flexible |
| Software ecosystem | Cerebras SDK | CUDA/PyTorch/TensorFlow | GPUs much broader |
The CS-3's primary advantage is inference latency, driven by the massive on-chip memory bandwidth that eliminates the memory wall problem. For training, GPU clusters retain advantages in software ecosystem maturity and flexibility, though Cerebras has made significant progress in training support through its compiler and MemoryX/SwarmX infrastructure.
Cerebras has raised substantial capital through multiple funding rounds:
| Round | Date | Amount | Lead Investors |
|---|---|---|---|
| Series A | 2016 | $25M | Benchmark Capital |
| Series B | 2017 | $60M | Benchmark Capital |
| Series C | 2018 | $112M | Benchmark Capital |
| Series D | 2019 | $272M | Koch Disruptive Technologies |
| Series E | 2021 | $250M | Alpha Wave Ventures |
| Series F | 2021 | $720M | Alpha Wave Ventures, Abu Dhabi Growth Fund |
| Series G | Late 2025 | $1.1B | Fidelity, Atreides Management |
Following the Series G round, analysts projected a listing valuation exceeding $15 billion.
Cerebras filed for an IPO in late 2024, selecting the Nasdaq under the reserved ticker symbol CBRS. However, the initial filing faced delays due to regulatory concerns related to the company's partnership with G42 and the broader geopolitical scrutiny of AI technology transfers. After restructuring its investor base, Cerebras rekindled its IPO plans in late 2025, targeting a listing in the second quarter of 2026.
The anticipated IPO has drawn significant attention from investors, as Cerebras represents one of the few pure-play AI hardware companies challenging NVIDIA's dominance in the accelerator market.
Cerebras competes in the AI accelerator market against several established and emerging players:
| Competitor | Approach | Key Products |
|---|---|---|
| NVIDIA | GPU-based accelerators | H100, H200, B200, B300 |
| AMD | GPU-based accelerators | MI300X, MI325X, MI350 |
| Google | Custom ASICs (TPUs) | TPU v6 (Trillium), TPU v7 (Ironwood) |
| Groq | LPU inference accelerators | GroqChip1, LPU v2 |
| Intel | Gaudi accelerators (discontinued) | Gaudi 3 |
| Amazon | Custom ASICs | Trainium2, Trainium3 |
Cerebras differentiates itself through the sheer scale of its wafer-scale approach, the simplicity of programming a single massive chip versus a distributed cluster, and its focus on both training and inference performance. The company's inference speed records have been a particularly effective marketing tool, demonstrating tangible performance advantages over GPU-based solutions.
Cerebras's wafer-scale approach offers several distinct advantages: enormous on-chip memory bandwidth that sidesteps the memory wall, the simplicity of programming a single device rather than a distributed GPU cluster, and record-setting inference throughput. Its main limitations are a software ecosystem far smaller than NVIDIA's CUDA stack, dependence on MemoryX and SwarmX for models and clusters that exceed a single system, and the specialized packaging, power delivery, and cooling that wafer-scale integration requires.
As of early 2026, Cerebras is in a strong position. The company has secured major partnerships with AWS and OpenAI, raised over $2.4 billion in total funding, and is preparing for its anticipated Nasdaq IPO in Q2 2026. The WSE-3 and CS-3 systems continue to set inference speed records, and the company is expanding its cloud-based inference service to reach a broader market of AI developers and enterprises. The Condor Galaxy constellation continues to grow, with CG-3 now operational and the full 36 exaFLOP network on track for completion by end of 2026. With the AI inference market growing rapidly as deployed models scale to serve billions of users, Cerebras's focus on inference speed positions it well for the next phase of AI infrastructure buildout.