SambaNova Systems
Last reviewed
May 18, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v6 · 4,196 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
May 18, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v6 · 4,196 words
Add missing citations, update stale details, or suggest a clearer explanation.
SambaNova Systems is an American artificial intelligence hardware and software company headquartered in Palo Alto, California. Founded in November 2017 by Stanford University professors Kunle Olukotun and Christopher Ré along with semiconductor industry veteran Rodrigo Liang, the company designs custom AI processors called Reconfigurable Dataflow Units (RDUs) and builds full-stack platforms for inference of large language models.[1] Following an April 2025 restructuring, SambaNova repositioned as an inference-focused company built around the SN40L chip, the SambaNova Cloud service launched in September 2024, and the SambaManaged turnkey data-center platform launched in July 2025.[2][3] SambaNova has raised approximately $1.5 billion in venture funding from investors including SoftBank Vision Fund, Vista Equity Partners, Intel Capital, BlackRock, and GV. The company competes with NVIDIA, Groq, and Cerebras Systems in the AI accelerator market.[4]
SambaNova Systems was incorporated in November 2017 by three co-founders with deep roots in processor architecture and machine learning research at Stanford University. Kunle Olukotun, the Cadence Design Professor of Electrical Engineering and Computer Science at Stanford, is widely recognized as a pioneer of multicore processor design. His earlier company, Afara Websystems, developed the Niagara multi-core chip that was acquired by Sun Microsystems in 2002 and became the foundation of Oracle's SPARC server line.[5] Christopher Ré, an associate professor of computer science at Stanford and leader of the Hazy Research group, brought expertise in machine learning systems and data management. Ré co-founded Snorkel AI and two other startups later acquired by Apple (Lattice/DeepDive and Inductiv/HoloClean).[6] Rodrigo Liang, who serves as CEO, previously spent over 15 years at Sun Microsystems and Oracle, where he rose to Senior Vice President of SPARC Processor and ASIC Development and oversaw the release of 12 major processor designs.[1]
The founding thesis behind SambaNova was that the dominant computing architectures of the time, originally designed for general-purpose workloads, were fundamentally ill-suited for the dataflow patterns of deep learning. Rather than adapting existing GPU or CPU designs, the co-founders set out to build an entirely new processor architecture from scratch, optimized specifically for AI and machine learning operations.
The company operated in stealth mode during its first year while assembling a team of chip architects, compiler engineers, and ML researchers. In March 2018, SambaNova emerged from stealth with a $56 million Series A round led by GV (formerly Google Ventures), with participation from Walden International, Redline Capital Management, and Atlantic Bridge. Walden International is the venture capital firm founded by Lip-Bu Tan, who would later become Intel's CEO in March 2025 and a key figure in Intel's subsequent dealings with SambaNova; Tan continues to serve as executive chair of SambaNova.[7]
In April 2019, SambaNova closed a $150 million Series B round led by Intel Capital, with continued participation from GV and existing investors. The funding accelerated development of the company's first-generation chip, the SN10 RDU, which was taped out on TSMC's 7nm process in the first half of 2019.[8]
The company's first commercial product, the DataScale SN10 system, began shipping to customers in early 2020. In February 2020, SambaNova announced a $250 million Series C round led by funds and accounts managed by BlackRock, with participation from GV, Intel Capital, Walden International, WRVI Capital, and Redline Capital.
By 2021, SambaNova had secured deployments at several U.S. Department of Energy national laboratories. In April 2021, the company raised a $676 million Series D round led by SoftBank Vision Fund 2, with participation from new investors Temasek and GIC alongside existing backers including BlackRock, Intel Capital, GV, and Walden International. This round valued SambaNova at over $5 billion, making it one of the most highly valued AI chip startups at the time. Total funding after the Series D exceeded $1.1 billion.[9]
In September 2022, SambaNova launched its second-generation system, the DataScale SN30, powered by the Cardinal SN30 RDU. The SN30 used multi-die packaging to double compute capacity by combining two RDU dies into a single socket. Built on TSMC's 7nm process, the SN30 contained 86 billion transistors and delivered 688 teraflops at bfloat16 precision. SambaNova reported that the SN30 achieved a 6x speedup over comparable NVIDIA A100-based systems on certain training workloads.[10]
In September 2023, SambaNova unveiled the SN40L, its fourth-generation RDU built on TSMC's 5nm process. The SN40L introduced a three-tier memory hierarchy combining 520 MiB of on-chip PMU SRAM, 64 GiB of co-packaged HBM, and up to 1.5 TiB of DDR DRAM via pluggable DIMMs. This design allowed the chip to hold much larger models in memory than competing architectures and was specifically co-designed for both training and inference of large foundation models. The SN40L featured 1,040 Pattern Compute Units (PCUs) and could handle Composition of Experts configurations with up to 5 trillion parameters.[11]
This period also marked a strategic shift in the company's focus. While SambaNova originally positioned itself as a training platform, the rapid growth of generative AI and the explosion of inference demand led the company to reposition around high-performance AI inference as its primary market.
On September 10, 2024, SambaNova launched SambaNova Cloud, a cloud-based AI inference service powered by the SN40L. At launch, the platform offered Meta's Llama 3.1 405B model at 132 output tokens per second and the 70B variant at 461 tokens per second at full 16-bit precision. Artificial Analysis independently verified these speeds, confirming them as the fastest available from any cloud API provider at the time. SambaNova noted that the best GPU-based providers at the time served the 405B model at no more than 72 tokens per second.[12][13]
On April 22, 2025, SambaNova confirmed that it had laid off 77 employees, approximately 15% of its roughly 500-person workforce, as part of a strategic pivot away from training workloads and toward pure-play AI inference. In a statement, the company said that "this past week, SambaNova made changes to align with today's market conditions and the transition we've seen from model training to fine-tuning and inference," and that the resulting organization would focus on "delivering cloud-first solutions that help enterprises and developers deploy open-source models at scale."[14][15] Industry analysts framed the move as a recognition that NVIDIA's training-market dominance had become structurally entrenched, while inference represented a larger and more contestable opportunity.[14]
On May 29, 2025, SambaNova made its AI platform available in the AWS Marketplace as a SaaS offering, allowing customers to subscribe using their existing AWS accounts and connect to SambaNova Cloud via AWS PrivateLink. The launch covered models including Meta's Llama 4 Maverick and DeepSeek R1 671B running on SN40L hardware.[16]
On July 8, 2025, SambaNova introduced SambaManaged, which it described as the industry's first turnkey, inference-optimized data-center product, deployable in 90 days versus the 18-24 months typical for purpose-built AI infrastructure. SambaManaged is a modular solution scalable up to a 1 MW "token factory" with up to 100 racks and 1,600 SN40L chips, designed to run on as little as 10 kW of air-cooled power per rack and operate inside conventional, non-liquid-cooled data centers.[3]
Alongside SambaManaged, the company unveiled a "SambaNova 2.0" brand and product reorganization built around three offerings - SambaCloud (the public inference cloud), SambaStack (dedicated hardware in cloud or on-prem), and SambaManaged (turnkey data-center systems) - all powered by SambaRack units containing 16 SN40L RDUs and orchestrated by a Kubernetes-based control plane called SambaOrchestrator.[17]
On December 12, 2025, Bloomberg reported that Intel was in advanced talks to acquire SambaNova for approximately $1.6 billion including debt, a dramatic markdown from its $5+ billion 2021 valuation.[18] If completed, the deal would have been the first major acquisition under Intel CEO Lip-Bu Tan, whose Walden International had led SambaNova's 2018 Series A and who continued to chair the SambaNova board. The talks ultimately collapsed, and SambaNova pursued an independent funding round instead.[7][19]
In February 2026, SambaNova announced a $350 million Series E round led by Vista Equity Partners and Cambium Capital. Intel Capital, T. Rowe Price, BlackRock, and Battery Ventures also participated, with Intel contributing approximately $100-150 million for an estimated 9% stake. The round closed at an implied valuation of approximately $2.2 billion, well below the company's 2021 peak but reportedly above the canceled Intel acquisition price.[4][20] As part of the announcement, SambaNova and Intel established a multi-year collaboration to co-develop AI inference systems combining SambaNova's RDU accelerators with Intel's Xeon CPUs, with Intel co-marketing and co-selling the resulting platforms through its enterprise sales channels.[21] This round brought SambaNova's total funding to approximately $1.5 billion.
Alongside the funding announcement, SambaNova unveiled the SN50, its fifth-generation RDU purpose-built for agentic AI inference, with shipments expected in the second half of 2026.[22] SoftBank signed on as one of the first customers for the SN50.[4]
At the core of SambaNova's technology is its Reconfigurable Dataflow Architecture (RDA), which takes a fundamentally different approach from the SIMT (Single Instruction, Multiple Threads) model used by GPUs. In a traditional GPU, a program counter drives execution across thousands of threads that process data stored in a memory hierarchy. In SambaNova's dataflow model, data flows through a network of reconfigurable compute and memory units, with the computation defined by the structure of the graph rather than a sequential instruction stream.[11]
The RDA is built around three primary on-chip components:
| Component | Function |
|---|---|
| Pattern Compute Units (PCUs) | Perform arithmetic and logic operations. Each PCU contains six SIMD stages with 16 lanes each, providing 96 functional units per PCU. |
| Pattern Memory Units (PMUs) | Serve as distributed on-chip SRAM for storing model weights, activations, and intermediate data close to computation. |
| Switches | Form a high-speed, three-dimensional on-chip switching fabric that routes data between PCUs and PMUs with minimal latency. |
This architecture allows the compiler to map the dataflow graph of a neural network directly onto the hardware, eliminating many of the inefficiencies associated with GPU kernel launches, memory copies, and thread synchronization. Because the units are reconfigurable, the same physical hardware can be reprogrammed for different model architectures without the fixed-function limitations of traditional ASICs.
SambaNova has released multiple generations of its Reconfigurable Dataflow Unit (RDU), each building on the dataflow architecture with improvements in process technology, memory capacity, and compute density.
| Generation | Chip | Process Node | Key Specifications | Year |
|---|---|---|---|---|
| 1st | SN10 | TSMC 7nm | 640 PCUs, 640 PMUs, 320 MB SRAM, 300+ BF16 TFLOPS, 40B transistors, up to 1.5 TB DDR | 2020 |
| 2nd | SN30 | TSMC 7nm | Multi-die (2x SN10 dies), 86B transistors, 688 BF16 TFLOPS | 2022 |
| 4th | SN40L | TSMC 5nm | 1,040 PCUs, 520 MiB SRAM, 64 GiB HBM, up to 1.5 TiB DDR, 638 BF16 TFLOPS, 10.2 PFLOPS per rack | 2023 |
| 5th | SN50 | TSMC 3nm (N3) | Dual-chiplet, ~2,080 PCUs, 432 MB SRAM, 64 GB HBM2E (1.8 TB/s), 256 GB-2 TB DDR5, 1.6 PFLOPS BF16, 3.2 PFLOPS FP8, up to 256-chip scaling | 2026 |
The SN10 was presented at Hot Chips 33 in August 2021, where SambaNova published a detailed IEEE paper on its 7nm dataflow architecture.[23] The SN40L was the subject of a 2024 research paper on arXiv describing its approach to scaling the "AI memory wall" through its three-tier memory system and Composition of Experts.[11]
The SN50, announced in February 2026 and built on TSMC's 3nm process with a dual-chiplet design, delivers 2.5x the 16-bit floating-point performance and 5x the FP8 performance of the SN40L. A single SambaRack SN50 combines 16 SN50 chips and averages approximately 20 kW of power, allowing operation inside existing air-cooled data centers. Up to 256 accelerators can be interconnected across multiple racks using a multi-terabit-per-second switched fabric with 2.2 TB/s of bidirectional chip-to-chip bandwidth. The SN50 supports models with up to 10 trillion parameters and context lengths of up to 10 million tokens. SambaNova claims the SN50 delivers approximately 5x the peak speed and over 3x the throughput of NVIDIA's B200 GPU on agentic-inference workloads.[22][21]
SambaFlow is SambaNova's compiler and runtime software stack that bridges standard ML frameworks and the RDU hardware. Developers write models in PyTorch or TensorFlow, and SambaFlow automatically extracts the computational graph, optimizes it for dataflow execution, and maps it onto the RDU's PCUs and PMUs.
The SambaFlow stack includes:
Because the compiler handles low-level optimization, developers do not need to write custom CUDA kernels or hand-tune memory layouts, which SambaNova positions as an advantage over GPU-based workflows that often require significant kernel engineering for peak performance.
Following the July 2025 "SambaNova 2.0" reorganization, SambaNova offers three primary product lines, all powered by SambaRack hardware containing 16 SN40L RDUs (with SN50-based SambaRacks rolling out from late 2026) and orchestrated by a Kubernetes-based platform called SambaOrchestrator.[17]
SambaCloud (originally launched as "SambaNova Cloud" in September 2024 and renamed in July 2025) is the company's cloud-hosted inference service. It provides API access to popular open-source large language models running on SambaRack hardware in SambaNova-operated data centers, with an emphasis on inference speed.[2][17]
The service offers an OpenAI-compatible API, making integration straightforward for developers familiar with that interface. Available models have included Meta's Llama family (3.1, 3.3, and 4 Maverick), DeepSeek R1 and V3, Alibaba's Qwen models, OpenAI's gpt-oss family, and others.[16]
Notable speed benchmarks achieved on SambaCloud:
| Model | Output Speed (tokens/sec) |
|---|---|
| Llama 3.1 8B | 1,000+ |
| Llama 3.1 70B | 461-580 |
| Llama 3.1 405B | 100-132 |
| DeepSeek R1 671B | 255 |
| DeepSeek V3-0324 | 250 |
| MiniMax M2 (230B) | 378 |
SambaCloud is available in three tiers: a Free Tier with $5 in starting credits (approximately 30 million tokens on Llama 8B), a Developer Tier with pay-as-you-go billing, and an Enterprise Tier with dedicated support and custom configurations. Since May 2025, SambaCloud has also been available as a SaaS subscription via the AWS Marketplace with secure connectivity through AWS PrivateLink.[16]
SambaStack is SambaNova's dedicated-hardware offering, succeeding the earlier DataScale appliances. Customers can purchase SambaRack systems for on-premises deployment or rent dedicated hardware capacity hosted in SambaNova's cloud, with the SambaFlow stack pre-installed for fine-tuning, deploying, and managing AI models.[17]
| System | Configuration | Form Factor |
|---|---|---|
| DataScale SN10-8 (legacy) | 8x SN10 | Quarter rack |
| DataScale SN30 (legacy) | Multi-die SN30 | Rack-scale |
| SambaRack SN40L | 16x SN40L | Single rack, ~10 kW |
| SambaRack SN50 | 16x SN50 (2026+) | Single rack, ~20 kW |
SambaManaged, launched on July 8, 2025, is a turnkey AI-inference offering for third-party data-center operators and "neocloud" providers. SambaNova owns and operates the AI hardware and stack while the host data center provides power, floor space, and connectivity, allowing the host to begin selling AI inference services in approximately 90 days instead of the 18-24 months typically required to design and stand up a new AI-optimized facility.[3]
Key characteristics of SambaManaged include:
The Composition of Experts (CoE) is a model-serving architecture developed by SambaNova that allows multiple specialized, fully trained models ("experts") to be orchestrated behind a single API endpoint. Unlike a mixture of experts architecture where expert sub-networks exist within a single model, CoE treats each expert as a standalone model and routes queries to the appropriate expert based on the task.[11]
Key characteristics of CoE include:
SambaNova released Samba-CoE v0.1 and subsequent versions as demonstrations of this architecture, showing that a collection of smaller specialized models can match or exceed the performance of much larger general-purpose models on domain-specific tasks.
SambaNova has established a significant presence in the U.S. national laboratory system, making it one of the most widely deployed AI chip startups in government research facilities.
| Institution | Deployment Details |
|---|---|
| Argonne National Laboratory | Deployed DataScale systems as part of the AI Testbed at the Argonne Leadership Computing Facility. Used to train convolutional neural networks on images exceeding 50,000 x 50,000 pixel resolution, a task that was infeasible on GPU-based systems due to memory constraints. |
| Lawrence Livermore National Laboratory (LLNL) | Expanded collaboration announced in May 2023 to bring SambaNova's spatial dataflow accelerators into LLNL's Computing Center, supporting the lab's cognitive simulation program for improving speed and accuracy of scientific research.[24] |
| Los Alamos National Laboratory | Deployed SambaNova systems for AI-driven scientific computing workloads. |
| Oak Ridge National Laboratory | Integrated SambaNova hardware for AI research applications. |
| Texas Advanced Computing Center (TACC) | Selected SambaNova AI systems to accelerate scientific research across multiple disciplines. |
Beyond government laboratories, SambaNova has deployed systems and services to a range of enterprise customers:
SambaNova has completed five major funding rounds since its founding:
| Round | Date | Amount | Lead Investor(s) | Post-Money Valuation |
|---|---|---|---|---|
| Series A | March 2018 | $56 million | GV | Not disclosed |
| Series B | April 2019 | $150 million | Intel Capital | Not disclosed |
| Series C | February 2020 | $250 million | BlackRock | Not disclosed |
| Series D | April 2021 | $676 million | SoftBank Vision Fund 2 | $5.1 billion[9] |
| Series E | February 2026 | $350 million+ | Vista Equity Partners, Cambium Capital | ~$2.2 billion[20] |
| Total | ~$1.5 billion |
Notable investors across all rounds include SoftBank Vision Fund, Vista Equity Partners, Intel Capital, GV (Google Ventures), BlackRock, T. Rowe Price, Battery Ventures, Temasek, GIC, Walden International, WRVI Capital, Redline Capital, Atlantic Bridge, and Cambium Capital.
The company's peak valuation of over $5 billion came with its 2021 Series D. By late 2025, its private market valuation had declined to approximately $1.6 billion amid increased competition and a challenging fundraising environment for AI hardware startups, prompting Intel acquisition discussions that ultimately stalled.[18][19] The February 2026 Series E, along with the Intel partnership, provided a path forward as an independent company at a roughly $2.2 billion valuation.[20]
SambaNova operates in a highly competitive market for AI inference hardware. Its primary competitors include both established semiconductor companies and venture-backed startups.
| Company | Architecture | Key Differentiator |
|---|---|---|
| NVIDIA | GPU (CUDA) | Dominant market position with 80%+ share in AI accelerators. H100 and B200 GPUs are the industry standard. Massive software ecosystem with CUDA. |
| Groq | LPU (Language Processing Unit) | Deterministic, single-core architecture optimized for inference latency. |
| Cerebras Systems | Wafer-Scale Engine (WSE) | Largest chip ever built, using an entire silicon wafer. |
| AMD | GPU (ROCm) | MI300X and MI355X accelerators competing directly with NVIDIA on price-performance for inference. |
| TPU | Custom ASIC used internally and offered through Google Cloud. | |
| Intel | Gaudi / Xeon | Gaudi accelerators for AI training and inference; now partnering with SambaNova rather than competing directly through the 2026 collaboration.[21] |
SambaNova differentiates itself through its dataflow architecture, which avoids the kernel-launch overhead and memory bottlenecks common in GPU-based inference. The company's three-tier memory hierarchy (SRAM, HBM, DDR) allows it to keep many large models resident in memory simultaneously, which is particularly advantageous for multi-model serving and agentic AI workloads that require rapid switching between different models.
The AI inference market is projected to grow from approximately $106 billion in 2025 to $255 billion by 2030, though NVIDIA, AMD, and cloud providers' custom ASICs are expected to retain the majority of market share. SambaNova, Groq, and Cerebras collectively target the remaining segment with specialized architectures that offer advantages in speed, efficiency, or scale for specific workload types.
| Name | Title | Background |
|---|---|---|
| Rodrigo Liang | Co-Founder and CEO | Stanford BS/MS in Electrical Engineering. Former SVP of SPARC Processor and ASIC Development at Oracle. Over 15 years at Sun Microsystems/Oracle leading chip design teams. |
| Kunle Olukotun | Co-Founder and Chief Technologist | Cadence Design Professor of Electrical Engineering and Computer Science at Stanford. Pioneer of multicore processor design. Founded Afara Websystems (acquired by Sun Microsystems). ACM Fellow, IEEE Fellow, member of the National Academy of Engineering. Recipient of the 2023 ACM-IEEE CS Eckert-Mauchly Award.[5] |
| Christopher Ré | Co-Founder | Associate Professor of Computer Science at Stanford. Leader of the Hazy Research group. Co-founder of Snorkel AI and two companies acquired by Apple (Lattice/DeepDive and Inductiv/HoloClean). MacArthur Fellow.[6] |
| Lip-Bu Tan | Executive Chair | Founder of Walden International (early SambaNova investor); CEO of Intel since March 2025.[7] |