SambaNova Systems is an American artificial intelligence hardware and software company headquartered in Palo Alto, California. Founded in 2017 by Stanford University professors Kunle Olukotun and Christopher Ré along with semiconductor industry veteran Rodrigo Liang, the company designs custom AI processors called Reconfigurable Dataflow Units (RDUs) and builds full-stack platforms for training and inference of large language models. SambaNova has raised approximately $1.5 billion in venture funding from investors including SoftBank Vision Fund, Intel Capital, Vista Equity Partners, BlackRock, and GV. The company competes with NVIDIA, Groq, and Cerebras Systems in the AI accelerator market.
SambaNova Systems was incorporated in November 2017 by three co-founders with deep roots in processor architecture and machine learning research at Stanford University. Kunle Olukotun, a professor of electrical engineering and computer science at Stanford, is widely recognized as a pioneer of multicore processor design. His earlier company, Afara Websystems, which developed the Niagara multicore processor, was acquired by Sun Microsystems in 2002; the design became the foundation of the UltraSPARC T1 and, ultimately, Oracle's SPARC server line. Christopher Ré, an associate professor of computer science at Stanford and leader of the Hazy Research group, brought expertise in machine learning systems and data management. Ré is also known for co-founding Snorkel AI and two other startups that were later acquired by Apple. Rodrigo Liang, who serves as CEO, previously spent over 15 years at Sun Microsystems and Oracle, where he rose to Senior Vice President of SPARC Processor and ASIC Development and oversaw the release of 12 major processor designs.
The founding thesis behind SambaNova was that the dominant computing architectures of the time, originally designed for general-purpose workloads, were fundamentally ill-suited for the dataflow patterns of deep learning. Rather than adapting existing GPU or CPU designs, the co-founders set out to build an entirely new processor architecture from scratch, optimized specifically for AI and machine learning operations.
The company operated in stealth mode during its first year while assembling a team of chip architects, compiler engineers, and ML researchers. In March 2018, SambaNova emerged from stealth with a $56 million Series A round led by GV (formerly Google Ventures), with participation from Walden International, Redline Capital Management, and Atlantic Bridge.
In April 2019, SambaNova closed a $150 million Series B round led by Intel Capital, with continued participation from GV and existing investors. The funding accelerated development of the company's first-generation chip, the SN10 RDU, which was taped out on TSMC's 7nm process in the first half of 2019.
The company's first commercial product, the DataScale SN10 system, began shipping to customers in early 2020. In February 2020, SambaNova announced a $250 million Series C round led by funds and accounts managed by BlackRock, with participation from GV, Intel Capital, Walden International, WRVI Capital, and Redline Capital.
By 2021, SambaNova had secured deployments at several U.S. Department of Energy national laboratories. In April 2021, the company raised a $676 million Series D round led by SoftBank Vision Fund 2, with participation from new investors Temasek and GIC alongside existing backers including BlackRock, Intel Capital, GV, and Walden International. This round valued SambaNova at over $5 billion, making it one of the most highly valued AI chip startups at the time. Total funding after the Series D exceeded $1.1 billion.
In September 2022, SambaNova launched its second-generation system, the DataScale SN30, powered by the Cardinal SN30 RDU. The SN30 used multi-die packaging to double compute capacity by combining two RDU dies into a single socket. Built on TSMC's 7nm process, the SN30 contained 86 billion transistors and delivered 688 teraflops at bfloat16 precision. SambaNova reported that the SN30 achieved a 6x speedup over comparable NVIDIA A100-based systems on certain training workloads.
In September 2023, SambaNova unveiled the SN40L, its fourth-generation RDU built on TSMC's 5nm process. The SN40L introduced a three-tier memory hierarchy combining 520 MiB of on-chip PMU SRAM, 64 GiB of co-packaged HBM, and up to 1.5 TiB of DDR DRAM via pluggable DIMMs. This design allowed the chip to hold much larger models in memory than competing architectures and was specifically co-designed for both training and inference of large foundation models. The SN40L featured 1,040 Pattern Compute Units (PCUs), delivered 638 BF16 TFLOPS per socket, and could handle models with up to 5 trillion parameters.
This period also marked a strategic shift in the company's focus. While SambaNova originally positioned itself as a training platform, the rapid growth of generative AI and the explosion of inference demand led the company to reposition around high-performance AI inference as its primary market.
On September 11, 2024, SambaNova launched SambaNova Cloud, a cloud-based AI inference service powered by the SN40L. At launch, the platform offered Meta's Llama 3.1 405B model at 132 output tokens per second and the 70B variant at 461 tokens per second at full precision. Artificial Analysis independently verified these speeds, confirming them as the fastest available from any cloud API provider at the time, faster than endpoints offered by OpenAI, Anthropic, and Google.
During 2025, reports emerged that SambaNova had struggled to close a new funding round as competition in the AI chip market intensified. The company's private market valuation had declined significantly from its 2021 peak. In late 2025, Intel reportedly entered advanced acquisition discussions to buy SambaNova for approximately $1.6 billion, including debt. Those talks ultimately did not result in a deal.
In February 2026, SambaNova secured a $350 million Series E round led by Vista Equity Partners and Cambium Capital. Intel Capital, T. Rowe Price, BlackRock, and Battery Ventures also participated. As part of the announcement, SambaNova and Intel established a multi-year collaboration to co-develop AI inference systems combining SambaNova's RDU accelerators with Intel's Xeon CPUs. Intel plans to co-market and co-sell the resulting platforms through its enterprise sales channels. This round brought SambaNova's total funding to approximately $1.5 billion.
Alongside the funding announcement, SambaNova unveiled the SN50, its fifth-generation RDU purpose-built for agentic AI inference, with shipments expected later in 2026. SoftBank has signed on as one of the first customers for the SN50.
At the core of SambaNova's technology is its Reconfigurable Dataflow Architecture (RDA), which takes a fundamentally different approach from the SIMT (Single Instruction, Multiple Threads) model used by GPUs. In a traditional GPU, a program counter drives execution across thousands of threads that process data stored in a memory hierarchy. In SambaNova's dataflow model, data flows through a network of reconfigurable compute and memory units, with the computation defined by the structure of the graph rather than a sequential instruction stream.
The RDA is built around three primary on-chip components:
| Component | Function |
|---|---|
| Pattern Compute Units (PCUs) | Perform arithmetic and logic operations. Each PCU contains six SIMD stages with 16 lanes each, providing 96 functional units per PCU. |
| Pattern Memory Units (PMUs) | Serve as distributed on-chip SRAM for storing model weights, activations, and intermediate data close to computation. |
| Switches | Form a high-speed, three-dimensional on-chip switching fabric that routes data between PCUs and PMUs with minimal latency. |
This architecture allows the compiler to map the dataflow graph of a neural network directly onto the hardware, eliminating many of the inefficiencies associated with GPU kernel launches, memory copies, and thread synchronization. Because the units are reconfigurable, the same physical hardware can be reprogrammed for different model architectures without the fixed-function limitations of traditional ASICs.
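To make the contrast with kernel-by-kernel GPU execution concrete, the toy sketch below treats a small operator graph as data and "places" each node onto a pool of PCU-like and PMU-like units. It is a conceptual illustration only, not SambaNova's compiler or toolchain; the unit names and greedy placement are invented for the example.

```python
# Toy illustration of spatial dataflow mapping: operators in a model
# graph are assigned to physical compute (PCU-like) and memory
# (PMU-like) units, so data streams between units instead of being
# dispatched as sequential kernel launches. Conceptual sketch only,
# not SambaNova's compiler.
from dataclasses import dataclass, field

@dataclass
class Op:
    name: str          # e.g. "matmul", "relu"
    kind: str          # "compute" or "memory"
    inputs: list[str] = field(default_factory=list)

# A tiny dataflow graph for one fully connected block.
graph = [
    Op("weights", "memory"),
    Op("matmul", "compute", inputs=["weights", "activations"]),
    Op("bias", "memory"),
    Op("add", "compute", inputs=["matmul", "bias"]),
    Op("relu", "compute", inputs=["add"]),
]

# Greedy placement onto a fixed pool of units. A real compiler also
# optimizes tiling, routing, and parallelism; here we only assign.
pcus = [f"PCU{i}" for i in range(4)]
pmus = [f"PMU{i}" for i in range(4)]
placement = {}
for op in graph:
    pool = pcus if op.kind == "compute" else pmus
    placement[op.name] = pool.pop(0)

for op in graph:
    sources = ", ".join(placement[i] for i in op.inputs if i in placement)
    print(f"{op.name:>8} -> {placement[op.name]}"
          + (f" (streams from {sources})" if sources else ""))
```

Once placed, data flows directly between the assigned units for the life of the program, which is what eliminates the per-kernel launch and synchronization overhead described above.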
SambaNova has released multiple generations of its Reconfigurable Dataflow Unit (RDU), each building on the dataflow architecture with improvements in process technology, memory capacity, and compute density.
| Generation | Chip | Process Node | Key Specifications | Year |
|---|---|---|---|---|
| 1st | SN10 | TSMC 7nm | 640 PCUs, 640 PMUs, 320 MB SRAM, 300+ BF16 TFLOPS, 40B transistors, up to 1.5 TB DDR | 2020 |
| 2nd | SN30 | TSMC 7nm | Multi-die (2x SN10 dies), 86B transistors, 688 BF16 TFLOPS | 2022 |
| 4th | SN40L | TSMC 5nm | 1,040 PCUs, 520 MiB SRAM, 64 GiB HBM, up to 1.5 TiB DDR, 638 BF16 TFLOPS, 10.2 PFLOPS per rack | 2023 |
| 5th | SN50 | Not disclosed | 432 MB SRAM, 64 GB HBM2E (1.8 TB/s), 256 GB to 2 TB DDR5, 1.6 PFLOPS FP16, 3.2 PFLOPS FP8, up to 256-chip scaling | 2026 |
The SN10 was presented at Hot Chips 33 in August 2021, where SambaNova published a detailed IEEE paper on its 7nm dataflow architecture. The SN40L was the subject of a 2024 research paper on arXiv describing its approach to scaling the "AI memory wall" through its three-tier memory system and Composition of Experts.
The SN50, announced in February 2026, delivers 2.5x the 16-bit floating-point performance and 5x the FP8 performance of the SN40L. A single SambaRack SN50 combines 16 SN50 chips and averages approximately 20 kW of power. Up to 256 accelerators can be interconnected across multiple racks using a multi-terabit-per-second switched fabric with 2.2 TB/s of bidirectional chip-to-chip bandwidth. The SN50 supports models with up to 10 trillion parameters and context lengths of up to 10 million tokens.
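Treating the quoted per-socket and per-rack numbers as given, a quick arithmetic check confirms they are mutually consistent; the 16-chip SN50 rack figure in the last line is derived from the quoted specifications, not itself a published number.

```python
# Quick consistency check of the quoted per-chip and per-rack figures.
sn40l_tflops_bf16 = 638          # per socket (quoted above)
sn50_pflops_fp16 = 1.6           # per chip (quoted above)

# SN50 vs SN40L 16-bit ratio -- should be close to the quoted 2.5x.
print(f"SN50/SN40L FP16 ratio: {sn50_pflops_fp16 * 1000 / sn40l_tflops_bf16:.2f}x")

# The quoted 10.2 PFLOPS per SN40L rack matches sixteen 638-TFLOPS sockets.
print(f"16 x SN40L: {16 * sn40l_tflops_bf16 / 1000:.1f} PFLOPS per rack")

# Implied FP16 throughput of a 16-chip SambaRack SN50 (derived, not quoted).
print(f"16 x SN50:  {16 * sn50_pflops_fp16:.1f} PFLOPS per rack")
```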
SambaFlow is SambaNova's compiler and runtime software stack that bridges standard ML frameworks and the RDU hardware. Developers write models in PyTorch or TensorFlow, and SambaFlow automatically extracts the computational graph, optimizes it for dataflow execution, and maps it onto the RDU's PCUs and PMUs.
Because the compiler handles low-level optimization automatically, developers do not need to write custom CUDA kernels or hand-tune memory layouts, which SambaNova positions as an advantage over GPU-based workflows that often require significant kernel engineering to reach peak performance. A minimal sketch of this workflow appears below.
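The sketch assumes a standard PyTorch front end: the model definition and tracing use real PyTorch APIs, while `sambaflow_compile` and `run_on_rdu` are placeholder names invented for illustration and are not SambaFlow's actual API.

```python
# Hypothetical sketch of a SambaFlow-style workflow. The PyTorch
# portion is standard; `sambaflow_compile` and `run_on_rdu` are
# invented placeholder names, NOT the real SambaFlow API.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    """An ordinary PyTorch model; no RDU-specific code is needed."""
    def __init__(self, dim: int = 256, classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 512), nn.ReLU(), nn.Linear(512, classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = TinyClassifier()
example_input = torch.randn(1, 256)

# A dataflow compiler extracts the framework-level graph (shown here
# with real PyTorch tracing) and then optimizes and places it onto
# PCUs/PMUs (shown here only as commented placeholders).
graph = torch.jit.trace(model, example_input)       # extract the compute graph
# rdu_program = sambaflow_compile(graph)            # placeholder: map to PCUs/PMUs
# outputs = run_on_rdu(rdu_program, example_input)  # placeholder: execute on DataScale
```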
The DataScale system is SambaNova's on-premises hardware platform, designed as a turnkey appliance for enterprise and government customers. Each DataScale node consists of a host server module and multiple RDU accelerators connected through high-bandwidth interconnects.
| System | RDUs per Node | Memory per Node | Form Factor |
|---|---|---|---|
| DataScale SN10-8 | 8x SN10 | 3 TB, 6 TB, or 12 TB | Quarter rack |
| DataScale SN30 | Multi-die SN30 | Expanded capacity | Rack-scale |
| DataScale SN40L | SN40L accelerators | HBM + DDR tiers | Rack-scale |
DataScale systems are sold outright or offered through SambaNova's "Dataflow-as-a-Service" model, which provides access via a cloud service provider or on-premises private cloud deployment. The systems come pre-loaded with the SambaFlow software stack and SambaStudio management platform.
SambaStudio is SambaNova's management and orchestration platform, providing a graphical interface for training, fine-tuning, deploying, and managing AI models.
The Composition of Experts (CoE) is a model architecture developed by SambaNova that allows multiple specialized, fully trained models ("experts") to be orchestrated behind a single API endpoint. Unlike a mixture of experts architecture where expert sub-networks exist within a single model, CoE treats each expert as a standalone model and routes queries to the appropriate expert based on the task.
SambaNova released Samba-CoE v0.1 and subsequent versions as demonstrations of this architecture, showing that a collection of smaller specialized models can match or exceed the performance of much larger general-purpose models on domain-specific tasks.
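As a rough illustration of the routing idea (not SambaNova's implementation), the sketch below dispatches each query to one of several standalone expert models behind a single entry point. The expert names and the keyword-based router are invented for the example; a production router would typically be a learned classifier.

```python
# Toy illustration of Composition of Experts routing: several
# standalone models behind one endpoint. The experts and keyword
# router here are invented for the example.
from typing import Callable

# Each "expert" is a fully independent model; here they are stubbed
# as functions that would wrap separately trained checkpoints.
def code_expert(prompt: str) -> str:
    return f"[code expert] answering: {prompt}"

def math_expert(prompt: str) -> str:
    return f"[math expert] answering: {prompt}"

def general_expert(prompt: str) -> str:
    return f"[general expert] answering: {prompt}"

EXPERTS: dict[str, Callable[[str], str]] = {
    "code": code_expert,
    "math": math_expert,
    "general": general_expert,
}

def route(prompt: str) -> str:
    """Pick an expert for the query; a naive keyword heuristic
    stands in for a learned router."""
    lowered = prompt.lower()
    if any(k in lowered for k in ("python", "function", "bug")):
        return "code"
    if any(k in lowered for k in ("integral", "prove", "equation")):
        return "math"
    return "general"

def composition_of_experts(prompt: str) -> str:
    """Single API entry point: route, then run only the chosen expert."""
    return EXPERTS[route(prompt)](prompt)

print(composition_of_experts("Write a Python function to reverse a list"))
```

Because only the selected expert executes, per-query compute stays close to that of a single small model, which is the efficiency argument behind CoE.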
SambaNova Cloud is the company's cloud-hosted inference service, launched in September 2024. It provides API access to popular open-source large language models running on SambaNova's SN40L hardware, with an emphasis on inference speed.
The service offers an OpenAI-compatible API, making integration straightforward for developers familiar with that interface. Available models have included Meta's Llama family, DeepSeek R1 and V3, Alibaba's Qwen models, and others.
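The snippet below is a minimal sketch of calling such an endpoint with the standard `openai` Python client (v1.x); the base URL and model identifier are assumptions for illustration and should be checked against SambaNova's current documentation.

```python
# Minimal sketch of querying an OpenAI-compatible endpoint with the
# standard `openai` Python client (v1.x). The base_url and model id
# below are illustrative assumptions; consult SambaNova's docs for
# the current values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",   # assumed endpoint
    api_key=os.environ["SAMBANOVA_API_KEY"],  # assumed env variable
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",  # example model id
    messages=[
        {"role": "user",
         "content": "Summarize dataflow architectures in one sentence."},
    ],
)
print(response.choices[0].message.content)
```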
Notable speed benchmarks achieved on SambaNova Cloud:
| Model | Output Speed (tokens/sec) |
|---|---|
| Llama 3.1 8B | 1,000+ |
| Llama 3.1 70B | 461-580 |
| Llama 3.1 405B | 100-132 |
| DeepSeek R1 671B | 255 |
| DeepSeek V3-0324 | 250 |
| MiniMax M2 (230B) | 378 |
SambaNova Cloud is available in three tiers: a Free Tier with $5 in starting credits (approximately 30 million tokens on Llama 8B), a Developer Tier with pay-as-you-go billing, and an Enterprise Tier with dedicated support and custom configurations.
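As a back-of-envelope check (treating the quoted $5 and 30 million tokens as given), the implied blended price is roughly $0.17 per million tokens:

```python
# Back-of-envelope check on the Free Tier figure quoted above.
credits_usd = 5.00
tokens = 30_000_000  # approximate Llama 8B tokens per $5 of credit
price_per_million = credits_usd / (tokens / 1_000_000)
print(f"Implied blended price: ${price_per_million:.3f} per 1M tokens")
# -> Implied blended price: $0.167 per 1M tokens
```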
SambaNova has established a significant presence in the U.S. national laboratory system, making it one of the most widely deployed AI chip startups in government research facilities.
| Institution | Deployment Details |
|---|---|
| Argonne National Laboratory | Deployed DataScale systems as part of the AI Testbed at the Argonne Leadership Computing Facility. Used to train convolutional neural networks on images exceeding 50,000 x 50,000 pixel resolution, a task that was infeasible on GPU-based systems due to memory constraints. |
| Lawrence Livermore National Laboratory (LLNL) | Expanded collaboration announced in May 2023 to bring SambaNova's spatial dataflow accelerators into LLNL's Computing Center, supporting the lab's cognitive simulation program for improving speed and accuracy of scientific research. |
| Los Alamos National Laboratory | Deployed SambaNova systems for AI-driven scientific computing workloads. |
| Oak Ridge National Laboratory | Integrated SambaNova hardware for AI research applications. |
| Texas Advanced Computing Center (TACC) | Selected SambaNova AI systems to accelerate scientific research across multiple disciplines. |
Beyond government laboratories, SambaNova has also deployed its systems and services to a range of enterprise customers.
SambaNova has completed five major funding rounds since its founding:
| Round | Date | Amount | Lead Investor(s) | Post-Money Valuation |
|---|---|---|---|---|
| Series A | March 2018 | $56 million | GV | Not disclosed |
| Series B | April 2019 | $150 million | Intel Capital | Not disclosed |
| Series C | February 2020 | $250 million | BlackRock | Not disclosed |
| Series D | April 2021 | $676 million | SoftBank Vision Fund 2 | $5.1 billion |
| Series E | February 2026 | $350 million | Vista Equity Partners, Cambium Capital | Not disclosed |
| Total | | ~$1.5 billion | | |
Notable investors across all rounds include SoftBank Vision Fund, Vista Equity Partners, Intel Capital, GV (Google Ventures), BlackRock, T. Rowe Price, Battery Ventures, Temasek, GIC, Walden International, WRVI Capital, Redline Capital, Atlantic Bridge, and Cambium Capital.
The company's peak valuation of over $5 billion came with its 2021 Series D. By late 2025, its private market valuation had declined to approximately $1.6 billion amid increased competition and a challenging fundraising environment for AI hardware startups. The February 2026 Series E, along with the Intel partnership, provided a path forward without requiring an outright sale.
SambaNova operates in a highly competitive market for AI inference and training hardware. Its primary competitors include both established semiconductor companies and venture-backed startups.
| Company | Architecture | Key Differentiator |
|---|---|---|
| NVIDIA | GPU (CUDA) | Dominant market position with 80%+ share in AI accelerators. H100 and B200 GPUs are the industry standard. Massive software ecosystem with CUDA. |
| Groq | LPU (Language Processing Unit) | Deterministic, single-core architecture optimized for inference latency. NVIDIA announced a $20 billion deal to license Groq's technology in December 2025. |
| Cerebras Systems | Wafer-Scale Engine (WSE) | Largest chip ever built, using an entire silicon wafer. OpenAI announced a partnership with Cerebras in January 2026 for 750 MW of compute. |
| AMD | GPU (ROCm) | MI300X accelerator competing directly with NVIDIA on price-performance for inference. |
| Google | TPU | Custom ASIC used internally and offered through Google Cloud. |
| Intel | Gaudi / Xeon | Gaudi accelerators for AI training and inference; now partnering with SambaNova rather than competing directly. |
SambaNova differentiates itself through its dataflow architecture, which avoids the kernel-launch overhead and memory bottlenecks common in GPU-based inference. The company's three-tier memory hierarchy (SRAM, HBM, DDR) allows it to keep many large models resident in memory simultaneously, which is particularly advantageous for multi-model serving and agentic AI workloads that require rapid switching between different models.
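A rough, weights-only calculation, assuming 16-bit weights and the 1.5 TiB DDR figure quoted earlier, illustrates why the DDR tier enables multi-model serving; it ignores KV caches and runtime overhead, so real capacity is lower.

```python
# Rough, weights-only estimate of how many large models fit in the
# SN40L's 1.5 TiB DDR tier. Ignores KV caches and runtime overhead.
BYTES_PER_PARAM_BF16 = 2
ddr_bytes = 1.5 * 2**40            # 1.5 TiB

for params_b in (8, 70, 405):      # model sizes in billions of parameters
    model_bytes = params_b * 1e9 * BYTES_PER_PARAM_BF16
    print(f"{params_b:>4}B params: {model_bytes / 2**30:7.1f} GiB/model, "
          f"~{int(ddr_bytes // model_bytes)} models resident in DDR")
```

On these assumptions roughly eleven 70B-parameter models (or two 405B models) can sit in DDR at once, which is the property the multi-model and agentic serving claims rest on.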
The AI inference market is projected to grow from approximately $106 billion in 2025 to $255 billion by 2030, though NVIDIA, AMD, and cloud providers' custom ASICs are expected to retain 80-95% of market share. SambaNova, Groq, and Cerebras collectively target the remaining segment with specialized architectures that offer advantages in speed, efficiency, or scale for specific workload types.
| Name | Title | Background |
|---|---|---|
| Rodrigo Liang | Co-Founder and CEO | Stanford BS/MS in Electrical Engineering. Former SVP of SPARC Processor and ASIC Development at Oracle. Over 15 years at Sun Microsystems/Oracle leading chip design teams. |
| Kunle Olukotun | Co-Founder and Chief Technologist | Cadence Design Professor of Electrical Engineering and Computer Science at Stanford. Pioneer of multicore processor design. Founded Afara Websystems (acquired by Sun Microsystems). ACM Fellow, IEEE Fellow, member of the National Academy of Engineering. Recipient of the 2023 ACM-IEEE CS Eckert-Mauchly Award. |
| Christopher Ré | Co-Founder | Associate Professor of Computer Science at Stanford. Leader of the Hazy Research group. Co-founder of Snorkel AI and two companies acquired by Apple (Lattice/DeepDive and Inductiv/HoloClean). MacArthur Fellow. |