SambaNova Systems

SambaNova Systems is an American artificial intelligence hardware and software company headquartered in Palo Alto, California. Founded in 2017 by Stanford University professors Kunle Olukotun and Christopher Ré along with semiconductor industry veteran Rodrigo Liang, the company designs custom AI processors called Reconfigurable Dataflow Units (RDUs) and builds full-stack platforms for training and inference of large language models. SambaNova has raised over $1.4 billion in venture funding from investors including SoftBank Vision Fund, Intel Capital, Vista Equity Partners, BlackRock, and GV. The company competes with NVIDIA, Groq, and Cerebras Systems in the AI accelerator market.

History

Founding and Early Years (2017-2019)

SambaNova Systems was incorporated in November 2017 by three co-founders with deep roots in processor architecture and machine learning research at Stanford University. Kunle Olukotun, a professor of electrical engineering and computer science at Stanford, is widely recognized as a pioneer of multicore processor design. His earlier company, Afara Websystems, developed the Niagara multi-core chip that was acquired by Sun Microsystems in 2002 and became the foundation of Oracle's SPARC server line. Christopher Ré, an associate professor of computer science at Stanford and leader of the Hazy Research group, brought expertise in machine learning systems and data management. Ré is also known for co-founding Snorkel AI and two other startups later acquired by Apple. Rodrigo Liang, who serves as CEO, previously spent over 15 years at Sun Microsystems and Oracle, where he rose to Senior Vice President of SPARC Processor and ASIC Development and oversaw the release of 12 major processor designs.

The founding thesis behind SambaNova was that the dominant computing architectures of the time, originally designed for general-purpose workloads, were fundamentally ill-suited for the dataflow patterns of deep learning. Rather than adapting existing GPU or CPU designs, the co-founders set out to build an entirely new processor architecture from scratch, optimized specifically for AI and machine learning operations.

The company operated in stealth mode during its first year while assembling a team of chip architects, compiler engineers, and ML researchers. In March 2018, SambaNova emerged from stealth with a $56 million Series A round led by GV (formerly Google Ventures), with participation from Walden International, Redline Capital Management, and Atlantic Bridge.

Growth and Product Development (2019-2021)

In April 2019, SambaNova closed a $150 million Series B round led by Intel Capital, with continued participation from GV and existing investors. The funding accelerated development of the company's first-generation chip, the SN10 RDU, which was taped out on TSMC's 7nm process in the first half of 2019.

The company's first commercial product, the DataScale SN10 system, began shipping to customers in early 2020. In February 2020, SambaNova announced a $250 million Series C round led by funds and accounts managed by BlackRock, with participation from GV, Intel Capital, Walden International, WRVI Capital, and Redline Capital.

By 2021, SambaNova had secured deployments at several U.S. Department of Energy national laboratories. In April 2021, the company raised a $676 million Series D round led by SoftBank Vision Fund 2, with participation from new investors Temasek and GIC alongside existing backers including BlackRock, Intel Capital, GV, and Walden International. This round valued SambaNova at over $5 billion, making it one of the most highly valued AI chip startups at the time. Total funding after the Series D exceeded $1.1 billion.

Strategic Pivot to Inference (2022-2024)

In September 2022, SambaNova launched its second-generation system, the DataScale SN30, powered by the Cardinal SN30 RDU. The SN30 used multi-die packaging to double compute capacity by combining two RDU dies into a single socket. Built on TSMC's 7nm process, the SN30 contained 86 billion transistors and delivered 688 teraflops at bfloat16 precision. SambaNova reported that the SN30 achieved a 6x speedup over comparable NVIDIA A100-based systems on certain training workloads.

In September 2023, SambaNova unveiled the SN40L, its fourth-generation RDU built on TSMC's 5nm process. The SN40L introduced a three-tier memory hierarchy combining 520 MiB of on-chip PMU SRAM, 64 GiB of co-packaged HBM, and up to 1.5 TiB of DDR DRAM via pluggable DIMMs. This design allowed the chip to hold much larger models in memory than competing architectures and was specifically co-designed for both training and inference of large foundation models. The SN40L featured 1,040 Pattern Compute Units (PCUs), delivered 638 BF16 TFLOPS per socket, and could handle models with up to 5 trillion parameters.

This period also marked a strategic shift in the company's focus. While SambaNova originally positioned itself as a training platform, the rapid growth of generative AI and the explosion of inference demand led the company to reposition around high-performance AI inference as its primary market.

On September 11, 2024, SambaNova launched SambaNova Cloud, a cloud-based AI inference service powered by the SN40L. At launch, the platform offered Meta's Llama 3.1 405B model at 132 output tokens per second and the 70B variant at 461 tokens per second at full precision. Artificial Analysis independently verified these speeds, confirming them as the fastest available from any cloud API provider at the time, faster than endpoints offered by OpenAI, Anthropic, and Google.

Funding Challenges and Intel Partnership (2025-2026)

During 2025, reports emerged that SambaNova had struggled to close a new funding round as competition in the AI chip market intensified. The company's private market valuation had declined significantly from its 2021 peak. In late 2025, Intel reportedly entered advanced acquisition discussions to buy SambaNova for approximately $1.6 billion, including debt. Those talks ultimately did not result in a deal.

In February 2026, SambaNova secured a $350 million Series E round led by Vista Equity Partners and Cambium Capital. Intel Capital, T. Rowe Price, BlackRock, and Battery Ventures also participated. As part of the announcement, SambaNova and Intel established a multi-year collaboration to co-develop AI inference systems combining SambaNova's RDU accelerators with Intel's Xeon CPUs. Intel plans to co-market and co-sell the resulting platforms through its enterprise sales channels. This round brought SambaNova's total funding to approximately $1.5 billion.

Alongside the funding announcement, SambaNova unveiled the SN50, its fifth-generation RDU purpose-built for agentic AI inference, with shipments expected later in 2026. SoftBank has signed on as one of the first customers for the SN50.

Technology

Reconfigurable Dataflow Architecture (RDA)

At the core of SambaNova's technology is its Reconfigurable Dataflow Architecture (RDA), which takes a fundamentally different approach from the SIMT (Single Instruction, Multiple Threads) model used by GPUs. In a traditional GPU, a program counter drives execution across thousands of threads that process data stored in a memory hierarchy. In SambaNova's dataflow model, data flows through a network of reconfigurable compute and memory units, with the computation defined by the structure of the graph rather than a sequential instruction stream.

The RDA is built around three primary on-chip components:

Component	Function
Pattern Compute Units (PCUs)	Perform arithmetic and logic operations. Each PCU contains six SIMD stages with 16 lanes each, providing 96 functional units per PCU.
Pattern Memory Units (PMUs)	Serve as distributed on-chip SRAM for storing model weights, activations, and intermediate data close to computation.
Switches	Form a high-speed, three-dimensional on-chip switching fabric that routes data between PCUs and PMUs with minimal latency.

This architecture allows the compiler to map the dataflow graph of a neural network directly onto the hardware, eliminating many of the inefficiencies associated with GPU kernel launches, memory copies, and thread synchronization. Because the units are reconfigurable, the same physical hardware can be reprogrammed for different model architectures without the fixed-function limitations of traditional ASICs.

RDU Chip Generations

SambaNova has released multiple generations of its Reconfigurable Dataflow Unit (RDU), each building on the dataflow architecture with improvements in process technology, memory capacity, and compute density.

Generation	Chip	Process Node	Key Specifications	Year
1st	SN10	TSMC 7nm	640 PCUs, 640 PMUs, 320 MB SRAM, 300+ BF16 TFLOPS, 40B transistors, up to 1.5 TB DDR	2020
2nd	SN30	TSMC 7nm	Multi-die (2x SN10 dies), 86B transistors, 688 BF16 TFLOPS	2022
4th	SN40L	TSMC 5nm	1,040 PCUs, 520 MiB SRAM, 64 GiB HBM, up to 1.5 TiB DDR, 638 BF16 TFLOPS, 10.2 PFLOPS per rack	2023
5th	SN50	Not disclosed	432 MB SRAM, 64 GB HBM2E (1.8 TB/s), 256 GB to 2 TB DDR5, 1.6 PFLOPS FP16, 3.2 PFLOPS FP8, up to 256-chip scaling	2026

The SN10 was presented at Hot Chips 33 in August 2021, where SambaNova published a detailed IEEE paper on its 7nm dataflow architecture. The SN40L was the subject of a 2024 research paper on arXiv describing its approach to scaling the "AI memory wall" through its three-tier memory system and Composition of Experts.

The SN50, announced in February 2026, delivers 2.5x the 16-bit floating-point performance and 5x the FP8 performance of the SN40L. A single SambaRack SN50 combines 16 SN50 chips and averages approximately 20 kW of power. Up to 256 accelerators can be interconnected across multiple racks using a multi-terabit-per-second switched fabric with 2.2 TB/s of bidirectional chip-to-chip bandwidth. The SN50 supports models with up to 10 trillion parameters and context lengths of up to 10 million tokens.

SambaFlow Software Stack

SambaFlow is SambaNova's compiler and runtime software stack that bridges standard ML frameworks and the RDU hardware. Developers write models in PyTorch or TensorFlow, and SambaFlow automatically extracts the computational graph, optimizes it for dataflow execution, and maps it onto the RDU's PCUs and PMUs.

The SambaFlow stack includes:

SambaFlow Compiler: Translates high-level model definitions into RDU configuration sequences, handling tiling, weight and data partitioning, and flow control automatically.
SambaFlow Runtime: Manages communication with the DataScale hardware, including hardware initialization, error handling, resource management, and process scheduling.
SambaFlow Python SDK: Provides a developer-facing API for creating, compiling, and running models on the RDU.

Because the compiler handles low-level optimization, developers do not need to write custom CUDA kernels or hand-tune memory layouts, which SambaNova positions as an advantage over GPU-based workflows that often require significant kernel engineering for peak performance.

Products and Services

SambaNova DataScale

The DataScale system is SambaNova's on-premises hardware platform, designed as a turnkey appliance for enterprise and government customers. Each DataScale node consists of a host server module and multiple RDU accelerators connected through high-bandwidth interconnects.

System	RDUs per Node	Memory per Node	Form Factor
DataScale SN10-8	8x SN10	3 TB, 6 TB, or 12 TB	Quarter rack
DataScale SN30	Multi-die SN30	Expanded capacity	Rack-scale
DataScale SN40L	SN40L accelerators	HBM + DDR tiers	Rack-scale

DataScale systems are sold outright or offered through SambaNova's "Dataflow-as-a-Service" model, which provides access via a cloud service provider or on-premises private cloud deployment. The systems come pre-loaded with the SambaFlow software stack and SambaStudio management platform.

SambaStudio

SambaStudio is SambaNova's management and orchestration platform with a graphical interface for training, fine-tuning, deploying, and managing AI models. Features include:

Model Hub: A catalog of pre-trained and fine-tuned models available for deployment, with filtering and search capabilities.
Training Jobs: Users can launch training runs and fine-tuning jobs on the DataScale hardware, with support for distributed training across multiple RDUs.
Endpoint Deployment: Models can be deployed as API endpoints for real-time inference, with configurable scaling and access controls.
Batch Inference: Support for running predictions over large datasets in batch mode.
Role-Based Access Control: Multi-tenant support with predefined roles for managing users, resources, and model permissions within an organization.

Composition of Experts (CoE)

The Composition of Experts (CoE) is a model architecture developed by SambaNova that allows multiple specialized, fully trained models ("experts") to be orchestrated behind a single API endpoint. Unlike a mixture of experts architecture where expert sub-networks exist within a single model, CoE treats each expert as a standalone model and routes queries to the appropriate expert based on the task.

Key characteristics of CoE include:

Multi-Model Serving: Hundreds of domain-specific models (for example, finance, legal, engineering, and healthcare) can reside in the RDU's large memory tiers and be served from a single system.
Dynamic Routing: An orchestration layer routes incoming requests to the best-suited expert model, enabling broad coverage without requiring a single monolithic model.
Model Ownership and Privacy: Organizations can fine-tune their own expert models with proprietary data while maintaining ownership and access controls.
Efficient Memory Utilization: The RDU's three-tier memory hierarchy allows many models to be loaded simultaneously, avoiding the latency of swapping models in and out of GPU memory.

SambaNova released Samba-CoE v0.1 and subsequent versions as demonstrations of this architecture, showing that a collection of smaller specialized models can match or exceed the performance of much larger general-purpose models on domain-specific tasks.

SambaNova Cloud

SambaNova Cloud is the company's cloud-hosted inference service, launched in September 2024. It provides API access to popular open-source large language models running on SambaNova's SN40L hardware, with an emphasis on inference speed.

The service offers an OpenAI-compatible API, making integration straightforward for developers familiar with that interface. Available models have included Meta's Llama family, DeepSeek R1 and V3, Alibaba's Qwen models, and others.

Notable speed benchmarks achieved on SambaNova Cloud:

Model	Output Speed (tokens/sec)
Llama 3.1 8B	1,000+
Llama 3.1 70B	461-580
Llama 3.1 405B	100-132
DeepSeek R1 671B	255
DeepSeek V3-0324	250
MiniMax M2 (230B)	378

SambaNova Cloud is available in three tiers: a Free Tier with $5 in starting credits (approximately 30 million tokens on Llama 8B), a Developer Tier with pay-as-you-go billing, and an Enterprise Tier with dedicated support and custom configurations.

Enterprise and Government Deployments

SambaNova has established a significant presence in the U.S. national laboratory system, making it one of the most widely deployed AI chip startups in government research facilities.

National Laboratory Partnerships

Institution	Deployment Details
Argonne National Laboratory	Deployed DataScale systems as part of the AI Testbed at the Argonne Leadership Computing Facility. Used to train convolutional neural networks on images exceeding 50,000 x 50,000 pixel resolution, a task that was infeasible on GPU-based systems due to memory constraints.
Lawrence Livermore National Laboratory (LLNL)	Expanded collaboration announced in May 2023 to bring SambaNova's spatial dataflow accelerators into LLNL's Computing Center, supporting the lab's cognitive simulation program for improving speed and accuracy of scientific research.
Los Alamos National Laboratory	Deployed SambaNova systems for AI-driven scientific computing workloads.
Oak Ridge National Laboratory	Integrated SambaNova hardware for AI research applications.
Texas Advanced Computing Center (TACC)	Selected SambaNova AI systems to accelerate scientific research across multiple disciplines.

Enterprise Customers

Beyond government laboratories, SambaNova has deployed systems and services to a range of enterprise customers:

Accenture: Technology consulting partnership for enterprise AI deployments.
Aramco: Energy sector AI applications.
NetApp: Data infrastructure and storage-related AI workloads.
OTP Bank: Financial services AI applications.
SoftBank: Strategic customer and investor, signed on as one of the first SN50 customers.
RIKEN Center for Computational Science (Japan): Scientific computing and research applications.

Funding History

SambaNova has completed five major funding rounds since its founding:

Round	Date	Amount	Lead Investor(s)	Post-Money Valuation
Series A	March 2018	$56 million	GV	Not disclosed
Series B	April 2019	$150 million	Intel Capital	Not disclosed
Series C	February 2020	$250 million	BlackRock	Not disclosed
Series D	April 2021	$676 million	SoftBank Vision Fund 2	$5.1 billion
Series E	February 2026	$350 million	Vista Equity Partners, Cambium Capital	Not disclosed
Total		~$1.5 billion

Notable investors across all rounds include SoftBank Vision Fund, Vista Equity Partners, Intel Capital, GV (Google Ventures), BlackRock, T. Rowe Price, Battery Ventures, Temasek, GIC, Walden International, WRVI Capital, Redline Capital, Atlantic Bridge, and Cambium Capital.

The company's peak valuation of over $5 billion came with its 2021 Series D. By late 2025, its private market valuation had declined to approximately $1.6 billion amid increased competition and a challenging fundraising environment for AI hardware startups. The February 2026 Series E, along with the Intel partnership, provided a path forward without requiring an outright sale.

Competitive Landscape

SambaNova operates in a highly competitive market for AI inference and training hardware. Its primary competitors include both established semiconductor companies and venture-backed startups.

Company	Architecture	Key Differentiator
NVIDIA	GPU (CUDA)	Dominant market position with 80%+ share in AI accelerators. H100 and B200 GPUs are the industry standard. Massive software ecosystem with CUDA.
Groq	LPU (Language Processing Unit)	Deterministic, single-core architecture optimized for inference latency. NVIDIA announced a $20 billion deal to license Groq's technology in December 2025.
Cerebras Systems	Wafer-Scale Engine (WSE)	Largest chip ever built, using an entire silicon wafer. OpenAI announced a partnership with Cerebras in January 2026 for 750 MW of compute.
AMD	GPU (ROCm)	MI300X accelerator competing directly with NVIDIA on price-performance for inference.
Google	TPU	Custom ASIC used internally and offered through Google Cloud.
Intel	Gaudi / Xeon	Gaudi accelerators for AI training and inference; now partnering with SambaNova rather than competing directly.

SambaNova differentiates itself through its dataflow architecture, which avoids the kernel-launch overhead and memory bottlenecks common in GPU-based inference. The company's three-tier memory hierarchy (SRAM, HBM, DDR) allows it to keep many large models resident in memory simultaneously, which is particularly advantageous for multi-model serving and agentic AI workloads that require rapid switching between different models.

The AI inference market is projected to grow from approximately $106 billion in 2025 to $255 billion by 2030, though NVIDIA, AMD, and cloud providers' custom ASICs are expected to retain 80-95% of market share. SambaNova, Groq, and Cerebras collectively target the remaining segment with specialized architectures that offer advantages in speed, efficiency, or scale for specific workload types.

Leadership

Name	Title	Background
Rodrigo Liang	Co-Founder and CEO	Stanford BS/MS in Electrical Engineering. Former SVP of SPARC Processor and ASIC Development at Oracle. Over 15 years at Sun Microsystems/Oracle leading chip design teams.
Kunle Olukotun	Co-Founder and Chief Technologist	Cadence Design Professor of Electrical Engineering and Computer Science at Stanford. Pioneer of multicore processor design. Founded Afara Websystems (acquired by Sun Microsystems). ACM Fellow, IEEE Fellow, member of the National Academy of Engineering. Recipient of the 2023 ACM-IEEE CS Eckert-Mauchly Award.
Christopher Ré	Co-Founder	Associate Professor of Computer Science at Stanford. Leader of the Hazy Research group. Co-founder of Snorkel AI and two companies acquired by Apple (Lattice/DeepDive and Inductiv/HoloClean). MacArthur Fellow.

References

SambaNova Systems. "SambaNova Raises $676M in Series D, Surpasses $5B Valuation and Becomes World's Best-Funded AI Startup." BusinessWire, April 13, 2021.
SambaNova Systems. "SN10 RDU: A 7nm Dataflow Architecture to Accelerate Software 2.0." IEEE International Solid-State Circuits Conference (ISSCC), 2022.
Prabhakar, R., et al. "SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts." arXiv:2405.07518, May 2024.
SambaNova Systems. "SambaNova Launches the World's Fastest AI Platform." BusinessWire, September 10, 2024.
SambaNova Systems. "Introducing the SN50 RDU: Purpose-Built for Agentic Inference." SambaNova Blog, February 24, 2026.
The Register. "SambaNova raises $350M with Intel backing." February 24, 2026.
SiliconANGLE. "SambaNova steps up its challenge to Nvidia with new chip, $350M funding and a powerful ally in Intel." February 24, 2026.
SambaNova Systems. "SambaNova and Lawrence Livermore National Laboratory Scale Up Collaboration to Accelerate AI for Science." BusinessWire, May 22, 2023.
SambaNova Systems. "Accelerating Scientific Applications With SambaNova Reconfigurable Dataflow Architecture." IEEE Transactions, 2021.
HPCwire. "SambaNova Launches Second-Gen DataScale System." September 14, 2022.
ServeTheHome. "SambaNova SN10 RDU at Hot Chips 33." August 2021.
ACM. "Pioneer of Multicore Processor Design Receives the ACM-IEEE CS Eckert-Mauchly Award." June 2023.
SambaNova Systems. "Composition of Experts." sambanova.ai/technology/composition-of-experts.
Artificial Analysis. "DeepSeek R1 Update." artificialanalysis.ai, 2025.
Tom's Hardware. "Sambanova introduces new AI accelerator, partners with Intel to deploy Xeon CPUs for inferencing and agentic workloads." February 2026.

History

Founding and Early Years (2017-2019)

Growth and Product Development (2019-2021)

Strategic Pivot to Inference (2022-2024)

Funding Challenges and Intel Partnership (2025-2026)

Technology

Reconfigurable Dataflow Architecture (RDA)

RDU Chip Generations

SambaFlow Software Stack

Products and Services

SambaNova DataScale

SambaStudio

Composition of Experts (CoE)

SambaNova Cloud

Enterprise and Government Deployments

National Laboratory Partnerships

Enterprise Customers

Funding History

Competitive Landscape

Leadership

See Also

References

Improve this article

Related Articles

Groq

Sparse autoencoder

GELU (Gaussian Error Linear Unit)

LeNet

DoorDash

Safe Superintelligence Inc

History

Founding and Early Years (2017-2019)

Growth and Product Development (2019-2021)

Strategic Pivot to Inference (2022-2024)

Funding Challenges and Intel Partnership (2025-2026)

Technology

Reconfigurable Dataflow Architecture (RDA)

RDU Chip Generations

SambaFlow Software Stack

Products and Services

SambaNova DataScale

SambaStudio

Composition of Experts (CoE)

SambaNova Cloud

Enterprise and Government Deployments

National Laboratory Partnerships

Enterprise Customers

Funding History

Competitive Landscape

Leadership

See Also

References

Related Articles

Groq

Sparse autoencoder

GELU (Gaussian Error Linear Unit)

LeNet

DoorDash

Safe Superintelligence Inc