Positron AI

AI Companies AI Hardware AI Inference

9 min read

Updated Jun 7, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 7, 2026

Fact-checked

In review queue

Sources

15 citations

Revision

v3 · 1,755 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Positron AI is an American semiconductor startup headquartered in Reno, Nevada, that designs and manufactures purpose-built hardware for transformer inference. Founded in 2023 by chief executive Mitesh Agrawal and chief technology officer Thomas Sohmers, the company markets a memory-optimized appliance called Atlas that it positions as a more power-efficient alternative to general-purpose Nvidia GPUs for serving large language models.^[1]^[2] Positron raised a $230 million Series B in February 2026 at a valuation above $1 billion, bringing total disclosed funding to roughly $305 million and making the three-year-old company an AI-hardware "unicorn."^[3]^[4]

Background

Positron was founded in the spring of 2023 with a thesis that mainstream GPUs are over-provisioned for transformer inference workloads, where the ratio of compute to memory operations is close to 1:1 and overall throughput is bottlenecked by memory bandwidth rather than arithmetic.^[5] Rather than chase a general-purpose accelerator, the company designed hardware exclusively around the matrix and attention operations of transformer models, with the explicit goal of running the largest open and proprietary LLMs at the lowest cost-per-token and within power envelopes compatible with conventional air-cooled data centers.^[5]^[6]

The company is headquartered in Reno, Nevada, and emphasizes that its hardware is "designed, fabricated, and assembled in the U.S.," with the first-generation silicon fabricated in Arizona.^[4]^[7] In its own published timeline, Positron operated with fewer than ten employees through the first eight months of its existence, when it produced its initial Llama-2 7B prototype on an FPGA platform; by Month 15 it had shipped the production Atlas system, and by early 2026 it had completed two major funding rounds.^[6]

Founders and leadership

Thomas Sohmers, co-founder and chief technology officer, is a serial processor architect who received a Thiel Fellowship in 2013 at age 17 to start REX Computing, a fabless semiconductor company that designed energy-efficient processors for high-performance computing and DSP workloads. He also co-founded a cryptocurrency-ASIC venture called Terapute Technologies and later served as Director of Technology Strategy and Head of Distributed Systems at Groq, where he worked on data-center-scale machine-learning systems before founding Positron.^[8]

Mitesh Agrawal, chief executive officer, joined Positron as CEO in early 2025 after spending roughly eight and a half years at Lambda (formerly Lambda Labs), most recently as chief operating officer. At Lambda he is credited with helping scale the GPU-cloud business from approximately $500,000 to about $500 million in annualized revenue run-rate and with leading several of the company's later funding rounds.^[9]^[10] His move from a Nvidia-aligned neocloud to a Nvidia competitor was widely interpreted as a signal that demand for non-Nvidia inference silicon had matured beyond a niche.^[10]

Edward Kmett serves as Chief Scientist. He is identified in Positron's Series A announcement alongside Sohmers and Agrawal as part of the founding leadership team.^[11]

Positron's own materials describe the founding group as "a visionary, an applied mathematician, and an engineer," and disclose that the company recruited Agrawal as its new chief executive at Month 21 of operation, replacing earlier leadership.^[6]

Product: Atlas

Architecture and philosophy

Atlas is Positron's first-generation product, marketed as a complete "Transformer Inference Server" rather than a bare chip.^[12] The system is built around eight Positron Archer transformer accelerators, each paired with 32 GB of HBM for a total of 256 GB of accelerator memory per server.^[12] The Archer accelerators are coupled to dual AMD EPYC Genoa 9374F host processors (64 cores total) and a 24-channel DDR5 memory subsystem standard at 384 GB and expandable to 2 TB, supporting models that spill outside HBM into host memory.^[12]

The first-generation Atlas chips use a memory-optimized, FPGA-based architecture, according to coverage of Positron's Series A round. Positron reports that this design achieves roughly 93% HBM bandwidth utilization, compared with a typical 10-30% utilization on GPU-based systems, and that a single 2-kilowatt Atlas server can host models with up to half a trillion parameters.^[11]^[13] The company's stated design principle is that transformer inference is fundamentally a memory-bandwidth problem, so silicon area and power should be spent moving weights into matrix engines rather than on the general-purpose flexibility of a GPU.^[5]

Atlas exposes an OpenAI-compatible API endpoint and is binary-compatible with Hugging Face Transformers model checkpoints, which Positron presents as a drop-in path from existing GPU deployments.^[11]

Performance claims

Positron publishes head-to-head comparisons against Nvidia H100 and H200 systems. On Llama 3.1 8B in BF16 (with no speculative decoding or paged attention), Positron reports approximately 280 tokens per second per user on a 2,000-watt Atlas server, compared with roughly 180 tokens per second per user from an eight-GPU Nvidia DGX H200 drawing about 5,900 watts.^[1]^[12] Translated into the company's headline metrics, Atlas claims roughly 3.08x performance-per-dollar and 4.54x performance-per-watt versus the DGX H200, with "3x lower latency in production workloads."^[12] Against the prior-generation H100, Positron has cited 3.5x better performance-per-dollar and up to 66% lower power consumption.^[11]^[13]

Form factor

A single Atlas chassis is 7"H x 19"W x 29.25"D, weighs roughly 100 pounds (45.4 kg), and is powered by dual 2,000-watt redundant Titanium-rated power supplies. It is air-cooled and does not require the liquid cooling or extreme rack power densities associated with current high-end GPU clusters, which Positron presents as enabling deployment in "hundreds of existing data centers" without facility upgrades.^[12]^[13]

Funding

Positron has disclosed three priced equity rounds, totaling roughly $305 million.^[3]

Seed (through 2024). Positron brought Atlas from concept to shipping product on approximately $12.5 million in seed financing over roughly 18 months, according to company statements.^[3]
Strategic / extension round (February 2025). Positron disclosed a $23.5 million investment from Incline Village-based Flume Ventures.^[7]
Series A (July 2025). The company closed an oversubscribed $51.6 million Series A co-led by Valor Equity Partners, Atreides Management, and DFJ Growth, with participation from Flume Ventures (an entity associated with Sun Microsystems co-founder Scott McNealy), Resilience Reserve, 1517 Fund, and Unless. Positron said the funds would extend Atlas deployments and accelerate development of a second-generation product targeted for 2026.^[11]
Series B (February 2026). Positron announced a $230 million Series B at a post-money valuation above $1 billion, co-led by Arena Private Wealth, Jump Trading, and Unless, with new strategic investment from the Qatar Investment Authority (QIA), Arm, and Helena. According to the company, Jump Trading first engaged Positron as a customer and elected to co-lead the round after evaluating the company's technology and roadmap.^[4]^[14]^[15] The round brought total capital raised to just over $300 million.^[4]^[14] Coverage framed the QIA participation as part of Qatar's broader push to fund sovereign AI infrastructure outside the Nvidia ecosystem.^[14]

Customers

Positron's first publicly named production customers are:

Cloudflare, which uses Atlas in its globally distributed, power-constrained edge data centers, where the air-cooled form factor and per-watt efficiency are particularly relevant.^[5]^[13]
Parasail, in conjunction with its SnapServe inference service.^[11]

Positron has also stated that Atlas is "already deployed in production environments" at additional unnamed enterprise customers and "leading neocloud providers."^[13]

Roadmap: Asimov and Titan

Positron's announced second-generation silicon is named Asimov, a custom ASIC (rather than the FPGA-based design of the first Atlas chips) for which the company has said it plans a tape-out in late 2026, with production targeted for early 2027.^[4]^[15] Asimov is designed to support up to 2 TB of high-speed memory per accelerator, and the next-generation system that houses it, Titan, is designed to provide 8 TB of memory per system at memory bandwidth comparable to Nvidia's Rubin-generation GPUs. Positron claims Titan will be able to hold models on the order of 16 trillion parameters in a single chassis.^[11]^[14] Announcing the Series B, chief executive Mitesh Agrawal stated that the next-generation chip "will deliver 5x more tokens per watt in our core workloads versus Nvidia's upcoming Rubin GPU," a vendor projection that had not been independently benchmarked as of early 2026.^[15] The Series B funding is explicitly earmarked to accelerate the Asimov/Titan roadmap.^[4]

Competition

Positron is one of a cohort of post-2020 startups attempting to take share from Nvidia in inference rather than training. Its closest analogues by architectural philosophy include:

Groq, whose LPU uses deterministic compiler-scheduled execution and on-chip SRAM rather than HBM, optimizing for token latency on hosted endpoints.
Etched Sohu, which has gone even further than Positron in specialization by burning the transformer architecture directly into silicon.
Cerebras Systems, whose wafer-scale WSE-3 takes the opposite memory strategy, eliminating HBM in favor of enormous on-die SRAM.
SambaNova Systems and Tenstorrent, which sell more general-purpose dataflow architectures targeting both inference and training.

Compared with these rivals, Positron's distinguishing bets are (1) attaching the largest practical pool of HBM and host DDR5 to a transformer-specific datapath so that very large models fit in a single 2 kW chassis, and (2) pursuing a U.S.-domestic supply chain, with fabrication in Arizona and assembly in the United States, at a moment when both customers and the U.S. government are paying closer attention to onshore AI infrastructure.^[4]^[11]

References

"Positron AI says its Atlas accelerator beats Nvidia H200 on inference in just 33% of the power," Tom's Hardware. https://www.tomshardware.com/tech-industry/artificial-intelligence/positron-ai-says-its-atlas-accelerator-beats-nvidia-h200-on-inference-in-just-33-percent-of-the-power-delivers-280-tokens-per-second-per-user-with-llama-3-1-8b-in-2000w-envelope Accessed 2026-05-31. ↩
"AI Hardware Industry Veteran Mitesh Agrawal Joins Positron as CEO," Business Wire, 3 February 2025. https://www.businesswire.com/news/home/20250203502454/en/AI-Hardware-Industry-Veteran-Mitesh-Agrawal-Joins-Positron-as-CEO%E2%80%94A-Startup-Aiming-to-Challenge-Nvidia%E2%80%99s-Grip-on-AI-Infrastructure Accessed 2026-05-31. ↩
"Positron AI Raises $230M Series B at Over $1B Valuation to Scale Energy-Efficient AI Inference," The AI Insider, 4 February 2026. https://theaiinsider.tech/2026/02/04/positron-ai-raises-230m-series-b-at-over-1-billion-valuation-to-scale-energy-efficient-ai-inference/ Accessed 2026-05-31. ↩
"Exclusive: Positron raises $230M Series B to take on Nvidia's AI chips," TechCrunch, 4 February 2026. https://techcrunch.com/2026/02/04/exclusive-positron-raises-230m-series-b-to-take-on-nvidias-ai-chips/ Accessed 2026-05-31. ↩
"Positron believes it has found the secret to take on Nvidia in AI inference chips," VentureBeat. https://venturebeat.com/ai/positron-believes-it-has-found-the-secret-to-take-on-nvidia-in-ai-inference-chips-heres-how-it-could-benefit-enterprises Accessed 2026-05-31. ↩
"About Positron," Positron AI. https://www.positron.ai/about Accessed 2026-05-31. ↩
"Positron secures $23.5 million for AI innovation," Nevada Appeal, 23 February 2025. https://www.nevadaappeal.com/news/2025/feb/23/positron-secures-235-million-for-ai-innovation/ Accessed 2026-05-31. ↩
Thomas Sohmers, LinkedIn / Crunchbase founder profile (REX Computing, Terapute, Groq, Positron). https://www.linkedin.com/in/trsohmers/ Accessed 2026-05-31. ↩
Mitesh Agrawal, LinkedIn profile (Lambda COO, Positron CEO). https://www.linkedin.com/in/mitesh7/ Accessed 2026-05-31. ↩
"Lambda Labs' COO has left the AI cloud provider to head Positron, a startup trying to compete with Nvidia." https://bestofai.com/article/lambda-labs-coo-has-left-the-ai-cloud-provider-to-head-positron-a-startup-trying-to-compete-with-nvidia Accessed 2026-05-31. ↩
"Positron AI Closes $51.6M in Oversubscribed Series A," The AI Insider, 30 July 2025. https://theaiinsider.tech/2025/07/30/positron-ai-closes-51-6m-in-oversubscribed-series-a-to-accelerate-inference-optimized-hardware/ Accessed 2026-05-31. ↩
Atlas product page, Positron AI. https://www.positron.ai/atlas Accessed 2026-05-31. ↩
"Positron AI Secures $51.6 Million in Oversubscribed Series A," Business Wire, 28 July 2025. https://www.businesswire.com/news/home/20250728912387/en/Positron-AI-Secures-$51.6-Million-in-Oversubscribed-Series-A-to-Accelerate-Inference-Optimized-Hardware Accessed 2026-05-31. ↩
"AI Chip Startup Positron Raises $230 Million From Arm, Qatar to Compete With Nvidia," Bloomberg, 4 February 2026. https://www.bloomberg.com/news/articles/2026-02-04/ai-chip-startup-positron-raises-230-million-from-arm-qatar-to-compete-with-nvidia Accessed 2026-05-31. ↩
"Positron AI Raises $230 Million Series B at Over $1 Billion Valuation to Scale Energy-Efficient AI Inference," Business Wire, 4 February 2026. https://www.businesswire.com/news/home/20260204250472/en/Positron-AI-Raises-$230-Million-Series-B-at-Over-$1-Billion-Valuation-to-Scale-Energy-Efficient-AI-Inference Accessed 2026-05-31. ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

2 revisions by 1 contributor · full history

Suggest edit

What links here

Nvidia