Groq

AI Companies AI Hardware

11 min read

Updated Jun 21, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 21, 2026

Fact-checked

In review queue

Sources

22 citations

Revision

v3 · 2,259 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Groq, Inc. is an American semiconductor and cloud computing company, headquartered in Mountain View, California, that builds processors and data center infrastructure for AI inference, the serving of trained models rather than their training. Founded in 2016 by Jonathan Ross, one of the creators of Google's tensor processing unit (TPU), Groq designed the language processing unit (LPU), a deterministic, SRAM-only chip that Groq calls "a new category of processor" built "from the ground up to meet the unique needs of AI" ^[4]. The LPU became known in 2024 for serving large language models at record speeds, more than 300 tokens per second per user on Llama 2 70B, through its GroqCloud platform ^[1]^[2]^[4]. In December 2025, Nvidia agreed to pay approximately $20 billion in cash, the largest transaction in its history, for a non-exclusive license to Groq's inference technology; Ross and most of Groq's senior leadership joined Nvidia, while Groq itself remained an independent company operating GroqCloud ^[14]^[15]. Groq is unrelated to Grok, the chatbot from Elon Musk's xAI, a similarity the company has publicly contested ^[22].

Overview

Groq's pitch inverted the usual accelerator playbook: instead of chasing peak training throughput, it optimized a chip and compiler stack for low-latency, predictable token generation. The company claims its LPU-based systems run large language models substantially faster and up to 10 times more energy-efficiently, on an architectural level, than GPU-based serving ^[4]^[6]. After pivoting from hardware sales to a tokens-as-a-service cloud business in 2024, Groq grew from roughly 356,000 registered developers in late 2024 to more than two million by September 2025, alongside Fortune 500 customers ^[6]^[7]. Its valuation rose in step: $2.8 billion in August 2024, $6.9 billion in September 2025, and an effective price near $20 billion in the Nvidia deal three months later ^[5]^[6]^[14]. "Inference is defining this era of AI," founder Jonathan Ross said when Groq raised $750 million in September 2025, "and we're building the American infrastructure that delivers it with high speed and low cost" ^[7].

When was Groq founded, and by whom?

Ross began the project that became Google's TPU as a 20 percent side effort and helped deploy it across Google's data centers before leaving in 2016 to start Groq with Douglas Wightman, a former Google X engineer ^[1]^[2]. Chamath Palihapitiya's Social Capital provided early backing of about $10 million, disclosed in 2017, and the startup spent its first years in relative stealth ^[1].

Groq's first chip, described in a 2020 paper at the ISCA computer architecture conference, was called the Tensor Streaming Processor (TSP); the company rebranded the architecture as the language processing unit after ChatGPT made LLM serving the dominant accelerator workload ^[1]^[2]. Along the way Groq acquired Maxeler Technologies, a dataflow computing firm, in March 2022, and announced in August 2023 that its next-generation chip would be fabricated on Samsung's 4 nm process at the foundry's Taylor, Texas plant ^[1].

The breakout came in early 2024, when public demos of Llama 2 70B generating more than 300 tokens per second per user went viral and the company soft-launched GroqCloud ^[3]^[4]. In March 2024 Groq acquired Definitive Intelligence, whose co-founder Sunny Madra went on to run GroqCloud and later became Groq's president ^[1]^[6].

How does the LPU architecture work?

The LPU departs from GPU design in two linked ways: determinism and memory. The chip is a single large core with a "functionally sliced" layout, in which memory units are interleaved with vector and matrix units, and it omits the speculative machinery of conventional processors: there are no branch predictors, caches, or reorder buffers ^[2]^[3]. Because the hardware's timing is fully predictable, Groq's compiler statically schedules every operation, memory access, and inter-chip packet down to the clock cycle, which the company argues removes the tail latency that batching and dynamic scheduling create on GPUs ^[3]^[4].

Instead of external high-bandwidth memory, each first-generation LPU carries 230 MB of on-chip SRAM as its primary weight storage, delivering on the order of 80 TB/s of internal bandwidth, far more than HBM-based GPU stacks ^[3]. The trade-off is capacity: a single chip holds only a fraction of a modern model's weights, so production deployments shard one model across hundreds of interconnected LPUs acting as a synchronized assembly line ^[3]^[21]. That makes individual answers very fast but requires large fleets per model. (Groq's chip and rack engineering is covered in more detail at Groq hardware.)

First-generation GroqChip (TSP)	Specification
Process node	GlobalFoundries 14 nm ^[2]
Die size / transistors	25 mm by 29 mm, 26.8 billion transistors ^[2]
Nominal clock	900 MHz ^[2]
Peak compute	820 teraoperations per second ^[2]
On-chip SRAM	230 MB at roughly 80 TB/s ^[3]
External DRAM/HBM	None ^[3]
Early benchmark	ResNet-50 at 20,400 images per second, batch size 1 ^[2]

On LLM workloads, Groq reported over 300 tokens per second on Llama 2 70B and about 480 tokens per second on Mixtral 8x7B in early 2024, several times faster than contemporary GPU endpoints ^[4]^[20]. A planned second-generation LPU moves from 14 nm to Samsung's 4 nm node ^[1].

What is GroqCloud?

GroqCloud, launched in February 2024, exposes LPU clusters through an OpenAI-compatible API and a self-serve console with free and paid tiers, selling inference as metered tokens rather than hardware ^[1]^[4]^[6]. The catalog focuses on open-weight models, including Meta's Llama family, Mistral's Mixtral, Google's Gemma, Whisper for speech recognition, and OpenAI's gpt-oss models ^[4]^[21].

The platform became the company's growth engine: its $640 million Series D was explicitly raised to expand tokens-as-a-service capacity ^[5]. By May 2025 Groq operated data centers across North America, Europe, and the Middle East, including new sites in Houston (with DataBank) and Dallas (with Equinix), and said its network could serve more than 20 million tokens per second ^[12]. TechCrunch reported more than two million developers on the platform as of September 2025 ^[6].

How much funding has Groq raised?

Date	Round	Amount	Valuation	Lead / notable investors
2017	Early venture	~$10M	n/a	Social Capital ^[1]
April 2021	Series C	$300M	>$1B	Tiger Global Management, D1 Capital Partners ^[1]
August 2024	Series D	$640M	$2.8B	BlackRock Private Equity Partners; Neuberger Berman, Type One Ventures, Cisco Investments, KDDI's Global Brain fund, Samsung Catalyst Fund ^[5]
September 2025	Growth round	$750M	$6.9B post-money	Disruptive; BlackRock, Neuberger Berman, Deutsche Telekom Capital Partners, Samsung, Cisco, D1, Altimeter, 1789 Capital, Infinitum ^[6]^[7]
May 2026	Internal round	up to $650M	undisclosed	Existing investors pro rata, backstopped by Disruptive and Infinitum ^[18]^[19]

Separately, the Kingdom of Saudi Arabia announced a $1.5 billion commitment at the LEAP 2025 conference in February 2025 to fund expanded delivery of Groq's inference infrastructure in the country ^[8].

Growth was not linear. In July 2025, The Information reported that Groq had cut its 2025 revenue projection from more than $2 billion to a bit over $500 million, citing delays in securing data center capacity, shortly after sharing the higher figure with investors ^[13].

What is the Saudi Arabia deal?

Groq's flagship international deployment is in Saudi Arabia. After signing a memorandum of understanding with Aramco Digital at LEAP 2024, the companies announced progress that September on what they described as the world's largest AI inferencing data center ^[9]. In December 2024, Groq airlifted racks to Dammam and brought a cluster of about 19,000 LPUs online in eight days, which the partners called the largest AI compute hub in the EMEA region and Groq's second GroqCloud region globally ^[10]. The February 2025 announcement of $1.5 billion in Saudi backing extended that buildout, which also supports the Saudi Data and Artificial Intelligence Authority's Arabic language model ALLaM ^[8]. In May 2025, HUMAIN, the AI holding company backed by Saudi Arabia's Public Investment Fund and chaired by Crown Prince Mohammed bin Salman, selected Groq as its inference provider ^[11].

In Canada, Groq became the exclusive inference provider for Bell Canada's "Bell AI Fabric" sovereign AI network, announced in May 2025: a planned six-site buildout targeting 500 MW of hydro-powered capacity, beginning with a 7 MW Groq facility in Kamloops, British Columbia ^[12].

What was the Nvidia licensing deal?

On December 24, 2025, CNBC reported that Nvidia would acquire assets from Groq for about $20 billion in cash, by far the largest deal in Nvidia's history and roughly three times Groq's September valuation ^[14]. Both companies framed the transaction as a non-exclusive licensing agreement for Groq's inference technology rather than an acquisition: Nvidia said it was "not an acquisition of the company," and Groq retained the right to keep using and licensing its own technology ^[15]^[16]. Ross, president Sunny Madra, and other senior leaders moved to Nvidia, while Groq said it would continue as an independent company with chief financial officer Simon Edwards stepping up as CEO and GroqCloud operating without interruption ^[15]^[16]. Analysts widely read the structure as an acqui-hire that removed a competitor while sidestepping merger review ^[16].

The aftermath reshaped the remaining company. Edwards departed in April 2026 to become CFO of Bloom Energy ^[17]. By late May 2026, Groq was led by interim CEO Adam Winter and interim CFO Matt Eng and was raising up to $650 million from existing investors, with Disruptive and Infinitum committed to backstopping the round, to fund a "second act" as an inference neocloud, a managed cloud built on its existing LPU fleet rather than on new chip development ^[18]^[19].

How does Groq compare to Cerebras and SambaNova?

Groq competed most directly with two other US inference-chip startups, Cerebras Systems, whose wafer-scale engines take the opposite approach of one enormous chip, and SambaNova Systems, whose reconfigurable dataflow units pair SRAM with HBM and DRAM to fit larger models per node, as well as with the Nvidia GPU clouds that dominate the market ^[20]. The three challengers fought a public "token war" on Artificial Analysis leaderboards: in late 2024 measurements on Llama 3.1 8B, Groq served about 750 tokens per second against SambaNova's 1,084 and Cerebras's 1,800, while Nvidia H100-based clouds ranged from 72 to 257; on Llama 3.1 70B the three were closely matched, with Groq at 544 tokens per second ^[20]. Cerebras has since claimed multi-fold speed advantages over Groq on newer models such as gpt-oss-120B, citing the same benchmarking firm ^[21].

Groq's counterarguments centered on cost per token, deterministic latency, and deployment speed, exemplified by the eight-day Dammam installation ^[10]. The December 2025 Nvidia license was widely interpreted as validation that the LPU's deterministic, SRAM-centric design mattered enough for the GPU incumbent to pay a record sum for it ^[14]^[16].

Why is Groq confused with Grok?

Groq's name, like that of xAI's Grok chatbot released in November 2023, derives from the verb "to grok" coined by science-fiction author Robert A. Heinlein. Groq registered its trademark when it was founded in 2016, and in November 2023 it published a tongue-in-cheek public cease-and-desist blog post titled "Hey Elon: It's Time To Cease & De-Grok," asserting its prior claim to the name ^[22]. The two companies remain unaffiliated, and no public resolution of the dispute has been reported.

References

Wikipedia. "Groq." https://en.wikipedia.org/wiki/Groq ↩
Abts, D. et al. "Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads." ISCA 2020, via Groq. https://groq.com/groq-isca-paper-2020/ ↩
Groq. "Inside the LPU: Deconstructing Groq's Speed." https://groq.com/blog/inside-the-lpu-deconstructing-groq-speed ↩
Groq. "What is a Language Processing Unit?" https://groq.com/blog/the-groq-lpu-explained ↩
PR Newswire. "Groq Raises $640M To Meet Soaring Demand for Fast AI Inference." August 5, 2024. https://www.prnewswire.com/news-releases/groq-raises-640m-to-meet-soaring-demand-for-fast-ai-inference-302214097.html ↩
TechCrunch. "Nvidia AI chip challenger Groq raises even more than expected, hits $6.9B valuation." September 17, 2025. https://techcrunch.com/2025/09/17/nvidia-ai-chip-challenger-groq-raises-even-more-than-expected-hits-6-9b-valuation/ ↩
Groq. "Groq Raises $750 Million as Inference Demand Surges." September 17, 2025. https://groq.com/newsroom/groq-raises-750-million-as-inference-demand-surges ↩
Data Center Dynamics. "Groq secures $1.5bn from Saudi Arabia to expand AI inference infrastructure in the region." February 2025. https://www.datacenterdynamics.com/en/news/groq-secures-15bn-from-saudi-arabia-to-expand-ai-inference-infrastructure-in-the-region/ ↩
PR Newswire. "Aramco Digital and Groq Announce Progress in Building the World's Largest Inferencing Data Center in Saudi Arabia Following LEAP MOU Signing." September 2024. https://www.prnewswire.com/news-releases/aramco-digital-and-groq-announce-progress-in-building-the-worlds-largest-inferencing-data-center-in-saudi-arabia-following-leap-mou-signing-302245875.html ↩
Middle East AI News. "Groq opens EMEA's largest AI compute centre in Saudi Arabia." February 2025. https://www.middleeastainews.com/p/groq-emea-largest-ai-compute-centre ↩
Semafor. "Saudi's new AI firm picks US chipmaker Groq for inference." May 13, 2025. https://www.semafor.com/article/05/13/2025/saudis-new-ai-firm-picks-us-chipmaker-groq-for-inference ↩
PR Newswire. "Groq Becomes Exclusive Inference Provider for Bell Canada's Sovereign AI Network." May 28, 2025. https://www.prnewswire.com/news-releases/groq-becomes-exclusive-inference-provider-for-bell-canadas-sovereign-ai-network-302467175.html ↩
Investing.com, citing The Information. "Groq slashes 2025 revenue projections to $500 million." July 2025. https://www.investing.com/news/company-news/groq-slashes-2025-revenue-projections-to-500-million--the-information-93CH-4158309 ↩
CNBC. "Nvidia buying AI chip startup Groq's assets for about $20 billion in its largest deal on record." December 24, 2025. https://www.cnbc.com/2025/12/24/nvidia-buying-ai-chip-startup-groq-for-about-20-billion-biggest-deal.html ↩
Groq. "Groq and Nvidia Enter Non-Exclusive Inference Technology Licensing Agreement to Accelerate AI Inference at Global Scale." December 2025. https://groq.com/newsroom/groq-and-nvidia-enter-non-exclusive-inference-technology-licensing-agreement-to-accelerate-ai-inference-at-global-scale ↩
Data Center Dynamics. "Nvidia to license tech from AI inference chip company Groq, hire its leadership." December 2025. https://www.datacenterdynamics.com/en/news/nvidia-to-license-tech-from-ai-inference-chip-company-groq-hire-its-leadership/ ↩
Bloom Energy. "Bloom Energy Appoints Simon Edwards as Chief Financial Officer." April 2026. https://www.bloomenergy.com/news/bloom-energy-appoints-simon-edwards-as-chief-financial-officer/ ↩
Axios. "Scoop: Groq raising $650 million for its second act." May 28, 2026. https://www.axios.com/2026/05/28/groq-650-million-nvidia ↩
TechCrunch. "After Nvidia's $20B not-acqui-hire, AI chip startup Groq reportedly raising $650M." May 29, 2026. https://techcrunch.com/2026/05/29/after-nvidias-20b-not-acqui-hire-ai-chip-startup-groq-reportedly-raising-650m/ ↩
EE Times. "'Token Wars' Heats Up As Cerebras and SambaNova Enter The Fray." 2024. https://www.eetimes.com/token-wars-heats-up-as-cerebras-and-sambanova-enter-the-fray/ ↩
Cerebras. "Cerebras CS-3 vs. Groq LPU." https://www.cerebras.ai/blog/cerebras-cs-3-vs-groq-lpu ↩
Techdirt. "Groq Sends Elon's 'Grok' A Cease & Desist, Though A Funny One." November 30, 2023. https://www.techdirt.com/2023/11/30/groq-sends-elons-grok-a-cease-desist-though-a-funny-one/ ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

2 revisions by 1 contributors · full history

Suggest edit

Groq

Overview

When was Groq founded, and by whom?

How does the LPU architecture work?

What is GroqCloud?

How much funding has Groq raised?

What is the Saudi Arabia deal?

What was the Nvidia licensing deal?

How does Groq compare to Cerebras and SambaNova?

Why is Groq confused with Grok?

References

Improve this article

What links here

What links here

Overview

When was Groq founded, and by whom?

How does the LPU architecture work?

What is GroqCloud?

How much funding has Groq raised?

What is the Saudi Arabia deal?

What was the Nvidia licensing deal?

How does Groq compare to Cerebras and SambaNova?

Why is Groq confused with Grok?

References

Improve this article

Related Articles

Nvidia

Groq

Rabbit R1

Humane AI Pin

SambaNova Systems

Samsung AI

What links here

Related Articles

Nvidia

Groq

Rabbit R1

Humane AI Pin

SambaNova Systems

Samsung AI

What links here