# Groq

> Source: https://aiwiki.ai/wiki/groq
> Updated: 2026-06-21
> Categories: AI Companies, AI Hardware
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Groq, Inc.** is an American semiconductor and cloud computing company, headquartered in Mountain View, California, that builds processors and data center infrastructure for [AI inference](/wiki/inference), the serving of trained models rather than their training. Founded in 2016 by [Jonathan Ross](/wiki/jonathan_ross), one of the creators of [Google](/wiki/google)'s [tensor processing unit](/wiki/tensor_processing_unit) (TPU), Groq designed the language processing unit (LPU), a deterministic, SRAM-only chip that Groq calls "a new category of processor" built "from the ground up to meet the unique needs of AI" [4]. The LPU became known in 2024 for serving [large language models](/wiki/large_language_model) at record speeds, more than 300 tokens per second per user on [Llama 2](/wiki/llama_2) 70B, through its GroqCloud platform [1][2][4]. In December 2025, [Nvidia](/wiki/nvidia) agreed to pay approximately $20 billion in cash, the largest transaction in its history, for a non-exclusive license to Groq's inference technology; Ross and most of Groq's senior leadership joined Nvidia, while Groq itself remained an independent company operating GroqCloud [14][15]. Groq is unrelated to [Grok](/wiki/grok), the chatbot from Elon Musk's [xAI](/wiki/xai), a similarity the company has publicly contested [22].

## Overview

Groq's pitch inverted the usual accelerator playbook: instead of chasing peak training throughput, it optimized a chip and compiler stack for low-latency, predictable token generation. The company claims its LPU-based systems run large language models substantially faster and up to 10 times more energy-efficiently, on an architectural level, than [GPU](/wiki/gpu)-based serving [4][6]. After pivoting from hardware sales to a tokens-as-a-service cloud business in 2024, Groq grew from roughly 356,000 registered developers in late 2024 to more than two million by September 2025, alongside Fortune 500 customers [6][7]. Its valuation rose in step: $2.8 billion in August 2024, $6.9 billion in September 2025, and an effective price near $20 billion in the Nvidia deal three months later [5][6][14]. "Inference is defining this era of AI," founder Jonathan Ross said when Groq raised $750 million in September 2025, "and we're building the American infrastructure that delivers it with high speed and low cost" [7].

## When was Groq founded, and by whom?

Ross began the project that became Google's TPU as a 20 percent side effort and helped deploy it across Google's data centers before leaving in 2016 to start Groq with Douglas Wightman, a former Google X engineer [1][2]. Chamath Palihapitiya's Social Capital provided early backing of about $10 million, disclosed in 2017, and the startup spent its first years in relative stealth [1].

Groq's first chip, described in a 2020 paper at the ISCA computer architecture conference, was called the Tensor Streaming Processor (TSP); the company rebranded the architecture as the language processing unit after [ChatGPT](/wiki/chatgpt) made LLM serving the dominant accelerator workload [1][2]. Along the way Groq acquired Maxeler Technologies, a dataflow computing firm, in March 2022, and announced in August 2023 that its next-generation chip would be fabricated on Samsung's 4 nm process at the foundry's Taylor, Texas plant [1].

The breakout came in early 2024, when public demos of Llama 2 70B generating more than 300 tokens per second per user went viral and the company soft-launched GroqCloud [3][4]. In March 2024 Groq acquired Definitive Intelligence, whose co-founder Sunny Madra went on to run GroqCloud and later became Groq's president [1][6].

## How does the LPU architecture work?

The LPU departs from GPU design in two linked ways: determinism and memory. The chip is a single large core with a "functionally sliced" layout, in which memory units are interleaved with vector and matrix units, and it omits the speculative machinery of conventional processors: there are no branch predictors, caches, or reorder buffers [2][3]. Because the hardware's timing is fully predictable, Groq's compiler statically schedules every operation, memory access, and inter-chip packet down to the clock cycle, which the company argues removes the tail latency that batching and dynamic scheduling create on GPUs [3][4].

Instead of external [high-bandwidth memory](/wiki/high_bandwidth_memory), each first-generation LPU carries 230 MB of on-chip SRAM as its primary weight storage, delivering on the order of 80 TB/s of internal bandwidth, far more than HBM-based GPU stacks [3]. The trade-off is capacity: a single chip holds only a fraction of a modern model's weights, so production deployments shard one model across hundreds of interconnected LPUs acting as a synchronized assembly line [3][21]. That makes individual answers very fast but requires large fleets per model. (Groq's chip and rack engineering is covered in more detail at [Groq hardware](/wiki/groq_hardware).)

| First-generation GroqChip (TSP) | Specification |
|---|---|
| Process node | GlobalFoundries 14 nm [2] |
| Die size / transistors | 25 mm by 29 mm, 26.8 billion transistors [2] |
| Nominal clock | 900 MHz [2] |
| Peak compute | 820 teraoperations per second [2] |
| On-chip SRAM | 230 MB at roughly 80 TB/s [3] |
| External DRAM/HBM | None [3] |
| Early benchmark | ResNet-50 at 20,400 images per second, batch size 1 [2] |

On LLM workloads, Groq reported over 300 tokens per second on Llama 2 70B and about 480 tokens per second on [Mixtral 8x7B](/wiki/mixtral_8x7b) in early 2024, several times faster than contemporary GPU endpoints [4][20]. A planned second-generation LPU moves from 14 nm to Samsung's 4 nm node [1].

## What is GroqCloud?

GroqCloud, launched in February 2024, exposes LPU clusters through an OpenAI-compatible API and a self-serve console with free and paid tiers, selling inference as metered tokens rather than hardware [1][4][6]. The catalog focuses on open-weight models, including Meta's Llama family, Mistral's Mixtral, Google's [Gemma](/wiki/gemma), [Whisper](/wiki/whisper) for speech recognition, and OpenAI's [gpt-oss](/wiki/gpt_oss) models [4][21].

The platform became the company's growth engine: its $640 million Series D was explicitly raised to expand tokens-as-a-service capacity [5]. By May 2025 Groq operated data centers across North America, Europe, and the Middle East, including new sites in Houston (with DataBank) and Dallas (with Equinix), and said its network could serve more than 20 million tokens per second [12]. TechCrunch reported more than two million developers on the platform as of September 2025 [6].

## How much funding has Groq raised?

| Date | Round | Amount | Valuation | Lead / notable investors |
|---|---|---|---|---|
| 2017 | Early venture | ~$10M | n/a | Social Capital [1] |
| April 2021 | Series C | $300M | >$1B | Tiger Global Management, D1 Capital Partners [1] |
| August 2024 | Series D | $640M | $2.8B | BlackRock Private Equity Partners; Neuberger Berman, Type One Ventures, Cisco Investments, KDDI's Global Brain fund, Samsung Catalyst Fund [5] |
| September 2025 | Growth round | $750M | $6.9B post-money | Disruptive; BlackRock, Neuberger Berman, Deutsche Telekom Capital Partners, Samsung, Cisco, D1, Altimeter, 1789 Capital, Infinitum [6][7] |
| May 2026 | Internal round | up to $650M | undisclosed | Existing investors pro rata, backstopped by Disruptive and Infinitum [18][19] |

Separately, the Kingdom of Saudi Arabia announced a $1.5 billion commitment at the LEAP 2025 conference in February 2025 to fund expanded delivery of Groq's inference infrastructure in the country [8].

Growth was not linear. In July 2025, The Information reported that Groq had cut its 2025 revenue projection from more than $2 billion to a bit over $500 million, citing delays in securing data center capacity, shortly after sharing the higher figure with investors [13].

## What is the Saudi Arabia deal?

Groq's flagship international deployment is in Saudi Arabia. After signing a memorandum of understanding with Aramco Digital at LEAP 2024, the companies announced progress that September on what they described as the world's largest AI inferencing data center [9]. In December 2024, Groq airlifted racks to Dammam and brought a cluster of about 19,000 LPUs online in eight days, which the partners called the largest AI compute hub in the EMEA region and Groq's second GroqCloud region globally [10]. The February 2025 announcement of $1.5 billion in Saudi backing extended that buildout, which also supports the Saudi Data and Artificial Intelligence Authority's Arabic language model ALLaM [8]. In May 2025, [HUMAIN](/wiki/humain), the AI holding company backed by Saudi Arabia's Public Investment Fund and chaired by Crown Prince Mohammed bin Salman, selected Groq as its inference provider [11].

In Canada, Groq became the exclusive inference provider for Bell Canada's "Bell AI Fabric" sovereign AI network, announced in May 2025: a planned six-site buildout targeting 500 MW of hydro-powered capacity, beginning with a 7 MW Groq facility in Kamloops, British Columbia [12].

## What was the Nvidia licensing deal?

On December 24, 2025, CNBC reported that Nvidia would acquire assets from Groq for about $20 billion in cash, by far the largest deal in Nvidia's history and roughly three times Groq's September valuation [14]. Both companies framed the transaction as a non-exclusive licensing agreement for Groq's inference technology rather than an acquisition: Nvidia said it was "not an acquisition of the company," and Groq retained the right to keep using and licensing its own technology [15][16]. Ross, president Sunny Madra, and other senior leaders moved to Nvidia, while Groq said it would continue as an independent company with chief financial officer Simon Edwards stepping up as CEO and GroqCloud operating without interruption [15][16]. Analysts widely read the structure as an acqui-hire that removed a competitor while sidestepping merger review [16].

The aftermath reshaped the remaining company. Edwards departed in April 2026 to become CFO of Bloom Energy [17]. By late May 2026, Groq was led by interim CEO Adam Winter and interim CFO Matt Eng and was raising up to $650 million from existing investors, with Disruptive and Infinitum committed to backstopping the round, to fund a "second act" as an inference neocloud, a managed cloud built on its existing LPU fleet rather than on new chip development [18][19].

## How does Groq compare to Cerebras and SambaNova?

Groq competed most directly with two other US inference-chip startups, [Cerebras](/wiki/cerebras) Systems, whose wafer-scale engines take the opposite approach of one enormous chip, and [SambaNova](/wiki/sambanova) Systems, whose reconfigurable dataflow units pair SRAM with HBM and DRAM to fit larger models per node, as well as with the Nvidia GPU clouds that dominate the market [20]. The three challengers fought a public "token war" on [Artificial Analysis](/wiki/artificial_analysis) leaderboards: in late 2024 measurements on Llama 3.1 8B, Groq served about 750 tokens per second against SambaNova's 1,084 and Cerebras's 1,800, while Nvidia H100-based clouds ranged from 72 to 257; on Llama 3.1 70B the three were closely matched, with Groq at 544 tokens per second [20]. Cerebras has since claimed multi-fold speed advantages over Groq on newer models such as gpt-oss-120B, citing the same benchmarking firm [21].

Groq's counterarguments centered on cost per token, deterministic latency, and deployment speed, exemplified by the eight-day Dammam installation [10]. The December 2025 Nvidia license was widely interpreted as validation that the LPU's deterministic, SRAM-centric design mattered enough for the GPU incumbent to pay a record sum for it [14][16].

## Why is Groq confused with Grok?

Groq's name, like that of xAI's Grok chatbot released in November 2023, derives from the verb "to grok" coined by science-fiction author Robert A. Heinlein. Groq registered its trademark when it was founded in 2016, and in November 2023 it published a tongue-in-cheek public cease-and-desist blog post titled "Hey Elon: It's Time To Cease & De-Grok," asserting its prior claim to the name [22]. The two companies remain unaffiliated, and no public resolution of the dispute has been reported.

## References

1. Wikipedia. "Groq." https://en.wikipedia.org/wiki/Groq
2. Abts, D. et al. "Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads." ISCA 2020, via Groq. https://groq.com/groq-isca-paper-2020/
3. Groq. "Inside the LPU: Deconstructing Groq's Speed." https://groq.com/blog/inside-the-lpu-deconstructing-groq-speed
4. Groq. "What is a Language Processing Unit?" https://groq.com/blog/the-groq-lpu-explained
5. PR Newswire. "Groq Raises $640M To Meet Soaring Demand for Fast AI Inference." August 5, 2024. https://www.prnewswire.com/news-releases/groq-raises-640m-to-meet-soaring-demand-for-fast-ai-inference-302214097.html
6. TechCrunch. "Nvidia AI chip challenger Groq raises even more than expected, hits $6.9B valuation." September 17, 2025. https://techcrunch.com/2025/09/17/nvidia-ai-chip-challenger-groq-raises-even-more-than-expected-hits-6-9b-valuation/
7. Groq. "Groq Raises $750 Million as Inference Demand Surges." September 17, 2025. https://groq.com/newsroom/groq-raises-750-million-as-inference-demand-surges
8. Data Center Dynamics. "Groq secures $1.5bn from Saudi Arabia to expand AI inference infrastructure in the region." February 2025. https://www.datacenterdynamics.com/en/news/groq-secures-15bn-from-saudi-arabia-to-expand-ai-inference-infrastructure-in-the-region/
9. PR Newswire. "Aramco Digital and Groq Announce Progress in Building the World's Largest Inferencing Data Center in Saudi Arabia Following LEAP MOU Signing." September 2024. https://www.prnewswire.com/news-releases/aramco-digital-and-groq-announce-progress-in-building-the-worlds-largest-inferencing-data-center-in-saudi-arabia-following-leap-mou-signing-302245875.html
10. Middle East AI News. "Groq opens EMEA's largest AI compute centre in Saudi Arabia." February 2025. https://www.middleeastainews.com/p/groq-emea-largest-ai-compute-centre
11. Semafor. "Saudi's new AI firm picks US chipmaker Groq for inference." May 13, 2025. https://www.semafor.com/article/05/13/2025/saudis-new-ai-firm-picks-us-chipmaker-groq-for-inference
12. PR Newswire. "Groq Becomes Exclusive Inference Provider for Bell Canada's Sovereign AI Network." May 28, 2025. https://www.prnewswire.com/news-releases/groq-becomes-exclusive-inference-provider-for-bell-canadas-sovereign-ai-network-302467175.html
13. Investing.com, citing The Information. "Groq slashes 2025 revenue projections to $500 million." July 2025. https://www.investing.com/news/company-news/groq-slashes-2025-revenue-projections-to-500-million--the-information-93CH-4158309
14. CNBC. "Nvidia buying AI chip startup Groq's assets for about $20 billion in its largest deal on record." December 24, 2025. https://www.cnbc.com/2025/12/24/nvidia-buying-ai-chip-startup-groq-for-about-20-billion-biggest-deal.html
15. Groq. "Groq and Nvidia Enter Non-Exclusive Inference Technology Licensing Agreement to Accelerate AI Inference at Global Scale." December 2025. https://groq.com/newsroom/groq-and-nvidia-enter-non-exclusive-inference-technology-licensing-agreement-to-accelerate-ai-inference-at-global-scale
16. Data Center Dynamics. "Nvidia to license tech from AI inference chip company Groq, hire its leadership." December 2025. https://www.datacenterdynamics.com/en/news/nvidia-to-license-tech-from-ai-inference-chip-company-groq-hire-its-leadership/
17. Bloom Energy. "Bloom Energy Appoints Simon Edwards as Chief Financial Officer." April 2026. https://www.bloomenergy.com/news/bloom-energy-appoints-simon-edwards-as-chief-financial-officer/
18. Axios. "Scoop: Groq raising $650 million for its second act." May 28, 2026. https://www.axios.com/2026/05/28/groq-650-million-nvidia
19. TechCrunch. "After Nvidia's $20B not-acqui-hire, AI chip startup Groq reportedly raising $650M." May 29, 2026. https://techcrunch.com/2026/05/29/after-nvidias-20b-not-acqui-hire-ai-chip-startup-groq-reportedly-raising-650m/
20. EE Times. "'Token Wars' Heats Up As Cerebras and SambaNova Enter The Fray." 2024. https://www.eetimes.com/token-wars-heats-up-as-cerebras-and-sambanova-enter-the-fray/
21. Cerebras. "Cerebras CS-3 vs. Groq LPU." https://www.cerebras.ai/blog/cerebras-cs-3-vs-groq-lpu
22. Techdirt. "Groq Sends Elon's 'Grok' A Cease & Desist, Though A Funny One." November 30, 2023. https://www.techdirt.com/2023/11/30/groq-sends-elons-grok-a-cease-desist-though-a-funny-one/