Gladia

Updated Jun 28, 2026

Gladia is a French artificial-intelligence company that builds audio infrastructure for developers and voice-product teams, centered on a speech-to-text (transcription) API. Founded in 2022 and headquartered in Paris (with a US presence in New York), Gladia offers asynchronous (batch) and real-time streaming transcription across more than 100 languages, plus translation, speaker diarization, and audio-intelligence features such as summarization, sentiment analysis, and named-entity extraction. The company first launched its API in 2023 on an optimized version of OpenAI's Whisper model, then shipped its own proprietary Solaria model line beginning with Solaria-1 in 2025 ^[1]^[7]^[8]. Gladia targets meeting recorders, contact centers, and AI voice agents, and competes with Deepgram, AssemblyAI, Speechmatics, ElevenLabs, and the speech APIs of the large cloud providers. In June 2026, OVH Groupe (the parent of European cloud provider OVHcloud) entered exclusive negotiations to acquire Gladia ^[15].

What is Gladia?

Gladia sells an audio-intelligence API rather than a consumer application: developers send audio (uploaded files or live streams) and receive structured transcripts plus optional analytics such as summaries, sentiment, and redaction. The company describes its goal as building "end-to-end audio infrastructure" for voice-first platforms, and it has raised roughly $20 million in disclosed venture funding across a seed round and a Series A ^[1]^[2]^[3]. By June 2026 Gladia reported more than 300,000 users (which OVHcloud described as developers) and over 2,000 enterprise customers, up from the 70,000 users and 600 enterprise customers it cited at its October 2024 Series A ^[3]^[15]. Its API supports both asynchronous and real-time transcription in over 100 languages and bundles audio-intelligence features into its core product.

History

Who founded Gladia?

Gladia was founded in 2022 by Jean-Louis Quéguiner (CEO) and Jonathan Soto (CTO) ^[2]^[3]. Before starting the company, Quéguiner was the group vice president for data, AI, and quantum computing at OVHcloud, one of Europe's largest cloud providers, where he led the company's machine-learning efforts. He holds a master's degree in symbolic AI from the Université du Québec in Canada and from Arts et Métiers ParisTech in Paris ^[10]. Soto is a former MIT engineer ^[1].

Quéguiner has said the company grew out of a personal frustration: existing transcription services struggled to understand his French accent, which he attributed to models trained predominantly on English-language audio ^[10]. In interviews he framed Gladia's mission as making advanced speech AI accessible to any developer and removing the language and accent biases common in earlier systems. The company name is styled in lowercase as "gladia" in some of its branding, and the legal entity operates as Gladia SAS.

How much funding has Gladia raised?

Gladia has raised roughly $20 million in disclosed funding: a $4 million seed round and a $16 million Series A. Its investors are mostly European and US venture firms. New Wave led the seed round with participation from Sequoia Capital, and the Franco-German fund XAnge led the Series A ^[1]^[3].

Round	Date	Amount	Lead investor	Other investors
Seed	June 2023	$4 million	New Wave	Sequoia Capital, Cocoa, GFC, and angels including Solomon Hykes, Pierre Betouin, Miroslaw Klaba, and Alexandre Berriche
Series A	October 2024	$16 million	XAnge	Illuminate Financial, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, Soma Capital, New Wave (returning)

The $4 million seed round was announced in June 2023, led by New Wave with participation from Sequoia Capital, Cocoa, and several technology angels, among them Docker founder Solomon Hykes ^[1]. The company used the round to expand its transcription API beyond raw speech-to-text into features like translation, summarization, and content categorization.

In October 2024, Gladia closed a $16 million (about €14.7 million) Series A led by XAnge, joined by a broad syndicate of European and American investors and by returning seed backer New Wave ^[3]^[6]. The company announced the round alongside the launch of its real-time transcription engine, framing real-time processing as "the next frontier" for audio APIs ^[2]. At the time of the Series A, Gladia reported more than 70,000 users and over 600 enterprise customers, and it told reporters it was operating with positive margins ^[2]^[3]. The round was its most recent disclosed financing before the 2026 OVH Groupe transaction.

Why did OVH Groupe move to acquire Gladia?

On June 11, 2026, OVH Groupe announced it had entered into exclusive negotiations to acquire Gladia, describing it as a "French AI startup specializing in speech-to-text technology" and "expert in voice AI" ^[15]. OVH Groupe is the parent company of OVHcloud, where Gladia CEO Jean-Louis Quéguiner had previously led data, AI, and quantum computing, making the deal a return to his former employer. OVHcloud said the acquisition would let it internalize speech-to-text technology and offer new voice AI services through OVHcloud and its OVHai platform, strengthening its work on "sovereign" generative, agentic, and multimodal AI. In the announcement OVHcloud described Gladia as founded in 2022 in Paris, serving more than 300,000 developers and 2,000 enterprise customers, with named users including HeyGen, Livestorm, Attention, Circleback, Method Financial, Recall.ai, and Leexi ^[15]. No deal value or terms were disclosed, and the transaction was structured as exclusive negotiations rather than a completed acquisition.

What does Gladia's API do?

Gladia's product is organized around two transcription modes (asynchronous and real-time) and a set of audio-intelligence add-ons.

Asynchronous transcription

The original product is an asynchronous (batch) speech-to-text API that transcribes uploaded audio and video files. At launch in 2023 it could process roughly an hour of audio in about 60 seconds, which the company contrasted with competing APIs that it said could take more than 15 minutes for the same file ^[1]. Output is available in JSON, SRT, and VTT formats, and the API supports automatic punctuation and casing, language detection, timestamps, and speaker diarization (separating and labeling different speakers).
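As an illustration of the subtitle output mentioned above, the following minimal sketch renders timestamped transcript segments as SRT (SubRip) cues. The `Segment` class and its field names are hypothetical and chosen for this example; they are not Gladia's response schema, and the company's own SRT export may differ in details such as line wrapping.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcript segment (not Gladia's actual schema)."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time offset as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT cues: index, time range, text, blank line."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{time_range}\n{seg.text}\n")
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 1.25, "Hello and welcome."),
        Segment(1.4, 3.0, "Let's get started."),
    ]
    print(to_srt(demo))
```

VTT output follows the same cue structure but starts with a `WEBVTT` header line and uses a period rather than a comma before the milliseconds.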

Real-time transcription

Gladia introduced a real-time streaming transcription product alongside its Series A in October 2024 ^[2]. The real-time engine targets latency under 300 milliseconds and is aimed at live use cases such as voice agents, contact-center assistance, and live captioning ^[3]. It is designed to integrate with telephony and voice stacks, with the company citing compatibility with SIP, VoIP, FreeSWITCH, and Asterisk. The company's pitch is to deliver "batch quality with real-time capabilities," closing the accuracy gap that has historically existed between streaming and offline transcription.

Quéguiner has described the customer behavior that drove the real-time product: "We realized that real time wasn't very good in terms of quality in the market in general. And people had a weird use case. They were doing real-time processing, and then they were grabbing the audio and running it in batch. We wondered: 'Why are you doing this?' They told us: 'The quality isn't good in real-time processing, so we transcribe it in batch afterwards'" ^[2].

Audio intelligence

On top of transcription, Gladia offers what it calls audio intelligence: generative-AI features that turn a transcript into structured insight. These include summarization, sentiment analysis, key-information and named-entity recognition, translation across supported languages, and PII (personally identifiable information) redaction. The features are aimed at contact centers and sales teams that want to extract data such as names, addresses, and account numbers from calls, and at meeting tools that want automatic notes and action items.

How does Gladia's technology work?

From Whisper to in-house models

Gladia's first API was built on OpenAI's open-source Whisper model, which the team modified to run faster and to reduce the "hallucinations" the base model is prone to ^[1]. Quéguiner gave a concrete example of the problem: Whisper, trained heavily on web video, would sometimes invent phrases like "if you enjoyed this video, please like and subscribe" when transcribing silence or noise. Gladia's engineering focused on inference speed and on suppressing these fabricated outputs while keeping the multilingual coverage that made Whisper attractive.

Over time the company moved from an optimized-Whisper approach toward its own proprietary models, which it markets as offering domain-specific terminology handling and contextual accuracy that a general-purpose open model does not provide out of the box. This shift culminated in the Solaria model line.

What is Solaria-1?

In April 2025, Gladia launched Solaria-1, a multilingual speech-to-text model it described as a next-generation system for global, real-time voice applications ^[7]^[8]. Solaria supports 100 languages, including a number of high-population and regional languages that the company says are underserved by other models, such as Tagalog, Bengali, Punjabi, Tamil, Urdu, Persian, Marathi, and Hebrew, as well as smaller languages like Haitian Creole, Maori, Javanese, and Malagasy ^[8].

Solaria-1 metric	Figure (company-reported)
Languages supported	100
Languages the company says are unique to Gladia	42
English word accuracy rate (WAR)	~94%
Average response latency	~270 ms
Final transcription latency	~698 ms
Conversational WER improvement vs. alternatives	~29% lower
Diarization error rate (DER) vs. alternatives	~3x lower

Gladia's published benchmarks claim Solaria delivers, on average, about 29% lower word error rate (WER) on conversational speech and roughly 3x lower diarization error rate (DER) than competing systems, drawing on a test set the company describes as 74+ hours of audio across seven datasets ^[8]. The company reports an English word accuracy rate near 94% and an average response latency around 270 milliseconds, and it compares these figures favorably against Deepgram and Speechmatics ^[7]^[8]. As with most vendor benchmarks, these numbers are self-reported and reflect Gladia's own evaluation methodology; real-world accuracy varies with audio quality, accents, overlapping speakers, and domain jargon.
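For readers unfamiliar with the metrics, word error rate is conventionally the word-level edit distance between a reference transcript and the system output, divided by the number of reference words, and word accuracy rate is commonly taken as one minus WER. The sketch below implements that textbook definition; it is not Gladia's evaluation code, and it assumes whitespace tokenization with no text normalization, whereas published benchmarks typically normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion (reference word missing)
                current[j - 1] + 1,      # insertion (extra hypothesis word)
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


def word_accuracy_rate(reference: str, hypothesis: str) -> float:
    """WAR as 1 - WER; can fall below zero when insertions are numerous."""
    return 1.0 - word_error_rate(reference, hypothesis)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
    print(f"WAR: {word_accuracy_rate(ref, hyp):.3f}")
```

Under this definition, the reported English word accuracy rate near 94% corresponds to a WER of roughly 6%.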

Solaria adds real-time code-switching (handling multiple languages within a single conversation), real-time translation, custom vocabulary, and named-entity recognition. Gladia announced an integration with the open-source voice framework LiveKit to let developers build multilingual voice agents on top of the model.

What is Solaria-3?

On June 10, 2026, Gladia released Solaria-3, a speech-to-text model tuned for business audio in five core European languages: English, French, German, Spanish, and Italian ^[16]^[17]. Where Solaria-1 emphasized broad multilingual coverage and clean, read-style speech, Solaria-3 is optimized for noisy, real-world production audio such as calls and meetings. Maxime Gaudin, whom Gladia identified as the company's CTO, said: "Solaria-3 is a genuine architectural leap, not an incremental update" ^[16].

Gladia reported the following word-error-rate (WER) benchmarks for Solaria-3, where lower is better ^[17]:

Benchmark (Solaria-3, company-reported)	WER
Earnings22 (cleaned)	6.4%
Internal English production dataset	9.6%
Noisy audio	1.4%
Multilingual LibriSpeech	8.0%
VoxPopuli (cleaned)	2.9%
Switchboard	33.9%

On the Earnings22 business-audio benchmark, Gladia reported Solaria-3 at 6.4% WER ahead of AssemblyAI (6.9%), ElevenLabs (7.7%), Speechmatics (7.8%), Mistral (7.9%), and Deepgram (12.0%) ^[17]. The company said Solaria-3 cut WER on its internal English production data by about 26% versus Solaria-1 (9.6% vs. 12.9%), while noting that Solaria-1 remains stronger on clean read-speech benchmarks like Multilingual LibriSpeech ^[17]. As with the Solaria-1 figures, these benchmarks are self-reported by Gladia and use its own methodology.
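The "about 26%" figure is a relative reduction, not a difference in percentage points. A short check using the two company-reported WER values quoted above (the function name is chosen for this example):

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


# Company-reported WER on Gladia's internal English production dataset.
solaria_1_wer = 12.9
solaria_3_wer = 9.6

relative = relative_reduction(solaria_1_wer, solaria_3_wer)
absolute = solaria_1_wer - solaria_3_wer

print(f"Relative reduction: {relative:.1f}%")
print(f"Absolute reduction: {absolute:.1f} percentage points")
```

The relative reduction works out to about 25.6%, which rounds to the 26% Gladia cites, while the absolute gap is 3.3 percentage points.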

Positioning and competition

Gladia operates in the speech-to-text API market alongside several well-funded competitors. Its most direct rivals are developer-focused transcription companies: Deepgram and AssemblyAI in the United States, and Speechmatics in the United Kingdom. It also competes indirectly with the speech APIs of Amazon, Microsoft, and Google, and with OpenAI's Whisper (both the open model and OpenAI's hosted transcription endpoints), as well as adjacent voice players such as ElevenLabs.

Gladia's differentiation emphasizes multilingual breadth and accent robustness, low-latency real-time streaming, and bundling audio-intelligence features into its base pricing rather than charging separately for them. The company has historically competed on price as well: at the 2023 launch it quoted roughly $0.61 per hour of audio, against the $1.50 to $2 per hour it said competitors charged at the time ^[1]. In benchmark comparisons published by various parties, Deepgram is typically cited as a leader in low-latency real-time streaming and AssemblyAI in transcript intelligence (via its LeMUR framework for applying large language models to transcripts), while Gladia leans on conversational and multilingual accuracy as its core claim.

Adoption

Gladia's API became publicly available in the summer of 2023. By the time of its October 2024 Series A the company reported more than 70,000 users and over 600 enterprise customers, and by the April 2025 Solaria launch it cited more than 700 customers ^[2]^[3]^[7]. By June 2026, in connection with the OVH Groupe transaction, the company reported more than 300,000 users and 2,000 enterprise customers ^[15]. Named customers and partners across its announcements have included Attention, Circleback, Coconote, Method Financial, Recall (Recall.ai), Sana (Sana Labs), Ausha, VEED.IO, Claap, Livestorm, Selectra, HeyGen, and Leexi. These span meeting-assistant apps, podcast and media tools, sales-enablement products, and developer platforms that embed transcription into their own offerings.

The company lists GDPR and HIPAA compliance alongside SOC 2 Type II and ISO 27001 certifications, reflecting its focus on regulated sectors such as healthcare, finance, and contact centers.

References

1. "Gladia turns any audio into text in near real time." TechCrunch, June 19, 2023. https://techcrunch.com/2023/06/19/gladia-turns-any-audio-into-text-in-near-real-time/
2. "Gladia believes real-time processing is the next frontier of audio transcription APIs." TechCrunch, October 15, 2024. https://techcrunch.com/2024/10/15/gladia-believes-real-time-processing-is-the-next-frontier-of-audio-transcription-apis/
3. "Gladia Raises $16 Million in Series A Funding: Launches the First Multilingual Real-Time Audio Transcription and Analytics Engine." PR Newswire, October 15, 2024. https://www.prnewswire.com/news-releases/gladia-raises-16-million-in-series-a-funding-launches-the-first-multilingual-real-time-audio-transcription-and-analytics-engine-302275501.html
4. "Multilingual AI: Paris-based Gladia raises $16m to make AI understand accents." Sifted, October 2024. https://sifted.eu/articles/gladia-raise-ai-france-news
5. "Venture Capital Firms Pile Into AI Transcription as Gladia Raises USD 16M in Series A." Slator, October 2024. https://slator.com/venture-capital-firms-pile-into-ai-transcription-gladia-raises-usd-16m-series-a/
6. "Paris-based Gladia raises 14.7 million euro to launch multilingual real-time audio transcription and analytics engine." EU-Startups, October 2024. https://www.eu-startups.com/2024/10/paris-based-gladia-raises-e14-7-million-to-launch-multilingual-real-time-audio-transcription-and-analytics-engine/
7. "French startup Gladia launches next-generation multilingual speech-to-text AI model Solaria." SiliconANGLE, April 2, 2025. https://siliconangle.com/2025/04/02/french-startup-gladia-launches-next-generation-multilingual-speech-text-ai-model-solaria/
8. "Introducing Solaria, the first truly universal speech-to-text model." Gladia Blog, April 2, 2025. https://www.gladia.io/blog/introducing-solaria-the-first-truly-universal-speech-to-text-model
9. "Gladia Launches Solaria, the First Fully Multilingual, Next-Generation Speech-to-Text Model for Global Scalability." PR Newswire, April 2, 2025. https://www.prnewswire.com/news-releases/gladia-launches-solaria-the-first-fully-multilingual-next-generation-speech-to-text-model-for-global-scalability-302417497.html
10. "Jean-Louis Quéguiner, Founder & CEO of Gladia, Interview Series." Unite.AI. https://www.unite.ai/jean-louis-queguiner-founder-ceo-of-gladia-interview-series/
11. "About Us." Gladia. https://www.gladia.io/about-us
12. "Our Road to Real-Time Audio AI, with $16M in Series A Funding." Gladia Blog, October 2024. https://www.gladia.io/blog/road-to-real-time-audio-ai-series-a
13. "Gide advises XTX Ventures in the $16 million fundraising of Gladia." Gide Loyrette Nouel, 2024. https://www.gide.com/en/news/gide-advises-xtx-ventures-in-the-16-million-fundraising-of-gladia
14. "Speech-to-Text Benchmarks, Real Data, Open Methodology." Gladia. https://www.gladia.io/competitors/benchmarks
15. "OVH Groupe enters into exclusive negotiations to acquire Gladia, expert in voice AI." OVHcloud Newsroom, June 11, 2026. https://corporate.ovhcloud.com/en/newsroom/news/ovhgroupe-gladia-exclusive-negotiations-voice-ai/
16. "Gladia Launches Solaria-3, Its Most Accurate Speech-to-Text Model for Business Audio in Core European Languages." Greatreporter, June 10, 2026. https://greatreporter.com/2026/06/10/gladia-launches-solaria-3-its-most-accurate-speech-to-text-model-for-business-audio-in-core-european-languages/
17. "Introducing Solaria-3: The most accurate speech-to-text model for European languages." Gladia Blog, June 10, 2026. https://www.gladia.io/blog/solaria-3-speech-to-text-model-for-european-languages
