Gladia
Last reviewed
Jun 4, 2026
Sources
14 citations
Review status
Source-backed
Revision
v1 · 1,787 words
Gladia is a French artificial-intelligence company that builds audio infrastructure for developers and voice-product teams, centered on a speech-to-text (transcription) API. Founded in 2022 and headquartered in Paris, the company offers both asynchronous (batch) and real-time streaming transcription across more than 100 languages, along with translation, speaker diarization, and audio-intelligence features such as summarization, sentiment analysis, and named-entity extraction. Gladia's API first launched on top of an optimized version of OpenAI's Whisper model in 2023, and the company has since shipped its own proprietary models, including the multilingual Solaria-1 system released in 2025. Gladia positions its product for use cases like meeting recorders, contact centers, and AI voice agents, and it competes with Deepgram, AssemblyAI, Speechmatics, and the speech APIs of the large cloud providers.
Gladia was founded in 2022 by Jean-Louis Quéguiner (CEO) and Jonathan Soto (CTO). Before starting the company, Quéguiner was the group vice president for data, AI, and quantum computing at OVHcloud, one of Europe's largest cloud providers, where he led the company's machine-learning efforts. He holds a master's degree in symbolic AI from the Université du Québec in Canada and from Arts et Métiers ParisTech in Paris.
Quéguiner has said the company grew out of a personal frustration: existing transcription services struggled to understand his French accent, which he attributed to models trained predominantly on English-language audio. In interviews he framed Gladia's mission as making advanced speech AI accessible to any developer and removing the language and accent biases common in earlier systems. The company name is styled in lowercase as "gladia" in some of its branding, and the legal entity operates as Gladia SAS.
Gladia has raised roughly $20.3 million in disclosed funding across a seed round and a Series A. Its investor base skews toward European and US venture firms: New Wave led the seed round with Sequoia Capital participating, and the Franco-German fund XAnge led the Series A.
| Round | Date | Amount | Lead investor | Other investors |
|---|---|---|---|---|
| Seed | June 2023 | $4 million | New Wave | Sequoia Capital, Cocoa, GFC, and angels including Solomon Hykes, Pierre Betouin, Miroslaw Klaba, and Alexandre Berriche |
| Series A | October 2024 | $16 million | XAnge | Illuminate Financial, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, Soma Capital, New Wave (returning) |
The $4 million seed round was announced in June 2023, led by New Wave with participation from Sequoia Capital, Cocoa, and several technology angels, among them Docker founder Solomon Hykes. The company used the round to expand its transcription API beyond raw speech-to-text into features like translation, summarization, and content categorization.
In October 2024, Gladia closed a $16 million (about €14.7 million) Series A led by XAnge, joined by a broad syndicate of European and American investors and by returning seed backer New Wave. The company announced the round alongside the launch of its real-time transcription engine, framing real-time processing as "the next frontier" for audio APIs. At the time of the Series A, Gladia reported more than 70,000 users and over 600 enterprise customers, and it told reporters it was operating with positive margins. As of mid-2025 the company had not publicly announced a later round.
Gladia sells an API rather than a consumer application. Developers send audio (uploaded files or live streams) and receive structured transcripts plus optional analytics. The product is organized around two transcription modes and a set of audio-intelligence add-ons.
The original product is an asynchronous (batch) speech-to-text API that transcribes uploaded audio and video files. At launch in 2023 it could process roughly an hour of audio in about 60 seconds, which the company contrasted with competing APIs that it said could take more than 15 minutes for the same file. Output is available in JSON, SRT, and VTT formats, and the API supports automatic punctuation and casing, language detection, timestamps, and speaker diarization (separating and labeling different speakers).
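To make the relationship between timestamped, diarized output and the SRT subtitle format concrete, here is a minimal sketch in Python. It does not call Gladia's API or reproduce its response schema; the `Segment` class and its field names are hypothetical stand-ins for whatever structure a transcription service returns. Only the SRT layout itself (a running index, an `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, then the caption text) follows the standard format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcript segment; field names are illustrative only."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: int   # speaker label as produced by diarization
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT cues, prefixing each caption with its speaker."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"Speaker {seg.speaker}: {seg.text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 1.25, 0, "Hello, thanks for calling."),
        Segment(1.25, 2.0, 1, "Hi, I have a billing question."),
    ]
    print(to_srt(demo))
```

VTT differs mainly in its `WEBVTT` header line and its use of a period rather than a comma before the milliseconds, so the same segment data could feed either format.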
Gladia introduced a real-time streaming transcription product alongside its Series A in October 2024. The real-time engine targets latency under 300 milliseconds and is aimed at live use cases such as voice agents, contact-center assistance, and live captioning. It is designed to integrate with telephony and voice stacks, with the company citing compatibility with SIP, VoIP, FreeSWITCH, and Asterisk. The company's pitch is to deliver "batch quality with real-time capabilities," closing the accuracy gap that has historically existed between streaming and offline transcription.
On top of transcription, Gladia offers what it calls audio intelligence: generative-AI features that turn a transcript into structured insight. These include summarization, sentiment analysis, key-information and named-entity recognition, translation across supported languages, and PII (personally identifiable information) redaction. The features are aimed at contact centers and sales teams that want to extract data such as names, addresses, and account numbers from calls, and at meeting tools that want automatic notes and action items.
Gladia's first API was built on OpenAI's open-source Whisper model, which the team modified to run faster and to reduce the "hallucinations" the base model is prone to. Quéguiner gave a concrete example of the problem: Whisper, trained heavily on web video, would sometimes invent phrases like "if you enjoyed this video, please like and subscribe" when transcribing silence or noise. Gladia's engineering focused on inference speed and on suppressing these fabricated outputs while keeping the multilingual coverage that made Whisper attractive.
Over time the company moved from an optimized-Whisper approach toward its own proprietary models, which it markets as offering domain-specific terminology handling and contextual accuracy that a general-purpose open model does not provide out of the box. This shift culminated in the Solaria model line.
In April 2025, Gladia launched Solaria-1, a multilingual speech-to-text model it described as a next-generation system for global, real-time voice applications. Solaria supports 100 languages, including a number of high-population and regional languages that the company says are underserved by other models, such as Tagalog, Bengali, Punjabi, Tamil, Urdu, Persian, Marathi, and Hebrew, as well as smaller languages like Haitian Creole, Maori, Javanese, and Malagasy.
| Solaria-1 metric | Figure (company-reported) |
|---|---|
| Languages supported | 100 |
| Languages the company says are unique/underserved | 42 |
| English word accuracy rate | ~94% |
| Time to first transcription | ~270 ms |
| Final transcription latency | ~698 ms |
Gladia's published benchmarks claim Solaria delivers, on average, about 29% lower word error rate (WER) on conversational speech and roughly 3x lower diarization error rate (DER) than competing systems, drawing on a test set the company describes as 74 hours of audio across seven datasets. The company reports an English word accuracy rate near 94%, time to first transcription around 270 milliseconds, and final transcription latency near 698 milliseconds, and it compares these figures favorably against Deepgram and Speechmatics. As with most vendor benchmarks, these numbers are self-reported and reflect Gladia's own evaluation methodology; real-world accuracy varies with audio quality, accents, overlapping speakers, and domain jargon.
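For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system output, divided by the number of reference words. The sketch below is a generic Python illustration of that definition, not Gladia's evaluation code; the lowercasing and whitespace tokenization are simplifying assumptions, and real benchmarks usually apply more elaborate text normalization. If word accuracy is taken to mean 1 minus WER (an assumption, since the source does not define the term), a 94% accuracy rate would correspond to a WER of about 6%.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Simplifying assumption: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on a mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the output contains many spurious words, which is one reason hallucinated text on silence or noise weighs heavily on the score.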
Solaria adds real-time code-switching (handling multiple languages within a single conversation), real-time translation, custom vocabulary, and named-entity recognition. Gladia announced an integration with the open-source voice framework LiveKit to let developers build multilingual voice agents on top of the model.
Gladia operates in the speech-to-text API market alongside several well-funded competitors. Its most direct rivals are developer-focused transcription companies: Deepgram and AssemblyAI in the United States, and Speechmatics in the United Kingdom. It also competes indirectly with the speech APIs of Amazon, Microsoft, and Google, and with OpenAI's Whisper (both the open model and OpenAI's hosted transcription endpoints), as well as adjacent voice players such as ElevenLabs.
Gladia's differentiation emphasizes multilingual breadth and accent robustness, low-latency real-time streaming, and bundling audio-intelligence features into its base pricing rather than charging separately for them. The company has historically competed on price as well: at the 2023 launch it quoted roughly $0.61 per hour of audio, against the $1.50 to $2 per hour it said competitors charged at the time. In benchmark comparisons published by various parties, Deepgram is typically cited as a leader in low-latency real-time streaming and AssemblyAI in transcript intelligence (via its LeMUR framework for applying large language models to transcripts), while Gladia leans on conversational and multilingual accuracy as its core claim.
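As a rough illustration of what the quoted 2023 launch prices imply at volume, the Python sketch below compares the cost of a fixed number of audio hours at each hourly rate. The 1,000-hour workload is an arbitrary example, and the calculation assumes simple linear per-hour billing with no minimums, tiers, or discounts, which the source does not confirm. Amounts are kept in integer cents to avoid floating-point rounding.

```python
def cost_in_cents(hours: int, cents_per_hour: int) -> int:
    """Total cost under simple linear per-hour pricing (an assumption)."""
    return hours * cents_per_hour


def format_usd(cents: int) -> str:
    """Render an integer number of cents as a dollar string."""
    return f"${cents // 100:,}.{cents % 100:02d}"


if __name__ == "__main__":
    hours = 1_000  # arbitrary example workload
    rates = {
        "Gladia (2023 launch quote)": 61,       # about $0.61 per hour
        "Competitors, low end (per Gladia)": 150,   # $1.50 per hour
        "Competitors, high end (per Gladia)": 200,  # $2.00 per hour
    }
    for label, cents_per_hour in rates.items():
        print(f"{label}: {format_usd(cost_in_cents(hours, cents_per_hour))}")
```

Under these assumptions, 1,000 hours would cost $610.00 at the quoted Gladia rate, against $1,500.00 to $2,000.00 at the competitor rates the company cited.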
Gladia's API became publicly available in the summer of 2023. By the time of its October 2024 Series A the company reported more than 70,000 users and over 600 enterprise customers, and by the April 2025 Solaria launch it cited more than 700 customers. Named customers and partners across its announcements have included Attention, Circleback, Coconote, Method Financial, Recall, Sana (Sana Labs), Ausha, VEED.IO, Claap, Livestorm, and Selectra. These span meeting-assistant apps, podcast and media tools, sales-enablement products, and developer platforms that embed transcription into their own offerings.
The company reports compliance with GDPR and HIPAA and maintains SOC 2 Type II and ISO 27001 certifications, reflecting its focus on regulated sectors such as healthcare, finance, and contact centers.