| ElevenLabs Inc. | |
|---|---|
| Type | Private |
| Industry | Artificial intelligence, voice synthesis, audio AI |
| Founded | 2022 |
| Founders | Mati Staniszewski (CEO), Piotr Dabkowski (CTO) |
| Headquarters | London, United Kingdom; New York, United States |
| Key people | Mati Staniszewski (CEO), Piotr Dabkowski (CTO) |
| Products | Text-to-speech models (Eleven v3, Multilingual v2, Flash v2.5, Turbo v2.5); Instant and Professional Voice Cloning; Voice Library; AI Dubbing Studio; Conversational AI (AI Voice Agent) platform; Sound Effects; Eleven Music; Reader app; Audio Native; Voice Isolator; Voice Changer (Speech-to-Speech) |
| Revenue | $330+ million ARR (end of 2025)[1] |
| Funding | $781 million across five rounds (Seed through Series D)[2] |
| Valuation | $11 billion (February 2026)[3] |
| Website | elevenlabs.io |
ElevenLabs is a voice and audio artificial intelligence company that develops text-to-speech AI, voice cloning, AI dubbing, generative sound effects, music synthesis, and conversational voice agent technology. Founded in 2022 by two friends from Warsaw, Mati Staniszewski and Piotr Dabkowski, the company has become one of the fastest growing AI startups in Europe and is widely regarded as the leading provider of synthetic voice technology for publishers, game developers, accessibility groups, and enterprise contact centers.[4] By February 2026 the company had raised roughly $781 million in venture capital across five rounds, reaching an $11 billion valuation in a Series D led by Sequoia Capital with participation from Andreessen Horowitz, ICONIQ Growth, and Nvidia, and reported more than $330 million in annualized revenue.[5][6]
ElevenLabs is best known for the lifelike quality of its generated voices and for the breadth of its product line. That line includes the Eleven v3 expressive speech model, the ultra low latency Flash and Turbo models, instant and professional voice cloning, a public Voice Library with more than ten thousand community voices, an AI dubbing pipeline that re-voices video into dozens of languages while preserving the original speaker, and a conversational AI agent platform that competes with offerings from OpenAI, Google, and a wave of pure play voice startups.[7] The company also operates a consumer Reader application and a publisher tool, Audio Native, that turns blog posts and articles into AI narrated audio. Its rapid rise has been accompanied by visible safety controversies, most notably an AI generated robocall impersonating United States President Joe Biden during the 2024 New Hampshire primary, which prompted ElevenLabs to ban the responsible account and to expand its voice verification, watermarking, and detection systems.[8]
Mati Staniszewski and Piotr Dabkowski met as classmates at a high school in Warsaw, Poland, and stayed in close contact through university and into their first jobs. Dabkowski studied computer science at the University of Oxford and the University of Cambridge, published machine learning research with hundreds of citations, and worked as a machine learning engineer at Google before turning to voice synthesis full time.[9] Staniszewski studied mathematics at Imperial College London, then worked in product and deployment roles at the financial software firm BlackRock and at the data analytics company Palantir Technologies. Both founders have publicly described their motivation for starting the company as the experience of growing up watching American films dubbed badly into Polish, where flat readings and mismatched voices flattened the emotional range of the source material.[10]
The two co-founders began experimenting with neural speech models in 2021 and incorporated ElevenLabs in early 2022. The company name is a reference to the Spinal Tap joke about an amplifier that goes "to eleven," which the founders chose to symbolize a step beyond what existing speech synthesis tools could do.[11] The first public products, an English text-to-speech model and an Instant Voice Cloning tool, were released in beta in late 2022 and quickly attracted attention from amateur voice actors, audiobook producers, and the modding community for video games. By the early months of 2023 the platform was generating hundreds of hours of audio per day and had become a viral sensation in technology newsletters because of how convincingly its synthetic voices reproduced subtle prosody, intonation, and accents that earlier speech engines flattened.[12]
In June 2023, ElevenLabs announced a $19 million Series A round at roughly a $100 million valuation, co-led by Andreessen Horowitz, the former GitHub chief executive Nat Friedman, and the entrepreneur Daniel Gross.[13] The round, which followed an earlier seed financing led by Credo Ventures and Concept Ventures, gave the company the resources to expand its research team and to push beyond English language synthesis. The same month, ElevenLabs released its multilingual v1 model, which extended high quality speech generation to seven additional languages, and announced a strategic partnership with the Swedish audiobook subscription service Storytel that introduced an AI "VoiceSwitcher" feature to the Storytel app.[14] The Storytel deal was widely cited at the time as one of the first commercial AI narration arrangements struck by a major audiobook publisher.
In January 2024, ElevenLabs raised an $80 million Series B round that was again led by Andreessen Horowitz, with Sequoia Capital, Nat Friedman, and Daniel Gross participating, valuing the company at $1.1 billion and making it one of the youngest European AI unicorns at the time.[15] The Series B announcement was paired with the launch of two new products: the AI Dubbing Studio, an editor for re-voicing long form video into multiple languages while preserving the original speaker's voice and timing, and a self serve Voice Library where verified creators could upload professional clones of their voices and earn royalty payments when others used them.[16]
ElevenLabs closed a $180 million Series C round on January 30, 2025, co-led by ICONIQ Growth and Andreessen Horowitz with participation from Sequoia Capital, NEA, World Innovation Lab, Valor Equity Partners, Endeavor Catalyst, and Lunate, plus strategic investments from Deutsche Telekom, LG Technology Ventures, HubSpot Ventures, NTT DOCOMO Ventures, and RingCentral Ventures.[17] The round priced the company at $3.3 billion, roughly tripling the Series B valuation, and brought total funding to about $281 million. In the announcement, the company said annualized revenue had grown more than threefold over the previous year and that more than 41 percent of Fortune 500 companies were using its products in some capacity.[18]
The Series C funding was used to scale the team toward roughly 580 employees by the end of 2025, to expand offices beyond London and New York, and to invest in custom inference infrastructure for the conversational AI agent platform that the company had launched in late 2024.[19]
On February 4, 2026, ElevenLabs announced a $500 million Series D financing led by Sequoia Capital, with participation from Andreessen Horowitz, ICONIQ Growth, Lightspeed Venture Partners, Bond, Evantic Capital, and Nvidia, valuing the company at $11 billion.[20] At the time of the announcement, ElevenLabs had reached more than $330 million in annual recurring revenue, up from roughly $90 million a year earlier, with enterprise revenue alone growing more than 200 percent year over year.[21] In interviews surrounding the round, Staniszewski confirmed that the company was building toward an initial public offering, while declining to commit to a specific timeline.[22]
| Round | Date | Amount | Lead investors | Post-money valuation |
|---|---|---|---|---|
| Seed | January 2023 | $2 million | Credo Ventures, Concept Ventures | Undisclosed |
| Series A | June 2023 | $19 million | Andreessen Horowitz, Nat Friedman, Daniel Gross | ~$100 million |
| Series B | January 2024 | $80 million | Andreessen Horowitz, Sequoia Capital | $1.1 billion |
| Series C | January 2025 | $180 million | ICONIQ Growth, Andreessen Horowitz | $3.3 billion |
| Series D | February 2026 | $500 million | Sequoia Capital | $11 billion |
ElevenLabs sells access to its models through both a self serve web application aimed at individual creators and a developer API used by enterprise customers and software vendors. The product surface has expanded steadily since the company's founding and now spans speech synthesis, voice cloning, dubbing, sound effects, music, conversational agents, and consumer reading applications.[23]
The company's core business is its family of text-to-speech models, each tuned for a different point on the trade off curve between expressive quality, language coverage, and latency. The model family is exposed through a single API and is also selectable from the user interface of the Studio editor and the Reader application.
| Model | Release | Languages | Best for |
|---|---|---|---|
| Eleven Monolingual v1 | 2022 | English only | Early English narration, hobbyist projects |
| Eleven Multilingual v1 | June 2023 | 8 languages | First multilingual launch, replaced by v2 |
| Eleven Multilingual v2 | November 2023 | 29 languages | Polished long form narration with consistent emotion |
| Eleven Turbo v2 / Turbo v2.5 | 2024 | 32 languages | Real time applications, predecessors of Flash |
| Eleven Flash v2 / Flash v2.5 | 2024 | 32 languages | Ultra low latency under 75 ms, voice agents and live use |
| Eleven v3 (alpha) | June 2025 | 70+ languages | Most expressive model, audio tags, multi speaker dialogue |
Eleven Multilingual v2 remains the workhorse for long form narration such as audiobooks, where consistent voice quality across hours of content matters more than absolute latency. The Flash v2.5 and Turbo v2.5 models are designed for live and near real time use, with Flash generating speech in roughly 75 milliseconds or less, which makes it suitable for voice agents and interactive applications.[24] Eleven v3, released in alpha in June 2025, introduced inline audio tags such as [whispers], [laughs], [excited], and [sighs] that let prompts directly steer the emotional delivery of synthesized speech, support for over 70 languages, and a multi speaker dialogue mode.[25]
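Model selection and v3 audio tags can be illustrated with a short sketch. The endpoint path below mirrors the shape of ElevenLabs' public v1 REST API, but the voice ID, helper function, and example tags are hypothetical placeholders, and the request is only assembled, never sent:

```python
# Illustrative sketch of an ElevenLabs-style text-to-speech request.
# The endpoint shape follows the public v1 REST API; the voice ID,
# helper name, and tag usage are hypothetical examples.

API_BASE = "https://api.elevenlabs.io/v1"

def build_tts_request(api_key: str, voice_id: str, text: str,
                      model_id: str = "eleven_multilingual_v2") -> dict:
    """Assemble (but do not send) a text-to-speech HTTP request."""
    return {
        "url": f"{API_BASE}/text-to-speech/{voice_id}",
        "headers": {
            "xi-api-key": api_key,           # account API key
            "Content-Type": "application/json",
        },
        "json": {
            "text": text,
            "model_id": model_id,            # pick the model for the latency/quality tradeoff
        },
    }

# Eleven v3 audio tags are embedded inline in the text itself:
req = build_tts_request(
    api_key="YOUR_KEY",
    voice_id="EXAMPLE_VOICE_ID",             # hypothetical placeholder
    text="[whispers] I have a secret. [laughs] But I can keep it.",
    model_id="eleven_v3",
)
```

Swapping `model_id` is the only change needed to move the same text between the expressive v3 model and the low latency Flash family.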
ElevenLabs provides two main voice cloning paths.
Instant Voice Cloning, generally referred to as IVC, lets a user upload as little as one minute of clean audio and within seconds receive a usable cloned voice. IVC is intended for fast prototyping, hobby projects, and short form content, and is included in the lower paid tiers.
Professional Voice Cloning, or PVC, requires roughly thirty minutes or more of high quality recording and uses a longer training process to produce a voice that is much closer to studio quality. PVC is recommended for audiobook narration, broadcast voiceover, and any application where the cloned voice will be the primary speaker over hours of content.[26] To use PVC, a user must complete a Voice Captcha step in which they read out specific dynamically generated text on camera or microphone within a time limit; the step is designed to verify that the person who owns the account also owns the voice being cloned. Failed captcha attempts trigger manual human review.[27]
In addition to private voice clones, ElevenLabs operates a public Voice Library with more than ten thousand voices contributed by verified creators. Voice owners can opt into a payouts program that compensates them with usage based royalties whenever other ElevenLabs customers generate audio with their voice.[28]
The AI Dubbing Studio is a long form video and audio editor that automatically transcribes a source clip, separates speakers, translates the script, and re-synthesizes the dialogue in a target language while attempting to preserve each speaker's own voice, accent, and emotional delivery. As of late 2025, dubbing supported the same 29 languages covered by Multilingual v2 and offered three voice modes: a clip clone that builds a fresh clone for each line, a track clone that builds one consistent clone for an entire speaker track, and a mode that draws on prebuilt voices from the Voice Library.[29] Studio (formerly Projects) is the surrounding editor that lets producers chunk long scripts into chapters, mix multiple voices in dialogue, and export to audiobook ready formats.
In November 2024, ElevenLabs launched a Conversational AI platform that lets developers build full voice agents combining ElevenLabs speech synthesis, third party large language models such as those from OpenAI, Anthropic, and Google, and the company's own turn taking and interruption detection logic.[30] An AI voice agent built on the platform can be deployed to a website widget, a mobile application via SDKs for Python, JavaScript, React, and Swift, or to a phone number through carrier integrations. In June 2025, the company released Conversational AI 2.0, which improved how agents handle interruptions, pause naturally during long answers, and avoid talking over the user.[31] By the time of the Series D announcement in February 2026, the company said its customers had created more than two million voice agents on the platform across customer support, scheduling, sales, training, and entertainment use cases.[32]
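The turn-taking and interruption handling described above follow a general pattern sometimes called barge-in: if user speech is detected while the agent is speaking, playback stops and the turn yields back to the user. The state machine below is a generic, minimal sketch of that pattern, not ElevenLabs' implementation, and every name in it is hypothetical:

```python
# Generic illustration of barge-in (interruption) handling in a voice
# agent. This is NOT ElevenLabs' implementation, just a minimal sketch
# of the turn-taking pattern described in the text.

class TurnTaking:
    def __init__(self):
        self.agent_speaking = False
        self.events = []            # ordered log of what the agent did

    def agent_starts(self, reply: str):
        """Agent begins speaking a synthesized reply."""
        self.agent_speaking = True
        self.events.append(("speak", reply))

    def agent_finishes(self):
        """Reply played to the end without interruption."""
        self.agent_speaking = False

    def user_speech_detected(self):
        """If the user talks over the agent, stop playback and yield the turn."""
        if self.agent_speaking:
            self.agent_speaking = False
            self.events.append(("interrupted", None))
        self.events.append(("listen", None))

tt = TurnTaking()
tt.agent_starts("Your appointment is confirmed for...")
tt.user_speech_detected()   # user barges in mid-sentence
```

A production system layers voice activity detection, endpointing, and echo cancellation on top of this core loop, but the state transitions are the same.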
In June 2024, ElevenLabs released a generative Sound Effects model that produces short audio clips from a text prompt, intended for game developers, video producers, and podcast editors who want custom sound design without licensing stock libraries.[33] In 2025 the company expanded its audio research with Eleven Music, a text to music model that generates studio quality musical pieces in many genres from natural language prompts. Eleven Music is positioned as cleared for nearly all commercial uses, having been built in collaboration with rights holders and music publishers.[34]
The consumer ElevenLabs Reader application launched on iOS in June 2024 and on Android the following year, and lets users listen to articles, PDFs, ePubs, and pasted text in any of the company's voices, in 32 languages, free of charge.[35] Audio Native is a separate publisher product that lets website owners embed an AI narrated player into their pages on platforms including WordPress, Webflow, and Squarespace, generating audio versions of new posts automatically.[36]
The Voice Isolator product strips background music, ambient noise, and overlapping speech from an audio file using a neural separation model, accepting WAV, MP3, FLAC, OGG, and AAC files up to 500 megabytes and one hour in length.[37] Voice Changer, originally branded Speech-to-Speech, takes an existing recording and re-renders it in a different cloned voice while preserving the original timing, emphasis, and emotional delivery, which is useful for fixing mispronunciations or for performing a script in another voice without re-recording.[38]
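The Voice Isolator limits quoted above (accepted formats, 500 megabytes, one hour) lend themselves to simple client-side validation before upload. The helper below is a hypothetical sketch, not part of any ElevenLabs SDK:

```python
# Sketch of client-side validation against the Voice Isolator limits
# quoted in the text. The helper name and structure are hypothetical.

ACCEPTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".aac"}
MAX_BYTES = 500 * 1024 * 1024        # 500 megabytes
MAX_SECONDS = 60 * 60                # one hour

def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return the reasons a file would be rejected (empty list if it passes)."""
    problems = []
    ext = "." + filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 500 MB")
    if duration_seconds > MAX_SECONDS:
        problems.append("audio longer than one hour")
    return problems

# A 120 MB, 45 minute MP3 passes all three checks:
print(check_upload("podcast.mp3", 120 * 1024 * 1024, 45 * 60))   # []
```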
| Product | Launched | Audience | Description |
|---|---|---|---|
| Text to Speech | 2022 | Developers, creators | Family of TTS models from Multilingual v2 to Eleven v3 |
| Instant Voice Cloning | 2022 | Creators, hobbyists | Clone a voice from one minute of audio |
| Professional Voice Cloning | 2023 | Audiobook publishers, broadcasters | Studio quality clone from 30+ minutes of audio with Voice Captcha |
| Voice Library | 2024 | Creators | Marketplace of community voices with payouts to voice owners |
| AI Dubbing Studio | January 2024 | Studios, creators | Re-voices video into 29 languages while keeping each speaker's identity |
| Sound Effects | June 2024 | Game and video producers | Text to sound effect generator |
| Reader app | June 2024 | Consumers | iOS and Android app that reads any text aloud |
| Audio Native | 2023 | Publishers | Embeddable AI narration widget for websites |
| Voice Isolator | 2024 | Editors, podcasters | Neural background noise and music removal |
| Voice Changer (Speech to Speech) | 2023 | Creators, voice actors | Convert one voice into another while preserving delivery |
| Conversational AI agents | November 2024 | Enterprises, developers | Full stack platform for building voice and chat agents |
| Eleven Music | 2025 | Creators, brands | Text to music model cleared for commercial use |
Audiobook narration was the first major commercial use case for ElevenLabs voices, and remains a strategic priority for the company. The June 2023 partnership with Storytel introduced a VoiceSwitcher feature that let listeners pick from several AI voices for selected titles, with English support at launch and Swedish and Danish added later in 2023.[39] The platform's text-to-speech and voice cloning APIs are also widely used inside publishing houses for shorter form narration, audio summaries, and for back catalog titles where commissioning a human narrator would be uneconomic. The company has since announced its Eleven Music tooling and broader publisher product set, including the Audio Native widget that lets newspapers and book publishers embed AI narrated players directly on their websites.[40]
Game studios use ElevenLabs voices for placeholder dialogue during development, for low budget side characters, and for player generated content systems where commissioning human actors for every variant is impractical. Paradox Interactive has been cited by ElevenLabs as a customer, and modding communities have used Instant Voice Cloning to add new voiced dialogue to role playing games.[41] The Eleven v3 audio tag system, which lets prompts mark in dialogue events such as [laughs] or [whispers], is particularly useful for interactive entertainment where lines need to convey a wide range of emotional states without re-recording.[42]
In August 2024, ElevenLabs launched an accessibility program that gives free Professional Voice Cloning licenses to people who have lost or are losing their voices to medical conditions, beginning with people living with motor neuron disease and amyotrophic lateral sclerosis (ALS). The program later expanded to people affected by progressive supranuclear palsy, multiple sclerosis, stroke, oral and throat cancer, laryngectomy, Tay-Sachs disease, and other conditions that can take away a person's natural voice.[43] The company partnered with the United States non profit Bridging Voice to provide ALS patients with a free Pro voice clone license, valued at roughly $1,200 per year, so they can record their voice while still able and use it to speak through assistive devices later.[44] The British actor and writer Stephen Fry has been publicly cited as an example of a recognizable voice cloned for assistive use, illustrating the range of expression that synthetic voices can preserve when given a sufficiently large training corpus.[45]
The Conversational AI platform is used by enterprises to build voice and chat agents for customer support, scheduling, lead qualification, and internal training. ElevenLabs has named Deutsche Telekom and the European digital bank Revolut as enterprise customers in this category, and other publicly cited customers include The Washington Post, TIME, HarperCollins, and Paradox Interactive.[46] The company has said the enterprise agents business is now a primary growth driver and that revenue is approaching a roughly even split between enterprise contracts and self serve subscriptions.[47]
In the days before the January 23, 2024 Democratic primary in New Hampshire, an estimated 5,000 to 25,000 voters received automated phone calls in which an AI generated voice resembling that of United States President Joe Biden urged them not to vote in the primary, telling listeners to "save your vote" for the November general election.[48] Forensic analysis by the security firm Pindrop and by digital forensics researchers at the University of California, Berkeley both attributed the synthetic audio to ElevenLabs.[49] ElevenLabs identified and banned the responsible account, and the United States Federal Communications Commission unanimously ruled that AI generated voices in robocalls fall under existing laws against artificial or prerecorded voice messages without prior consent. The New Hampshire attorney general's office traced the origin of the calls to the Texas based companies Life Corporation and Lingo Telecom and to a political consultant named Steven Kramer, who was later indicted on voter suppression charges.[50]
In response to the Biden robocall and to broader concerns about AI generated impersonations and fraud, ElevenLabs has expanded its safety stack along several axes. The Professional Voice Cloning workflow requires the Voice Captcha biometric step described above, designed to make it harder to clone a voice using stolen recordings.[51] The company maintains a list of "No Go Voices", including political figures and other high risk public personas, for which cloning attempts are blocked at generation time. ElevenLabs has also released a publicly accessible AI Audio Classifier that lets users check whether a piece of audio is likely to have been generated by ElevenLabs, and has joined the Coalition for Content Provenance and Authenticity (C2PA) standard so that audio it produces can carry signed provenance metadata.[52] In 2024 the company partnered with the deepfake detection vendor Reality Defender to provide enterprise customers with real time detection of AI generated audio, including content not produced by ElevenLabs itself.[53]
Despite these measures, ElevenLabs continues to attract scrutiny from regulators, election integrity groups, and journalists who argue that any platform offering near instant voice cloning will inevitably be misused for political disinformation, fraud, and harassment, and that voluntary safeguards are insufficient.[54] The company's published prohibited use policy bans impersonation without consent, fraudulent activity, voter suppression, and the cloning of voices of public figures for malicious purposes, and the company says it cooperates with law enforcement on investigations of misuse.[55]
The market for AI voice generation grew rapidly between 2022 and 2026 as the underlying neural speech models improved. ElevenLabs competes with several distinct categories of players. In pure synthetic voice, its main rivals include Murf, which targets corporate video production, Resemble AI, which emphasizes real time speech to speech and on premise deployment, Cartesia, PlayHT, and WellSaid Labs.[56] OpenAI has integrated voice generation and conversational voice into its Realtime API and ChatGPT Voice product, and Google offers neural voices through Google Cloud and the Gemini Live experience. In conversational voice agents specifically, ElevenLabs competes with Vapi, Retell AI, LiveKit Agents, and a long tail of vertical specialists. In sound effects and music, the launch of Eleven Music puts the company alongside Suno, Udio, and other generative music startups.[57] ElevenLabs has differentiated primarily on perceived audio quality, breadth of language coverage, and the depth of its tooling for long form audio production, with publicly cited revenue figures suggesting that it is one of the largest standalone voice AI companies by revenue as of early 2026.[58]
ElevenLabs is headquartered in London with major offices in New York City, and operates as a remote first organization with employees across Europe, North America, and Asia. The company had grown to roughly 580 employees by the end of 2025.[59] Mati Staniszewski serves as chief executive officer and Piotr Dabkowski as chief technology officer. The company is incorporated in the United Kingdom as Eleven Labs Ltd. Its enterprise customers include media companies, gaming studios, financial services firms, telecommunications carriers, and a substantial share of the Fortune 500.[60]