Superwhisper
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 3,317 words
Superwhisper is a system-wide voice-to-text dictation application for macOS, Windows, and iOS, built by SuperUltra, Inc., a bootstrapped Toronto company founded by Neil Chudleigh.[1][2] The app runs OpenAI's Whisper speech-recognition models locally on the user's device and adds an optional language-model post-processing stage that cleans up, reformats, or translates the transcribed text.[3][4] Its defining feature is a "modes" architecture in which each mode bundles a speech model, a language model, a system prompt, and a set of actions tuned for a specific context such as coding, email, or chat.[4][5] Released in August 2023, Superwhisper has become one of the most widely used AI dictation tools in the Apple developer community, with paying users at companies including Meta, OpenAI, Coinbase, and Dropbox.[1] It competes with MacWhisper, Aiko, Whispering, and Wispr Flow in the emerging market for speech-recognition productivity tools.[6][7]
| Field | Details |
|---|---|
| Developer | SuperUltra, Inc. (Toronto, Ontario)[1] |
| Founder | Neil Chudleigh[2] |
| Initial release | 21 August 2023 (macOS)[3] |
| iOS release | 17 April 2024[8] |
| Windows release | 26 November 2025[9] |
| Current macOS version | 2.14.0 (May 2026)[4] |
| Platforms | macOS 13+, Windows 10/11, iOS 18+[10][9] |
| License | Proprietary; free tier with paid Pro tier[11] |
| Local engine | whisper.cpp; NVIDIA Parakeet (via NeMo)[12][13] |
Neil Chudleigh, a co-founder of the Toronto-based affiliate-marketing platform PartnerStack, started building Superwhisper in mid-2023 as a side project.[1][2] He had become frustrated with the accuracy and reliability of Apple's built-in macOS dictation, which required explicit punctuation commands and worked inconsistently across applications.[3] After OpenAI released the Whisper speech-recognition models in 2022, Georgi Gerganov's whisper.cpp port made it possible to run the large models efficiently on Apple Silicon hardware, executing the encoder on the Apple Neural Engine for substantial speedups.[14] Chudleigh combined whisper.cpp with a small menu-bar app that captured audio via a global keyboard shortcut, ran the Whisper model locally, and pasted the resulting text into whichever application held focus.[3]
The app was posted to Hacker News as "Show HN: superwhisper" on 21 August 2023.[3] The initial release supported the standard Whisper model sizes (tiny, base, small, medium, and large-v2), was English-only at launch, and offered the smaller models for free while gating the larger and multilingual variants behind a paid tier.[3] The Product Hunt launch followed shortly after, attracting 234 upvotes.[15]
The two features that distinguished Superwhisper from generic Whisper wrappers arrived in late 2023. Version 1.17.0, released on 24 November 2023, introduced experimental language-model post-processing, letting the app feed raw transcripts into a chat model with a user-defined prompt.[4] Version 1.19.0, released on 10 December 2023, generalized that idea into the modes feature: a mode bundles a speech model, an optional post-processing language model, a system prompt, a target language, and an output action (paste, copy, translate, etc.) into a single configuration that the user can switch between with a hotkey.[4]
In January 2024, version 1.23.0 added cloud language-model providers, allowing Pro users to route post-processing through hosted models from OpenAI, Anthropic, and Groq in addition to running smaller LLMs locally.[4][7] Subsequent versions expanded the speech-model lineup to include Whisper large-v3 and the faster Whisper large-v3-turbo, plus distilled variants and on-device LLMs such as Llama, Mistral, Phi, and DeepSeek for users who wanted both transcription and post-processing to remain on-device.[7][4]
Superwhisper for iOS launched on 17 April 2024, primarily as a custom keyboard extension that any other app can invoke through the iOS keyboard switcher.[8] Holding the keyboard's record button captures audio; releasing it transcribes the recording and pastes the text into the active field, mimicking the macOS hold-to-talk workflow.[10] Apple's sandboxing rules constrain what iOS keyboards can do, so the iOS app exposes a smaller subset of modes than the macOS version, runs a compact Whisper model on-device for low latency, and routes more capable post-processing through optional cloud models.[10][16] The iOS app is rated 4.4 out of 5 across 773 reviews on the United States App Store and is published by SuperUltra, Inc.[10]
A Windows version had been the most-requested feature on the company's public feedback board since 2024.[17] SuperUltra announced internal Windows development on 17 February 2025, ran a closed beta during the spring, and shipped version 1.0.0 to the public on 26 November 2025.[9] The Windows build preserves the macOS feature set where the platform allows, including global hotkeys, mode switching, on-device Whisper inference, and Parakeet support. Integration with Apple-only frameworks (Services menu, Apple Shortcuts) does not apply, and accessibility-API context capture uses Windows UI Automation rather than the macOS accessibility framework.[9][4]
The company was incorporated as SuperUltra, Inc. and is bootstrapped, with no outside investment as of May 2026.[7][1] In early 2026 Chudleigh told The Globe and Mail that the company had "hundreds of thousands of weekly active users" and "seven-figure" annual revenue, had hired its first employee in August 2025, and operated with six full-time staff plus five contractors out of Toronto.[1] Customers cited by the company included engineers, designers, and writers at Meta, OpenAI, Coinbase, and Dropbox.[1]
Superwhisper's macOS engine is built on whisper.cpp, Georgi Gerganov's C/C++ port of the Whisper model that runs efficiently on consumer hardware and uses Core ML to dispatch the encoder onto the Apple Neural Engine on Apple Silicon Macs.[14] The app ships multiple Whisper sizes branded as Fast, Nano, Standard, Pro, Ultra V3, and Ultra V3 Turbo, corresponding roughly to the tiny, base, small, medium, large-v3, and large-v3-turbo variants of the upstream model.[7] Whisper large-v3-turbo is Superwhisper's recommended default; the upstream model runs roughly eight times faster than large-v3 with a word-error-rate difference of about 0.39 percentage points, making it usable for interactive dictation on a recent MacBook.[18]
In 2025 Superwhisper added support for NVIDIA's Parakeet model family. Parakeet is a CTC/RNN-T-based ASR model trained by the NVIDIA NeMo team that, on benchmarks, runs roughly an order of magnitude faster than Whisper large-v3 with comparable accuracy on English audio.[13] Superwhisper's macOS changelog records Parakeet's debut in version 2.0.0 on 10 July 2025, initially as an English-only model for Pro users; a multilingual Parakeet V3 model became available shortly after NVIDIA's release.[4] On Windows, Parakeet was promoted from experimental to general release in version 1.2.5 on 19 February 2026.[9]
The official documentation lists "Distil-Whisper" variants alongside Whisper and Parakeet as on-device options, referring to the distilled Whisper models published by Hugging Face that trade a small accuracy loss for substantially smaller model size and lower memory usage on resource-constrained devices.[7] Users who add their own API keys can also point the speech stage at hosted ASR services including Deepgram Nova and ElevenLabs Scribe.[7]
A mode in Superwhisper is a named pipeline that specifies, at minimum, which speech model to use, which language to expect, and which output action to apply.[5] Optional fields include a language-model post-processor (local or cloud), a system prompt that instructs the language model how to rewrite the transcript, a list of vocabulary additions or replacements (for example, mapping the spoken phrase "my email" to a literal address), and the applications in which the mode should auto-activate.[5][7] Each mode is bound to a hotkey, and the user can switch between modes from a Spotlight-style command bar, a menu-bar list, an Apple Shortcuts action, or the Raycast or Alfred extensions.[16][19][20]
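The mode description above can be pictured as a small record type. The following is a minimal sketch only: the class, its field names, the model identifiers, and the `pick_mode` helper are hypothetical illustrations, not Superwhisper's actual configuration format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Mode:
    """Hypothetical representation of a mode; all field names are illustrative."""
    name: str
    speech_model: str                      # required: which speech model to use
    language: str                          # required: expected language, or "auto"
    output_action: str                     # required: e.g. "paste" or "copy"
    llm: Optional[str] = None              # optional post-processor (local or cloud)
    system_prompt: Optional[str] = None    # optional rewrite instructions
    replacements: dict[str, str] = field(default_factory=dict)
    auto_activate_apps: list[str] = field(default_factory=list)

    def needs_post_processing(self) -> bool:
        # A language model is only invoked when the mode configures one.
        return self.llm is not None


def pick_mode(modes: list[Mode], active_app: str, fallback: Mode) -> Mode:
    """Return the first mode set to auto-activate in the active app, else the fallback."""
    for mode in modes:
        if active_app in mode.auto_activate_apps:
            return mode
    return fallback


voice = Mode("Voice", "large-v3-turbo", "auto", "paste")
email = Mode(
    "Email", "large-v3-turbo", "en", "paste",
    llm="example-chat-model",  # placeholder name, not a real model identifier
    system_prompt="Rewrite the transcript as a formal email.",
    auto_activate_apps=["Mail"],
)

print(pick_mode([voice, email], "Mail", voice).name)      # Email
print(pick_mode([voice, email], "Terminal", voice).name)  # Voice
```

Keeping the three required fields positional and the optional ones defaulted mirrors the "at minimum" wording above: a raw-transcription mode needs nothing beyond a speech model, a language, and an output action.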
The default modes that ship with Superwhisper include Voice (raw transcription with light cleanup), Message (casual rewriting suitable for chat applications), Email (formal rewriting with greetings and sign-offs), and Super, the flagship context-aware mode introduced in 2024 that gathers screen and clipboard context before invoking the language model.[21][22] Users can create unlimited additional modes on the Pro tier; the free tier is capped at three custom modes.[11]
Super mode goes beyond a fixed system prompt by gathering three categories of contextual information at recording time and feeding them to the post-processing language model.[21][22] The first is application context: the name of the active app, the contents of the focused text field, and system data such as the current date and the user's name. The second is selected text: any highlighted text in the active app is treated as the editing target. The third is recent clipboard content: anything copied within the three seconds before the dictation began, intended for cases where the user copies something to be referenced or transformed.[21] All three categories are pulled through accessibility APIs and require the user to grant the corresponding macOS permission.[21][22] In effect, Super mode lets the language model rewrite an existing draft, reply to a message in the appropriate register, or insert content at a specific location, rather than producing a free-standing transcript.
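As a rough illustration of how the three context categories might be combined, the sketch below assembles them into a single prompt and applies the three-second clipboard window. The `Snapshot` type, the prompt layout, and the function names are assumptions made for this example. The actual prompt format Superwhisper uses is not described in the sources.

```python
from dataclasses import dataclass
from typing import Optional

CLIPBOARD_WINDOW_SECONDS = 3.0  # the three-second window described in the text


@dataclass
class Snapshot:
    """Hypothetical bundle of what was captured when recording started."""
    app_name: str
    field_text: str = ""
    selected_text: str = ""
    clipboard_text: str = ""
    clipboard_copied_at: Optional[float] = None  # seconds, same clock as started_at
    started_at: float = 0.0


def recent_clipboard(snap: Snapshot) -> str:
    """Return clipboard text only if it was copied within the window before recording."""
    if snap.clipboard_copied_at is None:
        return ""
    age = snap.started_at - snap.clipboard_copied_at
    return snap.clipboard_text if 0.0 <= age <= CLIPBOARD_WINDOW_SECONDS else ""


def build_prompt(snap: Snapshot, transcript: str) -> str:
    """Join the non-empty context categories and the transcript into one prompt."""
    parts = [
        ("Active application", snap.app_name),
        ("Focused field", snap.field_text),
        ("Selected text", snap.selected_text),
        ("Recent clipboard", recent_clipboard(snap)),
        ("Dictation", transcript),
    ]
    return "\n".join(f"{label}: {value}" for label, value in parts if value)


snap = Snapshot(
    app_name="Mail",
    selected_text="See you Tuesday.",
    clipboard_text="Room 4B",
    clipboard_copied_at=98.5,  # copied 1.5 s before recording began
    started_at=100.0,
)
print(build_prompt(snap, "change the day to Wednesday and mention the room"))
```

Clipboard content copied more than three seconds before recording is simply omitted here, so stale clipboard data never reaches the language model.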
When a mode is configured to use a hosted language model, Superwhisper sends the transcribed text (and, for Super mode, the gathered context) to the user's selected provider. Supported providers include OpenAI (GPT family models), Anthropic (Claude family models), Google Gemini, xAI Grok, Mistral, and Llama models served by Groq or Together AI.[7][4] Users may either subscribe to Superwhisper Pro and use the company's bundled access to these providers or bring their own API keys; in the bring-your-own-key configuration, requests bypass SuperUltra's servers entirely and go directly from the user's machine to the provider.[7]
Audio itself is never sent to a cloud provider when an on-device speech model is selected, which is the configuration the company markets as the default privacy-preserving path.[11][22] On free-tier installations, recording history is stored locally in the user's home directory; the Pro tier allows configuring the storage path and includes optional iCloud sync, but does not upload audio to SuperUltra-controlled servers as of the May 2026 documentation.[22]
The primary interaction is a global push-to-talk shortcut, default Fn, that records audio while held and transcribes on release.[22][16] On macOS the resulting text is auto-pasted into the focused application via the system Services API; on iOS the keyboard pastes into the focused text field.[16][10] Hands-free toggle-style recording is also supported for longer dictations such as meeting notes.[11]
URLs of the form superwhisper://record?mode=... allow Shortcuts and other automation tools to switch modes and start recording in one action.[16]
Each mode has a vocabulary list that biases the speech model toward user-specified terms (product names, jargon, colleague names) and a separate replacements list that performs literal substitutions on the transcribed text, for example expanding "my address" into a literal mailing address.[9][7] Parakeet's keyword-recognition feature is wired into the same vocabulary system on platforms where Parakeet is available.[13]
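The replacements list amounts to a text-substitution pass over the finished transcript. The sketch below shows one plausible way to do that, assuming case-insensitive, whole-phrase matching with longer phrases taking priority. Those matching rules, the function name, and the sample address are assumptions for illustration, not documented Superwhisper behavior.

```python
import re


def apply_replacements(transcript: str, replacements: dict[str, str]) -> str:
    """Replace each spoken phrase with its literal expansion.

    Assumes phrases begin and end with word characters so that \\b anchors work.
    """
    if not replacements:
        return transcript
    # Try longer phrases first so "my email address" wins over "my email".
    phrases = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        flags=re.IGNORECASE,
    )
    lookup = {phrase.lower(): text for phrase, text in replacements.items()}
    # A function replacement inserts the expansion verbatim (no backslash handling).
    return pattern.sub(lambda m: lookup[m.group(1).lower()], transcript)


rules = {"my address": "1 Example Street, Toronto"}  # made-up address
print(apply_replacements("Ship it to my address by Friday.", rules))
# Ship it to 1 Example Street, Toronto by Friday.
```

Unlike the vocabulary list, which influences what the speech model hears, a pass like this runs after transcription and never changes recognition itself.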
The Whisper-based pipeline supports the 99 languages that upstream Whisper was trained on, plus optional translation of any source language into English; the multilingual Parakeet V3 model covers a smaller set of languages but runs faster.[11][13] The user can pin a mode to a specific language or leave it on automatic detection.[5]
Superwhisper uses a freemium model.[11] The free tier provides unlimited use of the smaller on-device Whisper models, up to three custom modes, and a 15-minute trial of Pro features.[11][7] Pro is sold at three price points: $8.49 per month, $84.99 per year, or $249.99 as a one-time lifetime purchase.[11][10] A single Pro license covers macOS, Windows, and iOS for the same user.[11][10] An Enterprise tier with custom pricing, single sign-on, and SOC 2 Type II compliance is offered to larger organizations.[11] All paid plans carry a 30-day refund guarantee.[11]
The lifetime tier was retained throughout 2024-2026 even as comparable indie productivity apps moved away from one-time purchases. Independent reviews note that the $249.99 lifetime price breaks even against the $84.99 annual plan at roughly the three-year mark, making it cost-effective for users committed to the product long-term.[7][23]
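The break-even figure can be checked with a few lines of arithmetic using the listed prices. The monthly comparison is included for completeness, and the sketch ignores taxes, discounts, and future price changes.

```python
import math

LIFETIME = 249.99  # one-time price, USD
ANNUAL = 84.99     # per year, USD
MONTHLY = 8.49     # per month, USD

years_to_break_even = LIFETIME / ANNUAL
months_to_break_even = LIFETIME / MONTHLY

print(f"{years_to_break_even:.2f} years on the annual plan")     # 2.94 years
print(f"{months_to_break_even:.1f} months on the monthly plan")  # 29.4 months

# Whole billing periods after which subscription spend exceeds the lifetime price.
print(math.ceil(years_to_break_even), "annual payments")    # 3
print(math.ceil(months_to_break_even), "monthly payments")  # 30
```

The third annual payment brings cumulative spend to $254.97, just above the lifetime price, which matches the "roughly three-year" figure cited above.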
Reviews of Superwhisper have generally been positive in the developer-oriented Mac press. A 2024 review at Today on Mac described the app as "the AI dictation app that listens and learns" and highlighted the modes feature as a meaningful improvement over generic Whisper wrappers.[24] A 2026 Voibe Resources review measured 95-96% raw transcription accuracy on a standard 500-word test passage using Whisper large-v3 with a USB microphone, calling the result "impressive for a fully offline tool with no cloud processing."[7] The same review described the mode system as "best-in-class" and gave the product 7.5 out of 10 overall, criticizing default-on local audio recording and plaintext API key storage as friction points that require manual hardening after install.[7]
The Globe and Mail profiled SuperUltra in early 2026, citing the company's bootstrapped status, seven-figure revenue, and ambition to "make the keyboard obsolete."[1] Superwhisper won the Privacy Award for AI Dictation Apps in Product Hunt's Winter 2025 awards.[6][25]
Andrej Karpathy, the former OpenAI and Tesla machine-learning researcher, mentioned Superwhisper several times during 2024-2025 as part of his "vibe coding" workflow, in which the user dictates natural-language instructions to an AI coding assistant rather than typing code.[26] That association raised the app's visibility within the AI-engineering community and contributed to its adoption inside OpenAI and at AI startups according to The Globe and Mail's reporting.[1][26]
The market for AI-powered desktop dictation tools grew rapidly between 2023 and 2026.[6][27] Superwhisper sits in a category alongside MacWhisper, Aiko, Whispering, VoiceInk, and Wispr Flow, but the products differ along several axes.
| Product | Developer | First release | Local speech | Cloud LLM post-processing | Platforms | Lifetime price |
|---|---|---|---|---|---|---|
| Superwhisper | SuperUltra, Inc. (Neil Chudleigh) | Aug 2023[3] | Whisper, Parakeet, Distil-Whisper[7] | Yes (BYOK or bundled)[7] | macOS, Windows, iOS[9] | $249.99[11] |
| MacWhisper | Good Snooze (Jordi Bruin) | Feb 2023[28] | Whisper (incl. large-v3-turbo)[28] | Optional via integrations[28] | macOS[28] | $79.99 (App Store)[29] |
| Aiko | Sindre Sorhus | 2022[30] | Whisper large-v2 only[30] | No[30] | macOS, iOS[30] | Free[30] |
| Wispr Flow | Wispr AI (Tanay Kothari) | macOS 2024[31] | No (cloud)[29] | Yes (proprietary cloud)[31] | macOS, Windows, iOS, Android[29] | None (subscription only)[29] |
MacWhisper, written by Amsterdam-based indie developer Jordi Bruin and launched in early 2023, focuses on file transcription rather than system-wide dictation: users drop in audio or video files and receive transcripts with speaker diarization, full-text search, and YouTube export.[28][29] It does ship a system-wide dictation feature in the Pro tier, but the product's center of gravity is offline batch transcription, and it is macOS-only as of 2026.[29] Pricing for MacWhisper is $29 from Gumroad or $79.99 from the App Store for the Pro tier, one-time.[29]
Aiko, by Apple-developer-community figure Sindre Sorhus, is a free macOS and iOS Whisper wrapper that pioneered on-device Whisper UX in 2022 but ships only the large-v2 model and is positioned for file transcription rather than real-time dictation.[30] Independent comparisons note it is "too slow for real-time use" because of the single large model.[30]
Wispr Flow, made by Wispr AI in South San Francisco and founded by Tanay Kothari and Sahaj Garg, is Superwhisper's most direct competitor in real-time dictation.[31][32] Wispr Flow does not use OpenAI's Whisper; it runs proprietary cloud models hosted on its own infrastructure with audio sent off-device.[29] Wispr AI raised a $30 million Series A in June 2025 led by Menlo Ventures, with participation from NEA and 8VC, followed by a $25 million Series A extension in November 2025 led by Notable Capital, bringing total funding to about $81 million.[32][33] Reporting in May 2026 suggested the company was in advanced talks for a $260 million round at roughly a $2 billion valuation.[34] The contrast in business models, Wispr Flow as a venture-backed cloud platform versus Superwhisper as a bootstrapped on-device app, is frequently cited in product comparisons.[7][29]