Superwhisper Review (2026) — 98% Accuracy, Local & Cloud

Superwhisper Verdict

6.3 out of 10

Accuracy
Local: 4 / 10 out of the box, 9 / 10 with the best model
Cloud: 2 / 10 out of the box, 8 / 10 with the best model

Speed
Local: 8 / 10
Cloud: 9 / 10

UX: 8.4 / 10

Features: 5.6 / 10

Privacy: 6 / 10


Powerful local engine buried under a broken cloud flagship

Superwhisper version 1.4.0 scores 6.3/10 overall in Voice-list independent testing (tested 2026-05-30). Best local model (Whisper Standard) achieves 2.0% aggregate WER across 6 recordings. Best cloud model (Ultra) achieves 2.4% aggregate WER.

Works well for

Whisper Standard (hidden): best-in-class local accuracy at 2.0% aggregate WER
Genuine offline mode — no audio leaves device in local configuration
Lifetime license option at $249.99 — no subscription required

Watch out for

S1-Voice (cloud flagship default) shows 15-37% WER across test recordings
Trial is 15 minutes total — blocks even offline models after limit
Cloud mode sends app name, clipboard, and focused text to Modal.com beyond audio

Best for

Power users willing to dig past default settings and switch to Whisper Standard

Not for

Anyone who installs and expects the default model to work well

Superwhisper Accuracy & Speed

Language	Type	Model	Word accuracy	WER	CER	PER	Accuracy score	Post-stop latency (range)	Speed score
English	Local	Parakeet (default, 1.1 GB)	82.3%	17.7%	13.7%	25%	4 / 10	~3s (2–5s)	8 / 10
English	Local	Whisper Standard (best accuracy, hidden in UI, 500 MB)	98.0%	2.0%	0.5%	35%	9 / 10	~8s (6–12s)	5 / 10
English	Cloud	S1-Voice (default)	74.2%	25.8%	21.8%	67%	2 / 10	~2s (1–4s)	9 / 10
English	Cloud	Ultra (best cloud)	97.6%	2.4%	1.0%	40%	8 / 10	~3s (2–5s)	8 / 10

Model notes

Parakeet: NVIDIA Parakeet TDT 0.6B, the default local model in Superwhisper 1.4 and the one users see first. Fast and good on clean speech, but it has no ITN: numbers and dates come out as spoken words. Tested on CPU (Ryzen AI 9 HX · 32 GB RAM).
Whisper Standard: OpenAI Whisper large-v2 running locally. Best accuracy of all tested models, but hidden from the main model picker: go to Settings > Library and search "Whisper Standard". Tested on CPU (Ryzen AI 9 HX · 32 GB RAM).
S1-Voice: Superwhisper's proprietary cloud model, presented as the headline AI feature. Applies aggressive rewriting that causes large content losses on some recordings. This is the default cloud model, so users land here without changing settings.
Ultra: cloud Whisper-class model offered as the "Ultra" tier. More conservative post-processing than S1-Voice; accurate and stable across all recording types. Better than S1-Voice on every recording, but less prominently surfaced in the UI.

Metric definitions

Word accuracy: the share of words the model got right (100% − word error rate). 100% = every word correct.
WER (Word Error Rate): the % of words the model got wrong. 0% = every word correct.
CER (Character Error Rate): same as WER but measured letter-by-letter. Usually lower than WER.
PER (Punctuation Error Rate): how accurately the model placed commas, periods, and other punctuation.
Post-stop latency: seconds from pressing Stop to the final text appearing in your active app. Average across all test recordings.
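The aggregate figures for Ultra and Whisper Standard appear consistent with an unweighted mean of the six per-recording word accuracies reported in the sections that follow. The sketch below checks that arithmetic. It assumes a simple mean, which the methodology does not confirm, and the function name `aggregate` is illustrative only.

```python
from statistics import fmean

# Per-recording word accuracy (%) as published in this review:
# coding, conference, casual, long-form, numbers/ITN, noisy cafe.
PER_RECORDING_ACCURACY = {
    "Ultra": [92.6, 99.5, 100.0, 99.8, 94.9, 99.0],
    "Whisper Standard": [92.7, 99.1, 100.0, 98.9, 97.5, 100.0],
}


def aggregate(accuracies):
    """Return (mean word accuracy %, mean WER %), each rounded to one decimal.

    Assumption: every recording carries equal weight regardless of length.
    """
    accuracy = fmean(accuracies)
    return round(accuracy, 1), round(100 - accuracy, 1)


for model, scores in PER_RECORDING_ACCURACY.items():
    print(model, aggregate(scores))
```

An unweighted mean lets a short recording such as the roughly 40-word numbers test count as much as the 4-minute long-form one, which is worth keeping in mind when comparing aggregate scores.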

Superwhisper for Coding & IT

Best: Whisper Standard (local), Ultra (cloud)

Ultra Cloud

Coding 92.6% 17 err / 230w

Conference 99.5% 1 err / 220w

Coding

snake_case identifiers mostly preserved
CLI flags correct
No hallucinations

"Tauri" → "Tory"
"tokio::runtime" → "Toko runtime"

Conference

Best on accented English speaker — 0.45% WER
"Kubernetes", "PostgreSQL", "microservices" exact
Zero dropped sentences

"load balancer" → "load balance" (once)

Whisper Standard Local

Coding 92.7% 17 err / 233w

Conference 99.1% 2 err / 220w

Coding

snake_case identifiers intact
CLI flags like "--release" correct
No hallucinations

"Tauri" → "Tarry"
"axum" → "axm"

Conference

Strong on accented English
"PostgreSQL" and "Kubernetes" exact
Only 1 minor substitution across 89s

"schema": capitalization flipped once

Parakeet Local

Coding 83.3% 39 err / 234w

Conference 98.2% 4 err / 222w

Coding

Handles prose segments cleanly
No hallucinations

"cargo.toml" → "Cargo .toml"
"tokio::spawn" → "tokio spawn"
"impl Trait" → "imp Trait"

Conference

Clean transcription of accented English speaker
Technical terms like "API" and "SDK" correct

"Kubernetes" → "Cubernetes"

S1-Voice Cloud

Coding 64.3% 81 err / 230w

Conference 75.5% 54 err / 220w

Coding

Faster response than local models (~2s)

"impl AsyncRead" → "RUMP_INSTAL"
Dropped entire code block (lines 14–17)
"CloudFace" hallucinated (not in source)

Conference

Fast turnaround even on longer clip

"distributed systems" → "destructive systems"
Dropped 3 full sentences mid-recording
Non-deterministic: 37% WER run 1 vs 16% run 2 — same audio

Superwhisper for Everyday & Long-form

Best: Whisper Standard (local), Ultra (cloud)

Ultra Cloud

Casual 100.0% 0 err / 190w

Long-form 99.8% 1 err / 543w

Casual

Perfect: zero errors on casual speech
Natural disfluency handling

Long-form

Best long-form result of any engine tested — near-perfect over 4 minutes
No drift, zero hallucinations
All numbers, currency and percentages formatted correctly

One extra "and" — the only error in the whole recording


okay ,so i want to walk through what we learned this quarter about organic search versus paid acquisition .because i think the numbers genuinely changed my mind .and i want the whole team aligned before we set the budget for next year .so ,quick context .for the last three years ,we have been spending roughly 70% of our marketing budget on paid channels .google ads ,a bit of meta ,some linkedin for the enterprise segment .the remaining 30% went into content and seo .and the assumption ,honestly ,was that paid is the reliable engine and seo is a slow ,nice-to-have thing on the side .turns out that framing was wrong .so let me give you the actual numbers .on paid ,our blended cost per acquisition climbed from $41 in january to $68 by september .that is a 66% increase in 9 months and nothing about our targeting changed .the auction just got more expensive .more competitors bidding on the same keywords plus the platform raising minimum bids .meanwhile ,our organic traffic went from about 12,000 sessions a month to 41,000 . and the cost per acquisition on that channel ,if you amortize the content investment ,was around $9 .9 versus 68 .that is not a small gap .now , the honest counter-argument is timing .paid converts today .you spend $1,000 on tuesday . you get leads on tuesday .seo is a delayed engine .the articles we published in february did not really start ranking until may or june . 
so there is a real cash flow difference .and if you are a startup that needs pipeline this month ,you cannot just turn off paid and wait two quarters for organic to compound .i get that .but here is the thing that surprised me .when we looked at lead quality ,not just volume ,the organic leads had a 31% higher trial-to-paid conversion rate .the theory is that someone who finds you by searching for a specific problem is further along in intent than someone who clicks an ad in their feed .they are actively looking .so not only is organic cheaper per lead ,the leads are actually better .so what are we doing differently next year ?three things .first ,we are flipping the ratio ,moving to roughly 50-50 between paid and organic over the next two quarters ,not all at once . because we still need the near-term pipeline .second ,we are doubling the content team from two writers to four . and we are focusing on what we call bottom of funnel comparison content ,because that is where the intent and the conversion rate are highest .and third ,we are going to treat paid as an accelerant for content that is already ranking instead of a standalone channel .so when an article hits page one organically ,we put paid behind it to compress the timeline .the goal by end of next year is to get our blended cost per acquisition back under $30 and to have organic driving more than half of all qualified pipeline .right now , it is at about 22% .that is a big gap to close ,but the trajectory over the last six months tells me it is achievable .anyway ,that is the short version .we can dig into the channel level breakdown in a separate session .

Whisper Standard Local

Casual 100.0% 2 err / 190w

Long-form 98.9% 8 err / 549w

Casual

Perfect: zero word errors
Disfluencies handled cleanly

Long-form

No drift over 4 minutes — consistent quality throughout
Zero hallucinations across full recording

"LinkedIn" split into "linked in"
"$9. 9 versus" collapsed into "$9.9 versus"
"counterargument" spelled as two words


okay ,so i want to walk through what we learned this quarter about organic search versus paid acquisition because i think the numbers genuinely changed my mind and i want the whole team aligned before we set the budget for next year .so ,quick context .for the last three years ,we have been spending roughly 70% of our marketing budget on paid channels .google ads ,a bit of meta ,some linked in for the enterprise segment .the remaining 30% went into content and seo and the assumption ,honestly ,was that paid is the reliable engine and seo is a slow ,nice to have thing on the side .turns out that framing was wrong .so ,let me give you the actual numbers .on paid ,our blended cost per acquisition climbed from $41 in january to $68 by september .that is a 66% increase in nine months and nothing about our targeting changed .the auction just got more expensive ,more competitors bidding on the same keywords ,plus the platform raising minimum bids .meanwhile ,our organic traffic went from about 12,000 sessions a month to 41,000 . and the cost per acquisition on that channel ,if you amortized the content investment ,was around $9 $9.9 versus 68 .that is not a small gap .now , the honest counter argument is timing .paid converts today .you spend $1,000 on tuesday ,you get leads on tuesday .seo is a delayed engine .the articles we published in february did not really start ranking until may or june . 
so , there is a real cash flow difference .and if you are a startup that needs pipeline this month ,you cannot just turn off paid and wait two quarters for organic to compound .i get that .but here is the thing that surprised me .when we looked at lead quality ,not just volume ,the organic leads had a 31% higher trial to paid conversion rate .the theory is that someone who finds you by searching for a specific problem is further along in intent than someone who clicks an ad in their feed .they are actively looking .so , not only is organic cheaper per lead ,the leads are actually better .so , what are we doing differently next year ?three things .first ,we are flipping the ratio ,moving to roughly 50-50 between paid and organic over the next two quarters ,not all at once . because ,ah , we still need the near term pipeline .second ,we are doubling the content team from two writers to four . and we are focusing on what we call bottom of funnel comparison content . because that is where the intent and the conversion rate are highest .and third ,we are going to treat paid as an accelerant for content that is already ranking instead of a standalone channel .so , when an article hits page one organically ,we put paid behind it to compress the timeline .the goal by end of next year is to get our blended cost per acquisition back under $30 and to have organic driving more than half of all qualified pipeline .right now , it is at about 22% .that is a big gap to close . but the trajectory over the last six months tells me it is achievable .anyway ,that is the short version .we can dig into the channel level breakdown in a separate session .

Parakeet Local

Casual 100.0% 0 err / 197w

Long-form 98.6% 8 err / 559w

Casual

Perfect: zero word errors on casual speech
Disfluencies (um, uh) preserved naturally

Long-form

No quality drift over 4 minutes — consistent throughout
Numbers and currency mostly formatted correctly

"numbers genuinely" mangled into "numbinely"
"50/50" misheard as "550"
"accelerant" → "acceleration"; dropped "trial" and "through"


okay ,so i want to walk through what we learned this quarter about organic search versus paid acquisition because i think the numbers numbinely changed my mind ,and i want the whole team aligned before we set the budget for next year .so quick context .for the last three years ,we have been spending roughly 70 % of our marketing budget on paid channels .google ads ,a bit of meta ,some linkedin for the enterprise segment .the remaining 30 % went into content and seo .and the assumption ,honestly ,was that paid is the reliable engine ,and seo is a slow nice to have thing on the side .turns out that framing was wrong .so let me give you the actual numbers .on paid ,our blended cost per acquisition climbed from 41 in january to 68 by september .that is a 66 % increase in nine months ,and nothing about our targeting changed .the auction just got more expensive .more competitors bidding on the same keywords ,plus the platform raising minimum bids .meanwhile ,our organic traffic went from about 12 , 000 sessions a month to 41 , 000 . and the cost per acquisition on that channel ,if you amortize the content investment was around $9 .9 versus 68 .that is not a small gap .now , uh the honest counterargument is timing .paid converts today .you spend a thousand dollars on tuesday ,you get leads on tuesday .seo is a delayed engine .the articles we published in february did not really start ranking until may or june . 
so there is a real cash flow difference .and if you are a startup that needs pipeline this month ,you cannot just turn off paid and wait two quarters for organic to compound .i get that .but here is the thing that surprised me .when we looked at lead quality ,not just volume ,the organic leads had a 31 % higher trial to paid conversion rate .the theory is that someone who finds you by searching for a specific problem is further along in intent than someone who clicks an ad in their feed .they are actively looking .so not only is organic cheaper per lead ,the leads are actually better .so what are we doing differently next year ?three things .first ,we are flipping the ratio ,moving to roughly 550 between paid and organic over the next two quarters ,not all at once . because we still need the near term pipeline .second ,we are doubling the content team from two writers to four . and we are focusing on what we call bottom of funnel comparison content ,because that is where the intent and the conversion rate are highest .and third ,we are going to treat paid as an acceleration for content that is already ranking instead of a standalone channel .so when an article hits page one organically ,we put paid behind it to compress the timeline .the goal by end of next year is to get our blended cost per acquisition back under 30 dollars ,and to have organic driving more than half of all qualified pipeline .right now , it is at about 22 % .that is a big gap to close . but the trajectory over the last six months tells me it is achievable .anyway ,that is the short version .we can dig into the channel level breakdown in a separate session .

S1-Voice Cloud

Casual 63.0% 71 err / 192w

Long-form 81.2% 105 err / 549w

Casual

Lost 2 of 3 sections — only opening paragraph survived
Heavy rewriting distorts meaning of what remains

Long-form

Completes the full recording without timeout

Catastrophic drift mid-recording — hallucinated a whole passage ("a range of results... the last one was a long time ago")
Dropped the entire "let me give you the actual numbers" section
Worst long-form result of any engine tested — 18.9% WER


ok so i want to walk through what we learned this quarter about organic search versus paid acquisition .because i think the numbers genuinely changed my mind and i want the whole team aligned before we set the budget for next year .so quick context :for the last three years we have been spending roughly 70 percent of our marketing budget on paid channels .google ads a bit of meta ,some linkedin for the enterprise segment the remaining 30% went into content and seo .and the assumption honestly was that paid is the reliable engine and seo is a slow nice to have thing on the side .turns out that framing was wrong let me give you the actual numbers .on paid our blended cost per acquisition climbed from 41 in january to $68 by september .that is a 66% increase in 9 months .and nothing about our targeting changed .the auction just got more expensive :more competitors bidding on the same keywords plus the platform raising minimum bids .meanwhile our organic traffic went from about 12,000 sessions a month to 41,000 . and the cost per acquisition on that channel ? if you amortize the content investment was around nine dollars 9 versus $68 that is not a small gap now the honest counter argument is timing .paid converts today : you spend a thousand dollars on tuesday ; you get leads on tuesday .seo is a delayed engine .the articles we published in february did not really start ranking until may or june . so there's a range of results that are going to be the same for you . the last one was a long time ago . the first two weeks ago ,the last two weeks of the year was a week ago . and then the last one came out with the first two months ago . there's a real cash flow difference .and if you are a startup that needs pipeline this month ,you cannot just turn off paid and wait two quarters for organic to compound .i get that ! 
but here is the thing that surprised me : when we looked at lead quality not just volume ,the organic leads had a 31% higher trial-to-paid conversion rate .the theory is that someone who finds you by searching for a specific problem is further along in intent than someone who clicks an ad in their feed .they are actively looking , so not only is organic cheaper per lead the leads are actually better .so what are we doing differently next year ?three things : first we are flipping the ratio moving to roughly 50/50 between paid and organic over the next two quarters not all at once because ah still need the near-term pipeline ! second we are doubling the content team from two writers to four . and we are focusing on what we call bottom of funnel comparison content because that is where the intent and the conversion rate are highest .and third we are going to treat paid as an accelerant for content that is already ranking instead of a standalone channel so when an article hits page one or we put paid behind it to compress the timeline .the goal by end of next year is to get our blended cost per acquisition back under $30 and to have organic driving more than half of all qualified pipeline .right now it is at about 22% .that is a big gap to close ,but the trajectory over the last six months tells me it is achievable .anyway that is the short version .we can dig into the channel level breakdown in a separate session .

Superwhisper for Numbers & Structured Data

Best: Whisper Standard (local), Ultra (cloud)

Whisper Standard Local

Numbers/ITN 97.5% 3 err / 40w

Numbers/ITN

Dates and currency nearly exact
"$12,400.75" and phone number correct

"March 15th, 2026" → "March 15, 2026" (minor format)

Ultra Cloud

Numbers/ITN 94.9% 2 err / 39w

Numbers/ITN

Phone number and date format correct

"$12,400.75" → "$12400.75" (comma dropped)
"Order ID" label partially dropped

S1-Voice Cloud

Numbers/ITN 92.7% 3 err / 41w

Numbers/ITN

"$12,400.75" exact
Phone number format correct

"March 15th, 2026" → "March 15, 2026"
"ABC-123456" → "ABC 123456" (hyphen dropped)

Parakeet Local

Numbers/ITN 14.0% 37 err / 43w

Numbers/ITN

No ITN — numbers output as spoken words throughout
"$12,400.75" → "twelve thousand four hundred dollars and seventy five cents"
Phone number and order ID completely garbled
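To make the ITN gap concrete, here is a toy sketch of the kind of conversion an ITN step performs on the spoken-form currency phrase above. It is a deliberately tiny rule set written for illustration; it is not how Superwhisper, Whisper, or Parakeet handle numbers, and the names `words_to_int` and `spoken_dollars_to_digits` are hypothetical.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_int(words):
    """Convert English number words (e.g. ['twelve', 'thousand']) to an int."""
    total = 0    # value of completed thousand/million groups
    current = 0  # value of the group being read
    for word in words:
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            total += current * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + current


def spoken_dollars_to_digits(phrase):
    """Turn '<number> dollars [and <number> cents]' into '$1,234.56' form."""
    words = phrase.lower().replace("-", " ").split()
    split_at = words.index("dollars")
    dollars = words_to_int(words[:split_at])
    cent_words = [w for w in words[split_at + 1:] if w not in ("and", "cents")]
    cents = words_to_int(cent_words) if cent_words else 0
    return f"${dollars:,}.{cents:02d}"


print(spoken_dollars_to_digits(
    "twelve thousand four hundred dollars and seventy five cents"
))  # $12,400.75
```

Production ITN also has to cover dates, phone numbers, ordinals, and identifiers, which is why a model without it scores so poorly on this recording.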

Superwhisper: Noise Resistance

Best: Parakeet (local), Ultra (cloud)

Parakeet Local

Noisy Cafe 100.0% 0 err / 195w

Noisy Cafe

Noise has zero effect — identical output to clean version
Café background noise at 5 dB SNR had no detectable effect on the output

Whisper Standard Local

Noisy Cafe 100.0% 0 err / 190w

Noisy Cafe

Noise has zero effect — identical output to clean version

Ultra Cloud

Noisy Cafe 99.0% 4 err / 190w

Noisy Cafe

Near-perfect under café noise — only 1 minor substitution

"in-between" split once

S1-Voice Cloud

Noisy Cafe 95.2% 7 err / 188w

Noisy Cafe

Scores higher under café noise than on the clean casual recording; the rewriting helps here

"in-between" → "in between"
Some filler words not stripped

Tested on Windows 11 26H2 · AMD Ryzen AI 9 HX 370 · 32 GB RAM · NVIDIA RTX 5070 Laptop 8 GB

Superwhisper UX & Integration

Getting started & flow

Onboarding flow

Reached first successful dictation in about a minute — nothing superfluous.

5 / 5

Hotkey customization

Default shortcut is comfortable and remappable, no system conflicts — but the push-to-talk option does not actually work.

3 / 5

Error messages

Shows a center-screen message when the trial runs out, but there is no fallback — and settings navigation is scattered across sections.

3 / 5

Recording experience

Recording overlay UX

Clear recording pill / overlay — recording state is obvious.

5 / 5

Stop / cancel UX

Easy to cancel a bad dictation; cancel hotkey included.

5 / 5

Text insertion reliability

Pastes reliably into every app tested.

5 / 5

Auto-insert vs clipboard

Auto-inserts the text and can restore your previous clipboard afterwards.

5 / 5

Managing your work

Recording history

Browsable history with search; you can open a recording to see its mode, duration and even the prompt used. No export.

4 / 5

Mode / model switching

Fast switching by hotkey and from the pill UI.

5 / 5

Idle resource use

~160 MB RAM · 0.3% CPU at rest (cloud).

2 / 5

Superwhisper Features

Text processing

AI post-processing

Cloud AI modes rewrite text — many models, BYOK for several providers. But S1-Voice over-rewrites and drops content.

Custom vocabulary / dictionary

Per-word replacements applied at transcription.

Text snippets / expansion

Bundled into the custom-dictionary feature, not a separate snippets UI — and also doable via LLM post-processing instructions.

Output & extras

File transcription

Hidden behind the tray icon. Broken on LLM modes (returns a stale buffer); only works on the Voice mode, and the UX is so confusing it barely counts.

Music auto-mute

Pause, lower, or fully mute media while recording.

Translation mode

No built-in translation mode.

Ask / Q&A mode

No Ask / Q&A LLM mode.

Export (txt / srt / json)

No txt / srt / json export, and history cannot be bulk-exported.

Voice commands

Local recognition

Offline / local inference

Genuine offline mode — no audio leaves device in local configuration.

Multiple model options

Parakeet, Whisper Standard (hidden), S1-Voice, Ultra — but best local model is buried.

Superwhisper Privacy

Superwhisper keeps audio on-device when using local models. Cloud models upload audio to modal.com only after you press Stop.

Audio uploaded in cloud mode only

Endpoints: modal.com, api.superwhisper.com

Audio sent only after you press Stop

Nothing is uploaded until you confirm by pressing Stop. Cancel before then and the audio never leaves.

Account optional

You can use the app without an account, but some features ask you to sign in.

Sends more than audio

In cloud mode: active app name, focused element text, clipboard contents, computer name, locale, timezone

Opt out of training on your data

Your recordings are not used to train models.

Disable analytics & tracking

You can turn off product analytics and telemetry.

Turn off history storage

You can set the app to never store your transcription history.

From the privacy policy (not scored)

Privacy policy guarantees data is never used to train AI models and is not retained on Superwhisper servers — all storage is local.
States it collects no usage data and uses no cookies or tracking technologies.
Note: in our observation, cloud mode still sends app context and clipboard contents to Modal.com. That is more data than the policy implies, so cloud users should not assume "local-only".

Pricing

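Trial: 15-minute free trial. The pricing page says basic features stay free afterwards with no payment needed; in our testing the limit applied to all models, including offline ones.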

Free Trial: basic features after the trial

Voice to text in any app
Meeting recording and transcription
Unlimited use of small AI models
100+ languages, custom prompt control
15-minute trial limit (all models)

Subscription: Pro $8.49/mo · $84.99/yr (2 months free) · 40% student discount · 30-day refund

Unlimited use of cloud and local AI models
Bring your own AI API keys
Translate any language to English
Transcribe audio and video files, priority support

Lifetime: $249.99 one-time

All Pro features, one-time payment — no subscription
Activates across multiple devices
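For readers weighing the lifetime license against Pro, a quick break-even calculation helps. The sketch below uses only the list prices quoted above and ignores the student discount, taxes, and any future price changes; the helper name `periods_to_break_even` is illustrative.

```python
import math

PRO_MONTHLY = 8.49   # USD per month
PRO_YEARLY = 84.99   # USD per year
LIFETIME = 249.99    # USD, one-time


def periods_to_break_even(lifetime_price, recurring_price):
    """Smallest whole number of billing periods whose total reaches the lifetime price."""
    return math.ceil(lifetime_price / recurring_price)


months = periods_to_break_even(LIFETIME, PRO_MONTHLY)
years = periods_to_break_even(LIFETIME, PRO_YEARLY)
print(f"Monthly billing: lifetime pays off after {months} months")
print(f"Yearly billing: lifetime pays off after {years} years")
```

At these prices the lifetime option overtakes monthly billing in the 30th month and yearly billing in the third year.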

Superwhisper pricing: Free, Pro $8.49/mo, Enterprise (as of 2026-07-09)

Superwhisper on the free tier

Superwhisper has no real free tier, only a 15-minute trial, after which a paid plan is required. It is excluded from our Best free option ranking.

Methodology

Accuracy scores use WER (Word Error Rate) computed against multi-reference ground truth with {a|b} alternates for valid transcription variants (e.g. 48% and forty-eight percent are both accepted). Audio delivered via virtual cable from ElevenLabs TTS. Single test session on 2026-05-30.
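A minimal sketch of how WER against a reference with {a|b} alternates could be computed. It assumes the reference is a template string, expands every combination of alternates, and keeps the lowest WER; the actual Voice-list scoring code is not published in this review, so the function names and the choice to normalize by each expanded reference's length are assumptions. The template is assumed to be non-empty.

```python
import itertools
import re


def expand_references(template):
    """Expand '{a|b}' groups into every concrete reference, as word lists."""
    parts = re.split(r"(\{[^{}]*\})", template)
    options = [
        part[1:-1].split("|") if part.startswith("{") and part.endswith("}") else [part]
        for part in parts
    ]
    return [" ".join(choice).split() for choice in itertools.product(*options)]


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                            # deletion
                current[j - 1] + 1,                         # insertion
                previous[j - 1] + (ref_word != hyp_word),   # match or substitution
            ))
        previous = current
    return previous[-1]


def wer(template, hypothesis):
    """Lowest WER of the hypothesis over all expanded references."""
    hyp = hypothesis.lower().split()
    return min(
        edit_distance(ref, hyp) / len(ref)
        for ref in expand_references(template.lower())
    )


reference = "growth was {48%|forty-eight percent} this year"
print(wer(reference, "growth was 48% this year"))
print(wer(reference, "growth was forty-eight percent this year"))
```

Both hypotheses score zero because each matches one expanded reference exactly. Expanding every combination grows quickly with the number of alternate groups, so a real scorer over long recordings would more likely align against the alternates directly.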


Limitations of this test

TTS source, not human voice — real-world WER will be higher
Single session, no variance measurement across multiple runs
Punctuation (PER) not shown in this table — see raw data
Numbers WER may be overstated for apps that apply ITN (converting spoken numbers to digit form)
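As an illustration of what a variance measurement across runs could look like, the sketch below summarizes repeated WER results for one recording. The only repeated figures in this review are the two S1-Voice conference runs (37% and 16%), so they serve as sample input; the function name `summarize_runs` is hypothetical, and two runs are far too few for a reliable estimate.

```python
from statistics import mean, stdev


def summarize_runs(wer_percent_by_run):
    """Summarize WER (%) from repeated runs on the same audio."""
    return {
        "runs": len(wer_percent_by_run),
        "mean": mean(wer_percent_by_run),
        "stdev": stdev(wer_percent_by_run),  # sample standard deviation, needs 2+ runs
        "range": max(wer_percent_by_run) - min(wer_percent_by_run),
    }


s1_voice_conference = [37.0, 16.0]  # run 1 and run 2, same audio
print(summarize_runs(s1_voice_conference))
```

Reporting the spread alongside the mean would show at a glance which models are stable and which, like S1-Voice here, swing widely between runs.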