Wispr Flow Review
Cloud-only dictation with fast, polished auto-cleanup
Wispr Flow is a voice dictation app for macOS, Windows and iOS with cloud-based AI transcription. Priced at $15/month. This independent review covers WER/CER accuracy across 6 test recordings, a privacy analysis, and a UX verdict.
Wispr Flow Verdict
Fast, clean, effortless — as long as you are online and not asking about privacy
Wispr Flow version 1.5.433 scores 6.2/10 overall in VoiceTools independent testing (tested 2026-05-27). Its single model, Standard, achieves a 3.7% aggregate WER across 6 recordings.
Works well for
- Consistently fast: about 1.5s from Stop to text (1–3s range), even on noisy café audio, with no model choices to make
- Auto-cleanup genuinely works: capitalisation, punctuation and inverse text normalisation (ITN, e.g. spoken numbers written as digits) land without manual editing
- Same quality on free and paid — the free tier is not a downgraded model
Watch out for
- Cloud-only: every recording is uploaded (~22.8 MB / 2 min), and the app is unusable without a good internet connection
- No export, no built-in translation, and the text-transformation flow is obscure and hard to discover
- Onboarding is long and its final mic test is broken — you cannot complete it cleanly
Best for
- People who want polished English dictation out of the box with zero model-picking — there is even a dedicated vibe-coding mode that handles variable names
Not for
- Privacy-conscious users and anyone who works offline — there is no local mode at all
Wispr Flow Accuracy & Speed
| Model | Accuracy | Speed |
|---|---|---|
| Standard (English, cloud; the only model) | 3.7% WER · 2.4% CER · 24% PER · score 7 / 10 | ~1.5s post-stop latency (1–3s range) · score 9 / 10 |

Standard is Wispr Flow's single cloud model. Auto-cleanup (disfluency removal, capitalisation, punctuation, ITN) is always on and tuned by a Light/Medium/High slider; testing used the default Light setting. There is no model picker: one cloud model for everyone, free and paid alike.

- WER (Word Error Rate): the percentage of words the model got wrong. 0% means every word is correct.
- CER (Character Error Rate): the same as WER but measured letter by letter. Usually lower than WER.
- PER (Punctuation Error Rate): how accurately the model placed commas, periods, and other punctuation.
- Post-stop latency: seconds from pressing Stop to the final text appearing in your active app, averaged across all test recordings.
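To make the difference between WER and CER concrete, here is a minimal sketch that scores one of the misrecognitions reported in this review ("Kubernetes" heard as "Cubernetes"). The surrounding sentence is a made-up example, and the review does not describe how text was normalised before scoring, so this shows the standard edit-distance definitions and not the exact scoring pipeline used here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word errors divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character errors divided by the number of reference characters."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical sentence; only the Kubernetes/Cubernetes pair comes from the review.
reference = "we deploy on Kubernetes today"
hypothesis = "we deploy on Cubernetes today"

print(f"WER: {wer(reference, hypothesis):.1%}")  # one wrong word out of five
print(f"CER: {cer(reference, hypothesis):.1%}")  # one wrong character out of 29
```

A single wrong letter costs a whole word under WER but only one character under CER, which is why CER is usually the lower figure.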
Wispr Flow for Coding & IT (recommended model: Standard)
Coding
- Auto-cleanup: punctuation and capitalisation correct
- No hallucinations or dropped segments
- "last_seen_at" → "last scene at"
- "Tauri" → "Atari"
- "whisper.cpp" → "whisper. cpp" (dot split)
Conference
- Handles accented speaker reliably
- Zero dropped sentences
- ITN active: numbers and dates formatted
- "Kubernetes" → "Cubernetes"
- "gRPC" → "GRPC" (casing lost)
Wispr Flow for Everyday & Long-form (recommended model: Standard)
Casual
- Auto-cleanup works: punctuation and caps correct
- Natural reading — removes "um/uh" cleanly
- "re-time" → "retime" (hyphen dropped)
- Minor rewording of closing sentence
Long-form
- No drift over 3:42 — consistent quality throughout
- Zero hallucinations, zero dropped sections
- "Wispr Flow" substituted for app name once
- One sentence boundary slightly shifted
Wispr Flow for Numbers & Structured Data (recommended model: Standard)
Numbers/ITN
- Perfect ITN: "$12,400.75", "1-800-555-0123 ext. 479", "ABC-123456" all exact
- Date "March 15th, 2026 at 3:30 PM" formatted correctly
Wispr Flow: Noise Resistance (recommended model: Standard)
Noisy Cafe
- Identical output to clean version — noise has no effect
- "re-time" → "retime" (same minor artefact as clean)
Wispr Flow UX & Integration
Getting started & flow
Long onboarding whose final built-in mic test is broken — the result never shows, so you cannot finish it cleanly.
The trigger hotkey is fully customisable, and mouse buttons can be bound as triggers too.
Errors were only seen offline, and the no-internet error state is clear.
Recording experience
The recording pill is clear and well done.
Works, but the stop / cancel buttons are small.
Auto-insert works in every app tested.
Always auto-inserts — no toggle. You can add a hotkey to re-insert the last text, but there is no clipboard mode.
Managing your work
A history list exists on the home screen, but there is no search and no export.
A hotkey cycles modes, but it is never clear which mode is currently active.
Resource use: ~450 MB RAM · 0.3% CPU at rest (transcription itself runs in the cloud).
Wispr Flow Features
Text processing
Cloud LLM rewrite — but you must select already-typed text and trigger a separate "transformation" hotkey; the flow is hard to discover.
Per-word auto-replace before insertion.
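As an illustration of what per-word auto-replace can and cannot fix, here is a minimal sketch. It is not Wispr Flow's implementation; the `REPLACEMENTS` dictionary and the `auto_replace` function are hypothetical names, seeded with misrecognitions reported in this review.

```python
import re

# Hypothetical user dictionary: lower-cased misrecognition -> preferred spelling.
REPLACEMENTS = {
    "cubernetes": "Kubernetes",
    "grpc": "gRPC",
}


def auto_replace(text, replacements=REPLACEMENTS):
    """Swap each word that has a dictionary entry; leave everything else alone."""
    def swap(match):
        word = match.group(0)
        return replacements.get(word.lower(), word)

    # Match whole words only, so punctuation and spacing are preserved.
    return re.sub(r"[A-Za-z0-9_]+", swap, text)


print(auto_replace("We run GRPC services on Cubernetes."))
# prints: We run gRPC services on Kubernetes.
```

A word-level dictionary like this would not repair multi-word errors such as "last scene at" for "last_seen_at", and a blanket rule such as "Atari" to "Tauri" would also rewrite legitimate mentions of Atari.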
Output & extras
No txt / srt / json export, and history cannot be bulk-exported.
Local recognition
Cloud-only. Nothing works without internet.
A single cloud model. Nothing to pick — which is also part of the appeal.
Wispr Flow Privacy
Wispr Flow streams audio to inference.wisprflow.com on every recording; upload begins while you are still speaking, before you press Stop. Beyond audio, the app by default also collects transcripts and your edits, and Privacy Mode is locked behind the paid plan. It also sends Sentry crash data and PostHog product analytics.
Endpoints: inference.wisprflow.com, api.wisprflow.ai, sentry.io, posthog
Recording is streamed to the server while you talk — if you cancel, it has already left your device.
You can opt out of training — the toggle lives on the paid plan, so free / trial recordings may still be used.
Analytics and tracking cannot be fully disabled (e.g. Google Analytics, ad attribution).
You can set the app to never store your transcription history.
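To put the upload volume in context, the ~22.8 MB per 2 minutes figure reported earlier in this review can be converted into a sustained uplink rate. The sketch below is plain arithmetic under stated assumptions: 1 MB is taken as 1,000,000 bytes, the measurement may include protocol overhead, and the audio formats listed are hypothetical reference points for uncompressed PCM, not a claim about what the app actually sends.

```python
# Figures from the review: ~22.8 MB uploaded for a 2-minute recording.
UPLOAD_MB = 22.8
DURATION_S = 120

# Assumption: 1 MB = 1,000,000 bytes (the review does not say MB vs MiB).
observed_mbit_s = UPLOAD_MB * 8 / DURATION_S


def pcm_mbit_s(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed PCM audio in Mbit/s."""
    return sample_rate_hz * bit_depth * channels / 1_000_000


# Hypothetical formats, for comparison only.
formats = {
    "16 kHz, 16-bit, mono": pcm_mbit_s(16_000, 16, 1),
    "48 kHz, 16-bit, mono": pcm_mbit_s(48_000, 16, 1),
    "48 kHz, 16-bit, stereo": pcm_mbit_s(48_000, 16, 2),
}

print(f"Observed upload rate: {observed_mbit_s:.2f} Mbit/s")
for name, rate in formats.items():
    print(f"Uncompressed PCM at {name}: {rate:.3f} Mbit/s")
```

Under these assumptions the observed rate works out to about 1.52 Mbit/s of sustained upload while you speak, which is the practical meaning of "good internet" for this app.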
From the privacy policy (not scored)
- Data sent to third-party LLM providers is never used to train their services and is deleted after 30 days.
- Uses cookies and Google Analytics (opt-out available for analytics) and tracks ad attribution; shares data with advertising partners for tailored ads.
- Optional "Context Awareness" gathers content from your other apps; pseudonymised text/corrections are collected with consent for model improvement.
Methodology
Accuracy scores use WER (Word Error Rate) computed against multi-reference ground truth
with {a|b} alternates for valid transcription variants (e.g. 48% and
forty-eight percent are both accepted). Test audio was generated with ElevenLabs TTS
and fed to the app through a virtual audio cable. All results come from a single
test session on 2026-05-27.
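One way to score against a reference that contains {a|b} alternates is sketched below. The review only states that alternates are accepted; expanding every variant and keeping the lowest error rate is an assumption of this sketch, and `expand_alternates` and `multi_ref_wer` are hypothetical helper names. The example sentence is made up around the 48% / forty-eight percent pair mentioned above.

```python
import itertools
import re

ALT = re.compile(r"\{([^{}]*)\}")


def expand_alternates(template):
    """Turn 'x {a|b} y' into every concrete reference string."""
    parts = ALT.split(template)
    # With one capture group, odd indices hold the contents of each {...}.
    choices = [p.split("|") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


def word_errors(ref_words, hyp_words):
    """Word-level Levenshtein distance."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def multi_ref_wer(template, hypothesis):
    """Assumed rule: score against each variant and keep the lowest WER."""
    hyp_words = hypothesis.split()
    return min(
        word_errors(ref.split(), hyp_words) / len(ref.split())
        for ref in expand_alternates(template)
    )


# Hypothetical reference sentence using the alternate pair from the methodology.
template = "revenue grew {48%|forty-eight percent} this year"

print(multi_ref_wer(template, "revenue grew 48% this year"))                  # 0.0
print(multi_ref_wer(template, "revenue grew forty-eight percent this year"))  # 0.0
print(multi_ref_wer(template, "revenue grew 40% this year"))                  # 0.2
```

Both valid spellings score zero errors, while a genuinely wrong number is still penalised against whichever variant it is closest to.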