Highlights
- Fully local pipeline. Audio never leaves the device. Parakeet TDT v2 transcribes on the Apple Neural Engine; MLX Qwen3 1.7B cleans the output in-process.
- Push-to-talk that gets out of the way. Hold
Option+Space, talk, release. Text lands in whatever app you were focused on. - Vocabulary that actually learns. Add custom terms, import from open project files, or teach a single word from a recorded sample.
- License-keyed activation. Buy a key from the site, paste it into onboarding, you're licensed. Two devices per key with self-deactivation.
- Beta pricing. First 1,000 users get $1/month, $5/year, or $10 lifetime. Public pricing kicks in after the cohort fills.
What's new
Parakeet TDT v2 transcription on the Apple Neural Engine
Voxstr now uses FluidAudio's Parakeet TDT v2 model running on the ANE. Transcription is accurate enough for technical English out of the box — including code names, file paths, and common dev jargon — and fast enough that the bottleneck is your speaking pace, not the model. The first run downloads the model weights (~120 MB); every subsequent run loads from cache.
Why this matters: the previous prototype used a server-side transcription path with a half-second round trip even on a fast connection. Local + ANE inference removes the round trip and the privacy concern in one move.
MLX Qwen3 1.7B cleanup, in-process
The cleanup step — filler removal, punctuation, light formatting — now runs in-process via MLX with a quantized Qwen3 1.7B model. Earlier builds shelled out to Ollama, which meant a separate background daemon and noticeably more RAM. The MLX path uses about 1.2 GB of unified memory while active and unloads when idle.
If MLX cleanup is unavailable for any reason — a system that somehow lacks Apple Silicon support, an interrupted model download — Voxstr falls back to a regex-only cleanup pass so the pipeline never hard-fails. You'll see a small banner in the dashboard noting the fallback.
License key onboarding
New buyers get a license key by email after checkout. Onboarding's third step now accepts a key, validates it against the site's entitlement service, and stores it in the user keychain. Devices deactivate themselves cleanly from Settings > Account, so switching machines is a two-click operation rather than a support ticket.
Each key is good for two active devices. If you hit the limit, the in-app prompt lists your active devices with their friendly names and last-seen dates, and lets you free a slot in place. There is no separate web portal to remember and no waiting on a support round-trip.
Custom vocabulary and the teach surface
Voxstr now ships a Vocabulary tab in the dashboard with three ways to add terms. Type them directly. Drop a project folder onto the scanner and let it import names from open files. Or hit "Teach" and record yourself saying the word once — Voxstr stores the sample as an alias-weighted rescore target so the model picks your pronunciation over the default.
Custom vocabulary uses CTC keyword spotting on top of the main decoder, so terms get rescored after the first pass — meaning a new entry takes effect on the next dictation, with no model retraining.
Fixes
- CTC vocabulary boosting now strips trailing punctuation before matching, so quoted terms like
'hello,'get rescored correctly. - Hotkey tap recovers automatically after the system disables it on sleep/wake.
- Pasteboard fallback no longer leaves the clipboard with stale dictation when injection succeeds at the AX layer.
- Menu bar icon stays visible on macOS 14.x systems where the bar was previously over-trimmed.
Under the hood
The transcription and cleanup paths are now fully decoupled and cancellable. Releasing the hotkey early stops both stages within ~50 ms. We also moved the model warmup to a background queue so the first dictation after launch no longer pays the cold-start cost in the foreground.
Internal: 762 Swift Testing cases plus 31 XCTest cases gate every merge. Sentry crash reporting is wired but off by default; opt in from Settings > Diagnostics if you want to send anonymized traces while the beta firms up.
Upgrade notes
Fresh install — there's nothing to upgrade from. Future releases will land via Sparkle auto-update within 24 hours; you can force a check from the menu bar.
Get this version
Download Voxstr from the homepage. The beta cohort is capped at 1,000 users. After that, public pricing ($2/mo, $10/yr, $20 lifetime) applies — beta keys keep their original price for the life of the subscription.