  • LilySpeech, works better than Windows Speech Recognition (buggy)

  • WhisperFlow

    • $12/month, but uses AI to automatically format punctuation and more
  • Whisper.CPP Server - NOT an effective solution

    • delay was too long
    • runs locally (offline), for any Windows or WSL app that accepts keyboard input
    • Hotkey: Alt+Space — press once to start recording, press again to stop; transcription is pasted at cursor
    • Installation
      • Script location: D:\FSS\Software\Utils\whisper-hotkey\whisper-hotkey.py
      • Silent launcher: D:\FSS\Software\Utils\whisper-hotkey\whisper-hotkey-launch.vbs
      • Whisper server (runs in WSL, auto-starts): installed via uvx voice-mode — reinstall with uvx voice-mode whisper install — no backup needed
      • Windows Python dependencies (one-time): cd D:\FSS\Software\Utils\whisper-hotkey then uv sync — installs all dependencies into .venv
      • Microphone privacy check: Settings → Privacy → Microphone → ensure “Allow apps to access your microphone” is ON. After changing, restart WSL: run wsl --shutdown in PowerShell.
      • Auto-start at Windows login: A shortcut already exists in the Startup folder (whisper-hotkey.lnk) pointing to the launcher — nothing to copy. Verify: Win+R → shell:startup
    • Usage
      • Press Alt+Space to start recording (script must be running)
      • Speak naturally — punctuation is added automatically
      • Press Alt+Space again to stop — text is pasted at cursor via Ctrl+V
      • Works in: Windows Terminal, Obsidian, browsers, Word, Claude Code — any app accepting keyboard input
      • There is a short delay after stopping before text appears (transcription time)
    • Notes
      • Output uses clipboard paste (Ctrl+V), not keystroke injection — this ensures compatibility with Electron apps (Obsidian, VS Code)
      • The Whisper server runs in WSL at localhost:2022. If it’s not reachable, open WSL and run uvx voice-mode whisper start
      • The base Whisper model is used by default (fast, ~223MB RAM). For better accuracy at the cost of speed, change to small or medium in ~/.voicemode/voicemode.env, e.g. VOICEMODE_WHISPER_MODEL=small
      • Also used by VoiceMode MCP in Claude Code for two-way voice conversations (same local server)
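
The notes above say the Whisper server listens at localhost:2022 and should be restarted from WSL if it is not reachable. A minimal sketch of a reachability check follows. It assumes that a plain TCP connect is a good enough signal, and it does not call any Whisper or VoiceMode API. The function name is_server_reachable is hypothetical and is not part of whisper-hotkey.py.

```python
import socket


def is_server_reachable(host: str = "localhost", port: int = 2022,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: an open port means the server is up. This only tests the
    socket; it does not send audio or verify that transcription works.
    """
    try:
        # create_connection raises OSError (refused, timed out, unresolved host)
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_server_reachable():
        print("Something is accepting connections on localhost:2022")
    else:
        # Recovery command taken from the notes above
        print("Not reachable; in WSL, run: uvx voice-mode whisper start")
```

Running this from Windows Python before pressing the hotkey helps tell "server is down" apart from "hotkey script is not running". A successful connect only shows that something is listening on the port.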