First-Time Setup
If VoiceMode isn't working or MCP fails to connect, run:
/voicemode:install
After install, reconnect MCP: /mcp → select voicemode → "Reconnect" (or restart Claude Code).
VoiceMode
Natural voice conversations with Claude Code using speech-to-text (STT) and text-to-speech (TTS).
Note: The Python package is voice-mode (hyphen), but the CLI command is voicemode (no hyphen).
When to Use MCP vs CLI
| Task | Use | Why |
|---|---|---|
| Voice conversations | MCP voicemode:converse | Faster - server already running |
| Service start/stop | MCP voicemode:service | Works within Claude Code |
| Installation | CLI voice-mode-install | One-time setup |
| Configuration | CLI voicemode config | Edit settings directly |
| Diagnostics | CLI voicemode diag | Administrative tasks |
Usage
Use the converse MCP tool to speak to users and hear their responses:
# Speak and listen for response (most common usage)
voicemode:converse("Hello! What would you like to work on?")
# Speak without waiting (for narration while working)
voicemode:converse("Searching the codebase now...", wait_for_response=False)
For most conversations, just pass your message - defaults handle everything else.
| Parameter | Default | Description |
|---|---|---|
| message | required | Text to speak |
| wait_for_response | true | Listen after speaking |
| voice | auto | TTS voice |
For all parameters, see Converse Parameters.
Best Practices
- •Narrate without waiting - Use wait_for_response=False when announcing actions
- •One question at a time - Don't bundle multiple questions in voice mode
- •Check status first - Verify services are running before starting conversations
- •Let VoiceMode auto-select - Don't hardcode providers unless the user has a preference
- •First run is slow - Model downloads happen on first start (2-5 min); later starts skip the download
Handling Pauses and Wait Requests
When the user asks you to wait or give them time:
Short pauses (up to 60 seconds): If the user says something ending with "wait" (e.g., "hang on", "give me a sec", "wait"), VoiceMode automatically pauses for 60 seconds then resumes listening. This is built-in.
Longer pauses (2+ minutes): Use bash sleep N where N is seconds. For example, if the user says "give me 5 minutes":
sleep 300 # Wait 5 minutes
Then call converse again when the wait is over:
voicemode:converse("Five minutes is up. Ready when you are.")
Configuration: The short pause duration is configurable via VOICEMODE_WAIT_DURATION (default: 60 seconds).
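The choice between the built-in short pause and an explicit sleep can be sketched as below. This is an illustration only, not part of VoiceMode: the helper names and the phrase parsing are hypothetical, and the 60-second threshold simply mirrors the documented default of VOICEMODE_WAIT_DURATION.

```python
import re
from typing import Optional

# Assumption: mirrors the documented default of VOICEMODE_WAIT_DURATION.
SHORT_PAUSE_SECONDS = 60

# Seconds per unit for the spoken durations this sketch recognises.
_UNIT_SECONDS = {"second": 1, "sec": 1, "minute": 60, "min": 60, "hour": 3600}


def requested_pause_seconds(utterance: str) -> Optional[int]:
    """Return the pause length in seconds if the utterance names one, else None."""
    match = re.search(r"(\d+)\s*(second|sec|minute|min|hour)s?\b", utterance.lower())
    if match is None:
        return None
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def needs_explicit_sleep(utterance: str) -> bool:
    """True when the requested pause is longer than the built-in short pause."""
    seconds = requested_pause_seconds(utterance)
    return seconds is not None and seconds > SHORT_PAUSE_SECONDS


print(requested_pause_seconds("give me 5 minutes"))  # 300
print(needs_explicit_sleep("hang on"))               # False
```

When needs_explicit_sleep returns True, the value from requested_pause_seconds is the N to pass to sleep before calling converse again.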
Check Status
voicemode service status # All services
voicemode service status whisper # Specific service
Shows service status including running state, ports, and health.
Installation
# Install VoiceMode CLI and configure services
uvx voice-mode-install --yes
# Install local services (Apple Silicon recommended)
voicemode service install whisper
voicemode service install kokoro
See Getting Started for detailed steps.
Service Management
# Start/stop services
voicemode:service("whisper", "start")
voicemode:service("kokoro", "start")
# View logs for troubleshooting
voicemode:service("whisper", "logs", lines=50)
| Service | Port | Purpose |
|---|---|---|
| whisper | 2022 | Speech-to-text |
| kokoro | 8880 | Text-to-speech |
| voicemode | 8765 | HTTP/SSE server |
Actions: status, start, stop, restart, logs, enable, disable
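As a rough complement to the status action, the ports in the table can be probed directly. The sketch below assumes the services listen on 127.0.0.1, which the table does not state, and the function names are hypothetical. An open port only shows that something is accepting connections there, not that the service is healthy.

```python
import socket

# Ports taken from the service table; the localhost binding is an assumption.
SERVICE_PORTS = {"whisper": 2022, "kokoro": 8880, "voicemode": 8765}


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def listening_services(ports=SERVICE_PORTS) -> dict:
    """Map each service name to whether its port currently accepts connections."""
    return {name: port_is_open(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_up in listening_services().items():
        print(f"{name}: {'listening' if is_up else 'not reachable'}")
```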
Configuration
voicemode config list # Show all settings
voicemode config set VOICEMODE_TTS_VOICE nova # Set default voice
voicemode config edit # Edit config file
Config file: ~/.voicemode/voicemode.env
See Configuration Guide for all options.
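For scripts that need to read the current settings, a minimal sketch of parsing the config file follows. It assumes the file uses plain KEY=VALUE lines with optional quotes, # comments, and an optional export prefix; that layout is a guess from the .env extension, and read_env_file is a hypothetical helper rather than a VoiceMode API.

```python
from pathlib import Path

# Path from the section above; the KEY=VALUE layout is an assumption.
CONFIG_PATH = Path.home() / ".voicemode" / "voicemode.env"


def read_env_file(path: Path) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and matching quote characters.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        print(read_env_file(CONFIG_PATH).get("VOICEMODE_TTS_VOICE", "(not set)"))
```

For changing values, voicemode config set remains the documented route; this sketch only reads.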
DJ Mode
Background music during VoiceMode sessions with track-level control.
# Core playback
voicemode dj play /path/to/music.mp3 # Play a file or URL
voicemode dj status # What's playing
voicemode dj pause # Pause playback
voicemode dj resume # Resume playback
voicemode dj stop # Stop playback
# Navigation and volume
voicemode dj next # Skip to next chapter
voicemode dj prev # Go to previous chapter
voicemode dj volume 30 # Set volume to 30%
# Music For Programming
voicemode dj mfp list # List available episodes
voicemode dj mfp play 49 # Play episode 49
voicemode dj mfp sync # Convert CUE files to chapters
# Music library
voicemode dj find "daft punk" # Search library
voicemode dj library scan # Index ~/Audio/music
voicemode dj library stats # Show library info
# Play history and favorites
voicemode dj history # Show recent plays
voicemode dj favorite # Toggle favorite on current track
Configuration: Set VOICEMODE_DJ_VOLUME in ~/.voicemode/voicemode.env to customize startup volume (default: 50%).
CLI Cheat Sheet
# Service management
voicemode service status # All services
voicemode service start whisper # Start a service
voicemode service logs kokoro # View logs
# Diagnostics
voicemode deps # Check dependencies
voicemode diag info # System info
voicemode diag devices # Audio devices
# History search
voicemode history search "keyword"
voicemode history play <exchange_id>
# DJ Mode
voicemode dj play <file|url> # Start playback
voicemode dj status # What's playing
voicemode dj next/prev # Navigate chapters
voicemode dj stop # Stop playback
voicemode dj mfp play 49 # Music For Programming
Voice Handoff Between Agents
Transfer voice conversations between Claude Code agents for multi-agent workflows.
Use cases:
- •Personal assistant routing to project-specific foremen
- •Foremen delegating to workers for focused tasks
- •Returning control when work is complete
Quick Reference
# 1. Announce the transfer
voicemode:converse("Transferring you to a project agent.", wait_for_response=False)
# 2. Spawn with voice instructions (mechanism depends on your setup)
spawn_agent(path="/path", prompt="Load voicemode skill, use converse to greet user")
# 3. Go quiet - let new agent take over
Hand-back:
voicemode:converse("Transferring you back to the assistant.", wait_for_response=False)
# Stop conversing, exit or go idle
Key Principles
- •Announce transfers: Always tell the user before transferring
- •One speaker: Only one agent should use converse at a time
- •Distinct voices: Different voices make handoffs audible
- •Provide context: Tell the receiving agent why the user is being transferred
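The "one speaker" and "provide context" principles can be pictured as a small piece of bookkeeping. The sketch below is purely illustrative: VoiceFloor is a hypothetical in-process class, not something VoiceMode provides, and real agents running as separate processes would need their own coordination mechanism.

```python
class VoiceFloor:
    """Tracks which agent currently holds the voice channel (hypothetical helper)."""

    def __init__(self, holder: str):
        self.holder = holder
        self.log = []  # (from_agent, to_agent, reason) for each handoff

    def can_speak(self, agent: str) -> bool:
        """Only the current holder should call converse."""
        return agent == self.holder

    def hand_off(self, from_agent: str, to_agent: str, reason: str) -> None:
        """Transfer the channel and record why, so the receiver has context."""
        if not self.can_speak(from_agent):
            raise PermissionError(f"{from_agent} does not hold the voice channel")
        self.log.append((from_agent, to_agent, reason))
        self.holder = to_agent


floor = VoiceFloor("assistant")
floor.hand_off("assistant", "foreman", "user asked about the build")
print(floor.can_speak("assistant"))  # False
print(floor.can_speak("foreman"))    # True
```

A hand-back is the same call with the two agents swapped.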
Detailed Documentation
See Call Routing for comprehensive guides:
- •Handoff Pattern - Complete hand-off and hand-back process
- •Voice Proxy - Relay pattern for agents without voice
- •Call Routing Overview - All routing patterns
Documentation Index
| Topic | Link |
|---|---|
| Converse Parameters | All Parameters |
| Installation | Getting Started |
| Configuration | Configuration Guide |
| Claude Code Plugin | Plugin Guide |
| Whisper STT | Whisper Setup |
| Kokoro TTS | Kokoro Setup |
| Pronunciation | Pronunciation Guide |
| Troubleshooting | Troubleshooting |
| CLI Reference | CLI Docs |
| DJ Mode | Background Music |
Related Skills
- •VoiceMode Connect - Remote voice via mobile/web clients (no local STT/TTS needed)