/voice-chat

Voice-controlled Claude Code. Speak your commands/questions, Claude processes them verbatim (as if you typed them), and responds with Inworld Olivia voice.

Usage

code

/voice-chat               # Start voice-controlled mode
/voice-chat stop          # Exit voice mode

Requirements (from .claude/voice-chat-requirements.md)

R1: Voice Input → Claude Code Verbatim

Your spoken words are transcribed and passed directly to Claude Code — exactly as if you typed them.

R2: Response in Inworld Olivia Voice

Claude Code's response is spoken using Inworld TTS with Olivia voice.

Prerequisites

•

voice-bridge server running with INWORLD_API_KEY set:

bash

cd ~/Documents/GitHub/voice-bridge
VOICE_BACKEND=inworld INWORLD_API_KEY=<your-key> npm start

•
Microphone permissions granted

Instructions

Starting Voice Mode

When /voice-chat is invoked:

•

Announce start:

code

mcp__voice-mcp__voice_speak({
  text: "Voice mode active. Speak your command.",
  voice_id: "Olivia",
  speed: 1.0
})

•
Display: 🎤 **Listening...**

•

Listen for voice:

code

mcp__voice-mcp__voice_listen({
  timeout_seconds: 30,
  language: "en-GB"
})

•
Display the transcription: 👤 **You said:** "<transcribed text>"
•
CRITICAL: Process the transcribed text as if the user typed it directly.
- •Do NOT interpret it as a "voice chat question"
- •Execute it as a normal Claude Code command/request
- •Use all available tools (Read, Edit, Grep, Bash, etc.) as needed
- •Generate your normal response
•
Display speaking indicator: 🔊 **Speaking...**

•

Speak the response (or summary if long):

code

mcp__voice-mcp__voice_speak({
  text: "<response or summary>",
  voice_id: "Olivia",
  speed: 1.0,
  wait: true
})

•

Smart Output Router - Decide what to speak:

Content Type	Action	Voice Says
Short (< 100 words)	Speak fully	Full response
Medium (100-300 words)	Display + announce	"Here's what I found"
Long (> 300 words)	Display only	"Please read this"
Contains code	Display + announce	"Here's the code"
Contains table	Display + announce	"Here's a table"
Error	Speak	Brief error description

•
Loop back to step 2 for next command

Exit Conditions

•User says: "stop", "exit", "bye", "goodbye", "end voice mode"
•User types: /voice-chat stop
•Timeout with no input

Stopping Voice Mode

code

mcp__voice-mcp__voice_speak({
  text: "Voice mode ended.",
  voice_id: "Olivia",
  speed: 1.0
})

Display: 👋 **Voice mode ended.** Returning to text input.

Example Flow

code

User: /voice-chat

🎤 **Listening...**
Olivia: "Voice mode active. Speak your command."

[User speaks: "What projects am I working on?"]

👤 **You said:** "What projects am I working on?"

[Claude searches vault, finds projects]

🔊 **Speaking...**

🤖 **Claude:** You have 3 active projects: Alpha, Beta, and Gamma.

Olivia: "You have 3 active projects: Alpha, Beta, and Gamma."

🎤 **Listening...**

[User speaks: "Create a meeting note for today"]

👤 **You said:** "Create a meeting note for today"

[Claude creates meeting note using normal workflow]

🔊 **Speaking...**

🤖 **Claude:** Created Meeting - 2026-01-29 [Title].md

Olivia: "Meeting note created."

🎤 **Listening...**

[User speaks: "Stop"]

Olivia: "Voice mode ended."

👋 **Voice mode ended.** Returning to text input.

Key Principle

The transcribed text IS the user's command. Process it exactly as you would if they typed it. The only difference is:

•Input comes from voice transcription
•Output is spoken via Inworld Olivia TTS

Troubleshooting

•Male voice playing: Check INWORLD_API_KEY is set and voice-bridge restarted
•No voice at all: Check voice-bridge is running on port 4000
•TTS error: Verify API key is valid base64-encoded Inworld key