Voice Transcription Skill

This skill enables local voice transcription using whisper.cpp for privacy-preserving speech-to-text.

When to Use This Skill

Use this skill when the user:

The transcription script now includes:

If the script detects missing installation, it will return JSON with "installation_needed": true. When you see this:

•

Offer to run installation:

code

"It looks like VoiceType isn't fully installed. Would you like me to run the installer? I can do this with: /voicetype-install"

The script automatically handles:

•✅ Checks for installation - Verifies venv, whisper binary, and scripts exist
•✅ Starts whisper server - Auto-starts from .whisper/bin/ if not running
•✅ Downloads model - First-time use downloads whisper model automatically

You don't need to manually check the server - the script does it!

•
Run the transcription script:
bash
```
source venv/bin/activate && python skills/voice/scripts/transcribe.py --duration 5
```
The script automatically:
- •✅ Checks installation (offers /voicetype-install if needed)
- •✅ Starts whisper server if not running
- •✅ Records audio from microphone for specified duration (default 5 seconds)
- •✅ Transcribes via local whisper.cpp server (localhost:2022)
- •✅ Returns JSON with transcribed text
•
Parse the output:
- •Success: {"text": "transcribed speech", "duration": 5}
- •Installation needed: {"error": "...", "installation_needed": true, "missing_components": [...], "help": [...]}
- •Transcription error: {"error": "error message", "help": [...]}
•
Handle installation_needed: If JSON contains "installation_needed": true:
- •Inform user: "VoiceType needs to be installed first."
- •Offer: "Would you like me to run the installer? Use: /voicetype-install or I can run: bash install.sh"
- •Wait for user confirmation before proceeding

User: "Let me record a voice note about the bug I'm seeing"

Assistant:

•Informs user: "I'll record for 5 seconds. Speak when ready..."
•Runs transcription script (auto-starts server if needed)
•Receives: {"text": "The submit button isn't working when I click it on the checkout page"}
•Responds: "I transcribed: 'The submit button isn't working when I click it on the checkout page.' Let me help you investigate this issue..."

User: "Record my voice"

Assistant:

•Runs transcription script
•Receives: {"error": "VoiceType is not fully installed", "installation_needed": true, "missing_components": ["Python venv", "whisper.cpp binary"]}
•Responds: "It looks like VoiceType isn't installed yet. Would you like me to run the installer? I can guide you through it with: /voicetype-install or directly run: bash install.sh"
•User confirms
•Runs /voicetype-install or bash install.sh
•After installation: "Installation complete! Now let's try voice transcription..."

The transcription script accepts optional parameters:

If transcription fails:

•

Check microphone access:

bash

python -c "import sounddevice as sd; print(sd.query_devices())"

•

Verify whisper server:

bash

systemctl --user status whisper-server
journalctl --user -u whisper-server -n 20

•

Test the script directly:

bash

cd /path/to/voicetype
source venv/bin/activate
python skills/voice/scripts/transcribe.py

All voice processing happens locally: