Voice Command Listener
This skill allows you to control the Gemini Ecosystem with your voice.
Capabilities
1. Voice Recording
- •Uses sox (Sound eXchange) to capture audio from the default microphone.
- •Records until silence is detected or a specific key is pressed (Ctrl+C).
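The recording step could look roughly like the sketch below. It is an illustration under assumptions rather than the skill's actual implementation: the function names, the 16 kHz mono format, the 1% threshold, and the 2-second trailing-silence window are all hypothetical choices. It relies on the `rec` front end that ships with sox and on sox's `silence` effect.

```python
import subprocess


def build_rec_command(out_path, silence_secs=2.0, threshold="1%", rate=16000):
    """Build a sox `rec` command that stops after trailing silence.

    All defaults here are illustrative assumptions, not values from the skill.
    """
    return [
        "rec",
        "-r", str(rate),  # sample rate in Hz
        "-c", "1",        # mono
        out_path,
        "silence",
        "1", "0.1", threshold,              # start once sound exceeds the threshold for 0.1 s
        "1", str(silence_secs), threshold,  # stop after silence_secs below the threshold
    ]


def record(out_path="command.wav"):
    """Record from the default microphone until silence or Ctrl+C."""
    try:
        subprocess.run(build_rec_command(out_path), check=True)
    except KeyboardInterrupt:
        # Ctrl+C normally reaches sox as well, which closes the file on exit.
        pass
    return out_path
```

Calling `record("command.wav")` blocks until sox exits, so the returned path can be handed directly to the transcription step.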
2. Transcription (Whisper)
- •Sends the recorded .wav or .mp3 file to the OpenAI Whisper API.
- •Returns the transcribed text, ready to be piped into other skills.
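As a rough sketch of the transcription call, the example below posts an audio file to OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model, using only the Python standard library. The helper names `build_multipart` and `transcribe` are hypothetical, and the skill itself may well use a different client, so treat this as one possible shape of the request.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        'Content-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, model="whisper-1"):
    """Upload the audio file and return the transcribed text."""
    with open(path, "rb") as audio:
        body, content_type = build_multipart(
            {"model": model}, os.path.basename(path), audio.read()
        )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]
```

The returned string is plain text, so it can be passed to another skill as a prompt without further parsing.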
Usage
- •"Listen to my command." (Starts recording)
- •"Transcribe this audio file and execute it as a prompt."
Prerequisites
- •sox installed (brew install sox).
- •OpenAI API Key configured in knowledge/personal/voice/config.json.
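A small pre-flight check along these lines could confirm both prerequisites before recording starts. The layout of config.json is not described here, so the `openai_api_key` field name is purely an assumption, as is the `check_prerequisites` helper itself; adjust both to match the real file.

```python
import json
import shutil
from pathlib import Path

CONFIG_PATH = Path("knowledge/personal/voice/config.json")


def check_prerequisites(config_path=CONFIG_PATH, key_field="openai_api_key"):
    """Return a list of problems; an empty list means everything was found.

    `key_field` is a hypothetical field name, since the config schema is unknown.
    """
    problems = []
    if shutil.which("sox") is None:
        problems.append("sox not found on PATH (try: brew install sox)")

    path = Path(config_path)
    if not path.is_file():
        problems.append(f"missing config file: {path}")
        return problems

    try:
        config = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        problems.append(f"invalid JSON in config file: {exc}")
        return problems

    if not isinstance(config, dict) or not config.get(key_field):
        problems.append(f"no '{key_field}' value in {path}")
    return problems
```

Run from the repository root, `check_prerequisites()` returns an empty list when sox is on the PATH and the key is present.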
Knowledge Protocol
- •Adheres to knowledge/tech-stack/macos/voice_control_guide.md.