Telegram voice-to-voice (macOS)
Requirements
- macOS Tahoe on an Apple Silicon Mac.
- yap CLI available in PATH (Speech.framework transcription).
  - Project: https://github.com/finnvoor/yap (by finnvoor)
- ffmpeg available in PATH.
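Both tools must be discoverable in PATH before anything else works, so a quick preflight check can save a confusing failure later. The following is a minimal sketch; the function name `missing_tools` is hypothetical and not part of the skill.

```python
import shutil

# Tools the requirements list says must be available in PATH.
REQUIRED_TOOLS = ("yap", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found in PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

The `which` parameter exists only so the lookup can be swapped out in tests.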
Persistent reply mode (voice vs text)
Store a small per-user preference file in the workspace:
- State file: voice_state/telegram.json
- Key: Telegram sender user id (string)
- Values:
  - "voice" (default): reply with a Telegram voice note
  - "text": reply with a single text message
- If the file does not exist or the sender id is missing: assume "voice".
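The lookup rules above could be sketched as follows. The helper names are hypothetical, and treating an unreadable file or an unrecognized value as "voice" is an assumption that goes slightly beyond what the rules state.

```python
import json
from pathlib import Path

STATE_FILE = Path("voice_state/telegram.json")
DEFAULT_MODE = "voice"
VALID_MODES = ("voice", "text")


def load_state(path=STATE_FILE):
    """Read the per-user state file; a missing file yields an empty mapping."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # Assumption: a corrupt file is treated like a missing one.
        return {}
    return data if isinstance(data, dict) else {}


def get_mode(sender_id, path=STATE_FILE):
    """Return "voice" or "text" for a sender, defaulting to "voice"."""
    mode = load_state(path).get(str(sender_id), DEFAULT_MODE)
    # Assumption: unknown values fall back to the default.
    return mode if mode in VALID_MODES else DEFAULT_MODE
```

Keys are always converted with `str()` because the state file stores the sender id as a string.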
Toggle commands
If an inbound text message is exactly:
- /audio off → set state to "text" and confirm with a short text reply.
- /audio on → set state to "voice" and confirm with a short text reply.
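One way to handle the toggle commands is shown below as a rough sketch. The function name `handle_toggle` and the confirmation wording are hypothetical; only the exact-match rule and the stored values come from the text above.

```python
import json
from pathlib import Path

# Exact command text mapped to the mode it stores.
COMMANDS = {"/audio off": "text", "/audio on": "voice"}


def handle_toggle(text, sender_id, path="voice_state/telegram.json"):
    """Apply a toggle command.

    Returns a short confirmation string, or None when the text is not
    exactly one of the toggle commands.
    """
    mode = COMMANDS.get(text)  # exact match only, no trimming
    if mode is None:
        return None

    state_path = Path(path)
    try:
        state = json.loads(state_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}
    if not isinstance(state, dict):
        state = {}

    state[str(sender_id)] = mode
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")

    # Hypothetical confirmation wording.
    return "Voice replies on." if mode == "voice" else "Voice replies off."
```

A `None` return means the message should go through the normal reply path instead.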
Getting the inbound audio (.ogg)
Telegram voice notes often show up as <media:audio> in message text.
OpenClaw saves the attachment to disk (typically .ogg) under:
- ~/.openclaw/media/inbound/
Recommended approach:
- If the inbound message context includes an attachment path, use it.
- Otherwise, take the most recent *.ogg from ~/.openclaw/media/inbound/.
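The recommended approach might look like the sketch below. The function name is hypothetical, and "most recent" is interpreted as latest modification time, which is an assumption.

```python
from pathlib import Path

INBOUND_DIR = Path.home() / ".openclaw" / "media" / "inbound"


def resolve_inbound_ogg(attachment_path=None, inbound_dir=INBOUND_DIR):
    """Prefer the attachment path from the message context; otherwise
    fall back to the newest *.ogg in the inbound directory."""
    if attachment_path:
        return Path(attachment_path)

    directory = Path(inbound_dir)
    if not directory.is_dir():
        return None

    candidates = list(directory.glob("*.ogg"))
    if not candidates:
        return None
    # Assumption: "most recent" means the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The fallback can pick the wrong file if several voice notes arrive close together, which is why the attachment path is preferred whenever it is available.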
Transcription
Default locale: en-US.
Preferred:
- yap transcribe --locale "${YAP_LOCALE:-en-US}" <path.ogg>
If transcription fails or is empty: ask the user to repeat or send text.
Helper script:
- scripts/transcribe_telegram_ogg.sh [path.ogg]
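For callers that invoke yap directly rather than through the helper script, the transcription step could be wrapped roughly as follows. This sketch assumes yap writes the transcript to stdout and signals failure through a non-zero exit code; the function names are hypothetical.

```python
import os
import subprocess


def build_yap_command(ogg_path, locale=None):
    """Build the yap invocation, mirroring "${YAP_LOCALE:-en-US}"."""
    # An unset or empty YAP_LOCALE falls back to en-US.
    locale = locale or os.environ.get("YAP_LOCALE") or "en-US"
    return ["yap", "transcribe", "--locale", locale, str(ogg_path)]


def transcribe(ogg_path, locale=None, run=subprocess.run):
    """Return the transcript, or None if transcription fails or is empty."""
    try:
        result = run(
            build_yap_command(ogg_path, locale),
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # For example, yap is not installed or not executable.
        return None

    # Assumption: the transcript is written to stdout.
    text = result.stdout.strip()
    if result.returncode != 0 or not text:
        return None
    return text
```

A `None` result is the cue to ask the user to repeat the message or send text instead.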
Reply behavior
Mode: voice (default)
- Generate the reply text.
- Convert reply text to an OGG/Opus voice note using:
  - scripts/tts_telegram_voice.sh "<reply text>" [SYSTEM|VoiceName]
  The script prints the generated .ogg path to stdout.
- Send the .ogg back to Telegram as a voice note (not a generic audio file):
  - use the message tool with asVoice: true and media: <path.ogg>
  - optionally set replyTo to thread the response
Notes:
- Use SYSTEM to rely on the current macOS system voice (recommended).
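The voice-mode flow can be sketched as below. The argument names asVoice, media, and replyTo come from the steps above, but the exact shape the message tool expects is not specified here, so the dictionary is an assumption, as are the function names and the choice to read the path from the last non-empty stdout line.

```python
import subprocess

TTS_SCRIPT = "scripts/tts_telegram_voice.sh"


def synthesize_voice_note(reply_text, voice="SYSTEM", run=subprocess.run):
    """Run the TTS helper and return the .ogg path it prints to stdout."""
    result = run(
        [TTS_SCRIPT, reply_text, voice],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits non-zero
    )
    # Assumption: the path is the last non-empty line of stdout.
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("TTS script printed no output path")
    return lines[-1]


def build_voice_message(ogg_path, reply_to=None):
    """Assemble message-tool arguments for a voice note (hypothetical shape)."""
    args = {"asVoice": True, "media": ogg_path}
    if reply_to is not None:
        args["replyTo"] = reply_to  # optional threading
    return args
```

Passing the reply text as a separate list element avoids shell quoting problems with punctuation in the text.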
Mode: text
Reply with a single text message:
- Transcription: <...>
- Reply: <...>
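A small formatter for the text-mode reply might look like this. The function name is hypothetical, and placing each field on its own line without a bullet marker is an assumption about how the message should render.

```python
def format_text_reply(transcription, reply):
    """Combine transcription and reply into one text message."""
    # Assumption: one field per line, with no bullet markers.
    return f"Transcription: {transcription.strip()}\nReply: {reply.strip()}"
```

Returning a single string keeps the reply to one Telegram message, as text mode requires.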