speak-response

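Give Claude's last response a voice using local Qwen3-TTS. The default voice is the Oracle (a deep, resonant Dune narrator). Use the --preset option to switch to preset speakers with controllable emotion.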

SKILL.md
---
name: speak-response
description: Vocalize Claude's last response using local Qwen3-TTS. Default voice is the Oracle (deep, resonant Dune narrator). Use --preset for emotion-controlled preset speakers.
argument-hint: [sentences|"text"|--preset mood:warm]
disable-model-invocation: true
---

Speak Response

Vocalize text using local Qwen3-TTS. The default voice is the Oracle (cloned from a Dune narrator, with a deep, resonant, prophetic quality).

Quick Examples

| Command | Effect |
| --- | --- |
| /speak | Last 2 sentences with Oracle voice |
| /speak 5 | Last 5 sentences with Oracle voice |
| /speak "The sleeper must awaken." | Specific text with Oracle voice |
| /speak --preset mood:warm | Last 2 sentences with preset speaker + emotion |
| /speak --preset "Hello" speaker:Vivian voice:"nurturing" | Preset speaker with custom voice |
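
The command forms above could be classified by a small parser along the following lines. This is a minimal sketch, not the skill's actual implementation: the function name `parse_speak_args`, its key=value output, and the rule that a bare number means a sentence count are assumptions inferred from the examples.

```bash
# Hypothetical parser for /speak arguments; prints one key=value pair per line.
parse_speak_args() {
  local mode count text speaker mood voice arg
  mode=oracle; count=2; text=""; speaker=""; mood=""; voice=""
  for arg in "$@"; do
    case "$arg" in
      --preset)     mode=preset ;;               # switch to preset speakers
      mood:*)       mood="${arg#mood:}" ;;       # e.g. mood:warm
      speaker:*)    speaker="${arg#speaker:}" ;; # e.g. speaker:Vivian
      voice:*)      voice="${arg#voice:}" ;;     # free-form voice instruction
      ''|*[!0-9]*)  text="$arg" ;;               # anything non-numeric is literal text
      *)            count="$arg" ;;              # digits only: sentence count
    esac
  done
  printf 'mode=%s\ncount=%s\ntext=%s\nspeaker=%s\nmood=%s\nvoice=%s\n' \
    "$mode" "$count" "$text" "$speaker" "$mood" "$voice"
}

parse_speak_args --preset "Hello" speaker:Vivian voice:nurturing
```

When `text` comes back empty, the caller would fall back to the last `count` sentences of the previous response.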

Default: Oracle Voice

The Oracle is a deep, resonant, prophetic voice cloned from a Dune narrator. It delivers all text with a sense of ancient wisdom and gravitas.

bash
# Default usage - Oracle voice
scripts/speak.sh "The spice must flow."
scripts/speak.sh "He who controls the spice controls the universe."

Limitation

The Oracle uses voice cloning (Base model), which does not support per-message instruction control. The voice characteristics are fixed. For emotion/mood control, use --preset.

Preset Speakers (--preset)

For emotion and mood control, use --preset to switch to CustomVoice with adjustable instructions:

bash
scripts/speak.sh --preset "<text>" [speaker] [instruction]

Quick Preset Examples

bash
# Calm therapeutic voice
scripts/speak.sh --preset "Take a deep breath." Vivian "calm, nurturing, gentle pace"

# Excited announcement
scripts/speak.sh --preset "We did it!" Ryan "joyful, excited, enthusiastic"

# Serious explanation
scripts/speak.sh --preset "This is important." Eric "serious, measured, emphatic"

Custom Voice Instructions

The model understands rich natural language descriptions:

| Aspect | Examples |
| --- | --- |
| Emotion | joyful, melancholic, anxious, calm, excited, contemplative |
| Pace | slow and deliberate, rapid and energetic, measured, hesitant |
| Intensity | soft and gentle, loud and commanding, whispered, emphatic |
| Style | warm and nurturing, professional, playful, dramatic |
| Prosody | with dramatic pauses, rising intonation, emphatic on key words |

Mood Presets (Shortcuts)

| Preset | Expands To |
| --- | --- |
| calm | "calm, soothing, gentle pace" |
| warm | "warm, empathetic, nurturing tone" |
| excited | "joyful, excited, enthusiastic" |
| serious | "serious, measured, authoritative" |
| gentle | "soft, gentle, whispered" |
| encouraging | "encouraging, uplifting, sincere" |
| contemplative | "thoughtful, slow pace, reflective" |
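
The shortcut expansion can be pictured as a plain lookup. The sketch below mirrors the table above; the function name `expand_mood` is hypothetical, and how `scripts/speak.sh` performs the expansion internally is not documented here.

```bash
# Hypothetical helper: expand "mood:NAME" (or just "NAME") into its instruction string.
expand_mood() {
  case "${1#mood:}" in
    calm)          echo "calm, soothing, gentle pace" ;;
    warm)          echo "warm, empathetic, nurturing tone" ;;
    excited)       echo "joyful, excited, enthusiastic" ;;
    serious)       echo "serious, measured, authoritative" ;;
    gentle)        echo "soft, gentle, whispered" ;;
    encouraging)   echo "encouraging, uplifting, sincere" ;;
    contemplative) echo "thoughtful, slow pace, reflective" ;;
    *)             echo "unknown mood: ${1#mood:}" >&2; return 1 ;;
  esac
}

expand_mood mood:warm   # prints: warm, empathetic, nurturing tone
```

The result can be passed as the instruction argument, for example `scripts/speak.sh --preset "Take a deep breath." Vivian "$(expand_mood mood:calm)"`.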

Speakers

| Speaker | Best For |
| --- | --- |
| Ryan (default) | Professional, serious, authoritative |
| Vivian | Warm, nurturing, therapeutic |
| Serena | Calm, gentle, contemplative |
| Dylan | Friendly, casual, playful |
| Eric | Serious, dramatic, commanding |
| Aiden | Encouraging, uplifting, energetic |
| Uncle_Fu | Wise, measured |
| Ono_Anna | Soft, gentle |
| Sohee | Clear, professional |
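
A wrapper might check the speaker name before calling the script and fall back to Ryan when none is given. This is a sketch only: `resolve_speaker` is a hypothetical helper, and whether `scripts/speak.sh` itself treats names as case-sensitive is an assumption.

```bash
# Hypothetical helper: validate a speaker name against the table, defaulting to Ryan.
resolve_speaker() {
  case "${1:-Ryan}" in
    Ryan|Vivian|Serena|Dylan|Eric|Aiden|Uncle_Fu|Ono_Anna|Sohee)
      echo "${1:-Ryan}" ;;
    *)
      echo "unknown speaker: $1" >&2; return 1 ;;
  esac
}

resolve_speaker          # prints: Ryan
resolve_speaker Vivian   # prints: Vivian
```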

Workflow

  1. Parse arguments for text and mode (default oracle vs --preset)
  2. Extract text from last response if not provided
  3. Default mode: Clone with Oracle voice
  4. Preset mode: Generate with CustomVoice + instruction
  5. Audio plays through macOS speakers
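
Step 2 needs a way to pull the last N sentences out of a response. One possible approach is sketched below, assuming sentences end in `.`, `!`, or `?` followed by whitespace. The helper name `last_sentences` is hypothetical, and the split is naive: abbreviations such as "e.g. " would be treated as sentence ends.

```bash
# Hypothetical helper: print the last N sentences of a text (default 2).
last_sentences() {
  printf '%s\n' "$1" | awk -v n="${2:-2}" '
    { text = text (NR > 1 ? " " : "") $0 }      # join input lines into one string
    END {
      count = 0
      # Cut after each run of . ! ? that is followed by whitespace.
      while (match(text, /[.!?]+[ \t]+/)) {
        s[++count] = substr(text, 1, RSTART + RLENGTH - 1)
        text = substr(text, RSTART + RLENGTH)
      }
      if (text != "") s[++count] = text          # trailing sentence
      start = count - n + 1
      if (start < 1) start = 1
      out = ""
      for (i = start; i <= count; i++) {
        sub(/[ \t]+$/, "", s[i])                 # trim trailing whitespace
        out = out (out == "" ? "" : " ") s[i]
      }
      print out
    }'
}

last_sentences "The spice must flow. He who controls the spice controls the universe. The sleeper must awaken." 2
```

The example call prints the final two sentences, which could then be handed to `scripts/speak.sh`.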

Execution

bash
# Oracle voice (default)
scripts/speak.sh "<text>"

# Preset speaker with instruction
scripts/speak.sh --preset "<text>" [speaker] [instruction]

Voice Cloning (Custom Voices)

Clone any voice from an audio sample at least 3 seconds long:

bash
# Get the transcript first (OpenAI Whisper API); the JSON response carries it in the "text" field
curl -s https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file="@reference.mp3" -F model="whisper-1"

# Clone the voice
scripts/clone.sh "<text to speak>" "<audio_file>" "<transcript>"
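
The two steps can be chained so the transcript feeds straight into `scripts/clone.sh`. The sketch below rests on assumptions: the helper names `transcribe` and `clone_from_sample` are hypothetical, and `response_format=text` is added so the transcription endpoint returns plain text rather than JSON.

```bash
# Hypothetical helper: transcribe a reference clip to plain text via the Whisper API.
transcribe() {
  curl -s https://api.openai.com/v1/audio/transcriptions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -F file="@$1" -F model="whisper-1" -F response_format="text"
}

# Hypothetical wrapper: $1 = text to speak, $2 = reference audio file.
clone_from_sample() {
  local transcript
  [ -r "$2" ] || { echo "cannot read audio file: $2" >&2; return 1; }
  transcript="$(transcribe "$2")" || return 1
  [ -n "$transcript" ] || { echo "empty transcript for: $2" >&2; return 1; }
  scripts/clone.sh "$1" "$2" "$transcript"
}
```

A call such as `clone_from_sample "The sleeper must awaken." reference.mp3` would then perform both steps. The sample-length requirement is not checked here.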

Voice Design (Create New Voices)

Design entirely new voices from natural language descriptions:

bash
scripts/design-voice.sh "<sample_text>" "<voice_description>"

# Example: Create a warm guide voice
scripts/design-voice.sh \
  "Take a deep breath and feel this moment." \
  "warm, nurturing, gentle pace, empathetic, female"

Then clone the designed voice for reuse:

bash
scripts/clone.sh "New text" designed-voice.wav "Original sample text"

See references/moods.md for more instruction examples.