
jarvis-voice

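A metallic AI voice persona with TTS and visual transcript styling. Responses are read aloud in a JARVIS-like robotic voice, and transcripts are displayed in purple italics.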

SKILL.md
--- frontmatter
name: jarvis-voice
version: 1.0.0
description: Metallic AI voice persona with TTS and visual transcript styling. Speak responses aloud with a JARVIS-like robotic voice and display transcripts in purple italics.
homepage: https://github.com/openclaw/openclaw
repository: https://github.com/openclaw/openclaw
metadata:
  openclaw:
    emoji: "🎙️"
    requires:
      bins: ["ffmpeg", "aplay"]
    install:
      - id: sherpa-onnx
        kind: manual
        label: "Install sherpa-onnx TTS (see docs)"

Jarvis Voice Persona

A metallic AI voice with visual transcript styling for OpenClaw assistants.

Features

  • TTS Output: Local speech synthesis via sherpa-onnx (no cloud API)
  • Metallic Voice: ffmpeg audio processing for robotic resonance
  • Purple Transcripts: Visual distinction between spoken and written content
  • Fast Playback: 2x speed for efficient communication

Requirements

  • sherpa-onnx with a VITS Piper model (en_GB-alan-medium recommended)
  • ffmpeg for audio processing
  • aplay (ALSA) for audio playback
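
Before installing, it can help to confirm that the required binaries are on your PATH. The following is a minimal sketch, not part of the skill itself; `missing_bins` is a hypothetical helper name introduced here for illustration.

```shell
# Print the names of any commands (given as arguments) that are not on PATH.
missing_bins() {
  result=""
  for bin in "$@"; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      # Append with a separating space only when result is non-empty.
      result="${result}${result:+ }${bin}"
    fi
  done
  printf '%s\n' "$result"
}

# The two binaries listed under "requires" in the frontmatter.
missing="$(missing_bins ffmpeg aplay)"
if [ -n "$missing" ]; then
  echo "Missing required tools: $missing" >&2
else
  echo "All required tools found."
fi
```

sherpa-onnx is not checked here because its install location depends on how you set it up in the Installation steps.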

Installation

1. Install sherpa-onnx TTS

bash
# Download and extract sherpa-onnx
mkdir -p ~/.openclaw/tools/sherpa-onnx-tts
cd ~/.openclaw/tools/sherpa-onnx-tts
# Follow sherpa-onnx installation guide

2. Install the jarvis script

bash
# Create the target directory if it does not exist yet
mkdir -p ~/.local/bin
cp {baseDir}/scripts/jarvis ~/.local/bin/jarvis
chmod +x ~/.local/bin/jarvis
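
The `jarvis` command is only found if `~/.local/bin` is on your PATH. As a quick check, here is a small sketch; `path_contains` is a hypothetical helper name, not something the skill ships.

```shell
# Return success if directory $1 appears as an entry in the colon-separated list $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$HOME/.local/bin" "$PATH"; then
  echo "~/.local/bin is on PATH"
else
  # Suggest the line to add to your shell profile.
  echo 'Add this to your shell profile: export PATH="$HOME/.local/bin:$PATH"'
fi
```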

3. Configure audio device

Edit ~/.local/bin/jarvis and set your audio output device in the aplay -D line.
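
To find a value for the `-D` option, you can turn the output of `aplay -l` into `plughw:X,Y` candidates. This sketch assumes the usual English `aplay -l` line format (`card N: ..., device M: ...`); `aplay_devices` is a hypothetical helper name.

```shell
# Read `aplay -l` output on stdin and print one plughw:CARD,DEVICE per hardware device.
aplay_devices() {
  sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/plughw:\1,\2/p'
}

# List candidates on this machine, if aplay is available.
# LC_ALL=C keeps the output in the untranslated format the pattern expects.
if command -v aplay >/dev/null 2>&1; then
  LC_ALL=C aplay -l 2>/dev/null | aplay_devices || true
fi
```

Pick one of the printed values and put it after `-D` in the script. Which device is the right one depends on your hardware.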

Usage

Speak text

bash
jarvis "Hello, I am your AI assistant."

In agent responses

Add to your SOUL.md:

markdown
## Communication Protocol

- **Hybrid Output:** Every response includes text + spoken audio via `jarvis` command
- **Transcript Format:** **Jarvis:** <span class="jarvis-voice">spoken text</span>
- **No gibberish:** Never spell out IDs or hashes when speaking
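
The "No gibberish" rule can also be enforced mechanically before text reaches the `jarvis` command. The sketch below is one possible approach, not part of the skill: `speakable` is a hypothetical helper, and the thresholds (UUIDs, or hex runs of 12 or more characters) are assumptions you may want to tune.

```shell
# Replace UUIDs and long hexadecimal runs with a short spoken placeholder.
speakable() {
  printf '%s\n' "$1" | sed -E \
    -e 's/[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}/an identifier/g' \
    -e 's/[0-9a-fA-F]{12,}/an identifier/g'
}

speakable "Commit 3f2a9c1b7d4e5f60 is deployed"
# To speak the cleaned text (requires the jarvis script from Installation):
# jarvis "$(speakable "Commit 3f2a9c1b7d4e5f60 is deployed")"
```

The UUID rule runs first so that the final 12-character group of a UUID is not rewritten on its own.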

Transcript styling (requires UI support)

Add to your webchat CSS:

css
.jarvis-voice {
  color: #9B59B6;
  font-style: italic;
}

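Also allow the span element, including its class attribute, through your markdown sanitizer. Otherwise the styling is stripped before it reaches the page.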
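
If transcripts are generated by a script rather than typed by the agent, the spoken text should be HTML-escaped before it is wrapped in the span. A minimal sketch, assuming the transcript format shown in the SOUL.md snippet; `transcript_line` is a hypothetical helper name.

```shell
# Escape &, < and > in the spoken text, then wrap it in the transcript markup.
transcript_line() {
  escaped="$(printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')"
  printf '**Jarvis:** <span class="jarvis-voice">%s</span>\n' "$escaped"
}

transcript_line "Systems <online> & ready"
```

The ampersand is escaped first so that the entities produced for `<` and `>` are not escaped a second time.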

Voice Customization

Edit ~/.local/bin/jarvis to adjust:

| Parameter | Effect |
| --- | --- |
| --vits-length-scale=0.5 | Speed (lower = faster) |
| aecho delays | Metallic resonance |
| chorus | Thickness/detuning |
| highpass/lowpass | Frequency range |
| treble=g=3 | Metallic sheen |
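
The ffmpeg parameters above are individual audio filters, and ffmpeg chains filters by separating them with commas in a single `-af` argument. The sketch below shows how such a chain can be assembled. It is illustrative only: the actual chain in `scripts/jarvis` may differ, `join_filters` is a hypothetical helper, the chorus values are sample values chosen for illustration, and `raw.wav` / `metallic.wav` are placeholder file names.

```shell
# Join filter stages into one comma-separated ffmpeg -af chain.
join_filters() {
  chain=""
  for stage in "$@"; do
    chain="${chain}${chain:+,}${stage}"
  done
  printf '%s\n' "$chain"
}

# Quote each stage: characters such as | are special to the shell.
FILTERS="$(join_filters \
  "highpass=f=200" \
  "lowpass=f=3000" \
  "aecho=0.4:0.4:20:0.2" \
  "chorus=0.7:0.9:55:0.4:0.25:2" \
  "treble=g=3")"
echo "$FILTERS"

# Example application (placeholder file names):
# ffmpeg -i raw.wav -af "$FILTERS" metallic.wav
```

Keeping the stages as separate strings makes it easier to swap one value at a time while tuning the voice.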

Presets

More robotic:

code
aecho=0.7:0.7:5|10|15:0.4|0.35|0.3

More human:

code
aecho=0.4:0.4:20:0.2

Deeper:

code
highpass=f=200,lowpass=f=3000

Troubleshooting

No audio output

  • Check aplay -l for available devices
  • Update the -D plughw:X,Y parameter in ~/.local/bin/jarvis, where X is the card number and Y is the device number reported by aplay -l

Voice too fast/slow

  • Adjust --vits-length-scale (0.3=very fast, 1.0=normal)
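
As a rule of thumb, the length scale behaves roughly like the inverse of the speaking rate, which matches the 0.5 setting producing the 2x playback mentioned under Features. The helper below is a sketch built on that assumption; `length_scale_for_speed` is a hypothetical name.

```shell
# Convert a desired speed multiplier into an approximate --vits-length-scale value.
length_scale_for_speed() {
  awk -v speed="$1" 'BEGIN {
    if (speed <= 0) { print "speed must be positive" > "/dev/stderr"; exit 1 }
    printf "%.2f\n", 1 / speed
  }'
}

length_scale_for_speed 2   # prints 0.50
```

Treat the result as a starting point and adjust by ear, since perceived speed also depends on the voice model.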

Metallic effect too strong

  • Lower the aecho gain and decay values (the "More human" preset is a gentler starting point) and reduce the chorus depth

Files

  • scripts/jarvis — TTS script with metallic processing
  • SKILL.md — This documentation

A voice persona for assistants who prefer to be heard as well as read.