local-whisper

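Local speech-to-text with OpenAI Whisper. Runs fully offline once the model is downloaded, supports multiple model sizes, and delivers high-quality transcription.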

SKILL.md
---
name: local-whisper
description: Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
metadata: {"thinkfleetbot":{"emoji":"🎙️","requires":{"bins":["ffmpeg"]}}}
---

Local Whisper STT

Local speech-to-text using OpenAI's Whisper. Fully offline after initial model download.

Usage

```bash
# Basic
~/.thinkfleetbot/skills/local-whisper/scripts/local-whisper audio.wav

# Better model
~/.thinkfleetbot/skills/local-whisper/scripts/local-whisper audio.wav --model turbo

# With timestamps
~/.thinkfleetbot/skills/local-whisper/scripts/local-whisper audio.wav --timestamps --json
```
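
The same command can be looped over a directory of recordings. The sketch below is a hypothetical helper, not part of the skill: `transcribe_dir` and the `LOCAL_WHISPER` variable are names introduced here for illustration, and it assumes the script writes the transcript to standard output.

```bash
# Path to the skill's script; override LOCAL_WHISPER to point elsewhere.
LOCAL_WHISPER="${LOCAL_WHISPER:-$HOME/.thinkfleetbot/skills/local-whisper/scripts/local-whisper}"

# transcribe_dir DIR [MODEL]: write one .txt next to every .wav in DIR.
transcribe_dir() {
  local dir="$1" model="${2:-base}" f
  for f in "$dir"/*.wav; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    # Assumption: the transcript goes to stdout, so it can be redirected.
    "$LOCAL_WHISPER" "$f" --model "$model" --quiet > "${f%.wav}.txt"
  done
}
```

For example, `transcribe_dir ./recordings turbo` would leave `./recordings/NAME.txt` beside each `NAME.wav`, provided the stdout assumption holds.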

Models

| Model | Parameters | Notes |
| --- | --- | --- |
| tiny | 39M | Fastest |
| base | 74M | Default |
| small | 244M | Good balance |
| turbo | 809M | Best speed/quality |
| large-v3 | 1550M | Maximum accuracy |
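
When a script rather than a person chooses the model, the notes in the table can be turned into a small lookup. This is only a sketch: the function `model_for` and the goal names (`fastest`, `balanced`, `best`, `accurate`) are hypothetical labels made up here, while the model names are the ones listed above.

```bash
# model_for GOAL: map a rough goal to a model name from the table.
model_for() {
  case "$1" in
    fastest)  echo tiny ;;
    balanced) echo small ;;
    best)     echo turbo ;;      # best speed/quality trade-off per the table
    accurate) echo large-v3 ;;
    *)        echo base ;;       # fall back to the documented default
  esac
}
```

The result can be passed straight to `--model`, for example `--model "$(model_for balanced)"`.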

Options

  • --model/-m — Model size (default: base)
  • --language/-l — Language code (auto-detect if omitted)
  • --timestamps/-t — Include word timestamps
  • --json/-j — JSON output
  • --quiet/-q — Suppress progress
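
The options above can be combined on one command line. The following sketch shows one way a wrapper might add the optional flags only when they are requested. The function `transcribe` and the `LOCAL_WHISPER` variable are hypothetical names, and `en` in the usage note is just an example language code.

```bash
# Path to the skill's script; override LOCAL_WHISPER to point elsewhere.
LOCAL_WHISPER="${LOCAL_WHISPER:-$HOME/.thinkfleetbot/skills/local-whisper/scripts/local-whisper}"

# transcribe FILE MODEL [LANG] [yes]
#   LANG  language code; leave empty to let the tool auto-detect
#   yes   request word timestamps with JSON output
transcribe() {
  local file="$1" model="$2" lang="${3:-}" want_json="${4:-no}"
  # Rebuild the argument list, starting with the mandatory parts.
  set -- "$file" --model "$model"
  if [ -n "$lang" ]; then set -- "$@" --language "$lang"; fi
  if [ "$want_json" = "yes" ]; then set -- "$@" --timestamps --json; fi
  "$LOCAL_WHISPER" "$@"
}
```

Calling `transcribe audio.wav turbo en yes` runs the script with `--model turbo --language en --timestamps --json`, while `transcribe audio.wav base` passes only `--model base`.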

Setup

Uses uv-managed venv at .venv/. To reinstall:

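```bash
cd ~/.thinkfleetbot/skills/local-whisper
uv venv .venv --python 3.12
# Install CPU-only torch from the PyTorch index (--index-url replaces PyPI for this command)
uv pip install --python .venv/bin/python torch --index-url https://download.pytorch.org/whl/cpu
# Install the remaining packages from PyPI
uv pip install --python .venv/bin/python click openai-whisper
```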
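
Before transcribing, it can help to confirm that the two prerequisites named in this document are in place: the `ffmpeg` binary and the `.venv/` interpreter. The check below is a minimal sketch, and `check_setup` is a hypothetical helper name rather than something the skill ships.

```bash
# check_setup SKILL_DIR: list missing prerequisites; non-zero exit if any are missing.
check_setup() {
  local dir="$1" rc=0
  # ffmpeg is listed under "requires" in the skill metadata.
  command -v ffmpeg >/dev/null 2>&1 || { echo "missing: ffmpeg"; rc=1; }
  # The uv-managed virtual environment should provide an executable interpreter.
  [ -x "$dir/.venv/bin/python" ] || { echo "missing: $dir/.venv/bin/python"; rc=1; }
  return "$rc"
}
```

Running `check_setup ~/.thinkfleetbot/skills/local-whisper` prints nothing and returns 0 when both are found.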