LaiCut
LattifAI's audio-text processing toolkit. Currently supports forced alignment, with translate and speaker diarization coming soon.
Requires LattifAI API Key - Get from https://lattifai.com/dashboard/api-keys
When to Use
- •Accurate/precise timing needed - When user requests accurate timestamps or precise alignment
- •Sync misaligned captions - Fix timing drift in downloaded captions
- •Align manual transcripts - Match text to speech precisely
- •Post-transcription alignment - Improve timing from auto-generated captions
- •Multi-format support - SRT, VTT, ASS, LRC, TXT, MD
When NOT to Use
- •Need full transcription (use
/omnicaptions:transcribe) - •No existing caption/transcript (nothing to align)
- •Very short clips (<5 seconds)
Setup
pip install "omni-captions-skills[laicut]"
API Key
Priority: LATTIFAI_API_KEY env → .env file → ~/.config/omnicaptions/config.json
If not set, ask user: Please enter your LattifAI API key (get from https://lattifai.com/dashboard/api-keys):
Then run with -k <key>. Key will be saved to config file automatically.
CLI Usage
# Basic alignment (default: JSON with word-level timing, RECOMMENDED) omnicaptions LaiCut audio.mp3 caption.srt # → caption_LaiCut.json # Then convert to desired format (preserves word timing in JSON for future use) omnicaptions convert caption_LaiCut.json -o caption_LaiCut.srt # Smart sentence segmentation (for word-level captions like YouTube VTT) omnicaptions LaiCut video.mp4 caption.vtt --split-sentence
| Option | Description |
|---|---|
-o, --output | Output file (default: <caption>_LaiCut.json) |
-f, --format | Output format (default: json) |
-k, --api-key | LattifAI API key |
--split-sentence | AI-powered semantic sentence segmentation |
-v, --verbose | Show progress |
JSON Output (Recommended)
JSON output preserves word-level timing for downstream tasks:
[
{
"text": "Hello world",
"start": 0.0,
"end": 2.5,
"words": [
{"word": "Hello", "start": 0.0, "end": 0.5},
{"word": "world", "start": 0.6, "end": 2.5}
]
}
]
Convert JSON to other formats:
# For playback/editing omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt # For bilingual ASS omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.ass --style bilingual # For karaoke omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke
For translation: Convert to SRT first (JSON is too large for Claude to read):
# 1. JSON → SRT omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt # 2. Claude 翻译 → video.en_LaiCut_Claude_zh.srt # 3. 转换为带颜色 ASS omnicaptions convert video.en_LaiCut_Claude_zh.srt -o video.en_LaiCut_Claude_zh_Color.ass \ --line1-color "#00FF00" --line2-color "#FFFF00"
Ask User: Enable Smart Sentence Segmentation?
When input captions are word-level or poorly segmented (e.g., YouTube VTT), ask user whether to enable --split-sentence:
Is the caption word-level or poorly segmented (e.g., YouTube VTT)? - Yes → Add --split-sentence (AI re-segments into natural sentences) - No → Keep original segmentation
Use cases: YouTube VTT, word-aligned captions, messy auto-generated captions
LattifAI API Key Error Handling
LaiCut requires LattifAI API key (NOT Gemini API key)
Error example:
API KEY verification error: API KEY is invalid or expired.
Correct handling:
- •Tell user that LaiCut requires a valid LattifAI API key
- •Ask user if they want to provide a new key
- •Direct user to https://lattifai.com/dashboard/api-keys
- •NEVER skip alignment step claiming "timing is already accurate enough"
Common Mistakes
| Mistake | Fix |
|---|---|
| API key invalid/expired | Ask user for new key, do NOT skip |
| No API key | Set LATTIFAI_API_KEY or use -k |
| Audio format error | Convert to WAV/MP3/M4A first |
| Empty output | Check caption has text content |
Related Skills
| Skill | Use When |
|---|---|
/omnicaptions:transcribe | Generate transcript first |
/omnicaptions:convert | Convert caption formats |
/omnicaptions:translate | Translate after alignment |
Workflow Examples
Important:
- •LaiCut outputs JSON by default (preserves word-level timing)
- •Convert to SRT/ASS when needed for playback or translation
- •Generate bilingual captions AFTER alignment
# No caption: transcribe → align (JSON) → convert → translate omnicaptions transcribe video.mp4 # → video_GeminiUnd.md omnicaptions LaiCut video.mp4 video_GeminiUnd.md # → video_GeminiUnd_LaiCut.json omnicaptions convert video_GeminiUnd_LaiCut.json -o video_GeminiUnd_LaiCut.srt # → video_GeminiUnd_LaiCut_Claude_zh.srt (after translate) # Has caption: download → align (JSON) → convert → translate omnicaptions download "https://youtu.be/xxx" # → xxx.en.vtt omnicaptions LaiCut xxx.mp4 xxx.en.vtt # → xxx.en_LaiCut.json omnicaptions convert xxx.en_LaiCut.json -o xxx.en_LaiCut.srt # → xxx.en_LaiCut_Claude_zh.srt (after translate) # → xxx.en_LaiCut_Claude_zh_Color.ass (after convert with --style)