VTT Transcript Reducer
This skill converts verbose WebVTT transcript files into clean, consolidated Markdown format suitable for use as context in agentic processes.
When to Use
Use this skill when:
- •You have a WebVTT (.vtt) transcript file from a meeting recording
- •The transcript needs to be used as input for document creation or updates
- •You need a clean, readable version of the conversation without technical VTT metadata
- •The file is too large or verbose to use directly in LLM context windows
What It Does
The skill processes VTT files to:
- •Remove redundant metadata: Strips cue identifiers, eliminates unnecessary markup
- •Consolidate speakers: Merges consecutive entries from the same speaker into single blocks
- •Preserve timestamps: Includes timestamps at speaker changes for temporal reference
- •Output clean Markdown: Generates a readable format optimized for agentic processing
Input/Output
Input: WebVTT (.vtt) transcript file with speaker voice tags Output: Markdown (.md) file with consolidated speaker dialogue and timestamps
How to Use
Basic Usage
To reduce a VTT file in the same directory:
python scripts/vtt-reducer.py <path-to-file.vtt>
The reduced transcript will be saved to <filename>_reduced.md in the same directory as the input file.
Custom Output Path
To specify a custom output location:
python scripts/vtt-reducer.py <path-to-file.vtt> <output-path.md>
Example
# Reduce a meeting transcript from the workspace root python scripts/vtt-reducer.py ../../.context/transcript.vtt # Output will be saved to: ../../.context/transcript_reduced.md
Note: All paths in the examples above are relative to the skill directory. When executing from the workspace root, use the full path: python {skills-location}/vtt-reducer/scripts/vtt-reducer.py <path>.
Output Format
The reduced transcript uses this format:
# Meeting Transcript **[00:00:03] Speaker Name:** Full consolidated text from this speaker including all consecutive entries merged together... **[00:05:42] Another Speaker:** Their consolidated dialogue... **[00:08:15] Speaker Name:** Continuing the conversation...
Skill Workflow
When using this skill:
- •Execute the parser script using the command above with the VTT file path
- •Wait for completion - the script will display progress and output location
- •Use the reduced file - the
_reduced.mdfile is now ready for use in other agentic tasks - •Return the file path to the user for further processing
Important Notes
- •Do not read the VTT file content - it may be very large. Execute the script directly.
- •The script handles file reading and parsing efficiently
- •Timestamps are preserved at each speaker change to maintain temporal context
- •All VTT markup and cue identifiers are removed for clean output
- •The output is optimized for LLM consumption with minimal tokens
Error Handling
The script validates:
- •Input file exists
- •VTT format is valid
- •Output directory is writable
Common errors and solutions:
- •File not found: Check the file path is correct relative to current directory
- •Permission denied: Ensure write permissions in the output directory
- •Invalid VTT format: Verify the file is a valid WebVTT transcript with speaker voice tags
Technical Details
The parser:
- •Uses regex to extract timestamps, speakers, and text from VTT cues
- •Consolidates consecutive speaker entries to reduce redundancy
- •Preserves chronological order
- •Handles speaker name variations consistently
- •Outputs UTF-8 encoded Markdown for universal compatibility