YouTube Transcribe
When to Use This Skill
Use this skill when:
- •User requests YouTube video transcript or captions
- •User wants to extract subtitles from YouTube
- •User needs transcript content from a video
Overview
This skill guides downloading YouTube transcripts using yt-dlp and formatting them as structured HTML.
Core principle: Create HTML transcript with title, chapter headings (using video chapters if available), and paragraph content. Abort cleanly if transcript unavailable.
Output Format
Create one file: transcript.html - Structured HTML with:
- •
<h1>title from video - •
<h2>chapter headings with start timestamp only (format:MM:SS - Title) - •
<p>paragraph content under each chapter
Chapter sources (in priority order):
- •Video chapters from description (if available) - use these timestamps and titles
- •Auto-generated chapters - create 1-3 minute sections with descriptive titles
Quick Reference
| Task | Command | Notes |
|---|---|---|
| Check transcript availability | yt-dlp --list-subs URL | Run first to verify |
| Get video metadata + chapters | yt-dlp --print "%(title)s|%(chapters)s" URL | Chapters may be null |
| Download transcript | yt-dlp --write-auto-sub --sub-lang en --sub-format vtt --skip-download URL | VTT has timestamps |
| Download manual subs (fallback) | yt-dlp --write-sub --sub-lang en --sub-format vtt --skip-download URL | If auto unavailable |
Implementation Steps
Step 1: Check Transcript Availability
yt-dlp --list-subs VIDEO_URL
If no subtitles available:
- •✅ Abort with message: "This video does not have transcripts available."
- •❌ Do NOT attempt speech-to-text, scraping, or workarounds
Step 2: Get Video Metadata and Chapters
yt-dlp --print "%(title)s|%(description)s" VIDEO_URL
Check description for chapter markers:
- •Chapter format:
MM:SS TitleorHH:MM:SS Titleon separate lines - •Example:
code
0:00 Introduction 2:30 Main Topic 5:15 Conclusion
If chapters exist in description:
- •✅ Use these timestamps and titles for H2 headings
- •✅ Format:
MM:SS - Title(start timestamp and title only)
If no chapters in description:
- •Create chapters automatically (see Step 4)
Step 3: Download Transcript
# Try auto-generated first yt-dlp --write-auto-sub --sub-lang en --sub-format vtt --skip-download VIDEO_URL # If auto unavailable, try manual yt-dlp --write-sub --sub-lang en --sub-format vtt --skip-download VIDEO_URL
Step 4: Process Transcript
Parse VTT file:
- •Remove VTT headers and metadata
- •Remove inline timing tags:
<00:00:00.000>,<c>, etc. - •Extract timestamps from cue lines:
00:00:00.000 --> 00:00:00.000 - •Clean and deduplicate text
If using video chapters (from description):
- •Group transcript text by chapter timestamp ranges
- •Use chapter titles from description
- •Format H2:
MM:SS - Chapter Title
If creating auto-generated chapters:
- •Create 1-3 minute sections
- •Generate descriptive titles from content (3-8 words, max 60 chars)
- •Format H2:
MM:SS - Generated Title
Step 5: Create HTML File
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Video Title Here</title>
</head>
<body>
<h1>Video Title Here</h1>
<h2>0:00 - Introduction</h2>
<p>First paragraph of this chapter...</p>
<p>Second paragraph of this chapter...</p>
<h2>2:30 - Main Topic</h2>
<p>Content for this section...</p>
</body>
</html>
Requirements:
- •✅ UTF-8 charset declaration
- •✅
<h1>with video title - •✅
<h2>with formatMM:SS - Title - •✅ Paragraphs grouped under chapters
- •✅ Clean semantic HTML
Step 6: Clean Up
Remove temporary VTT file after processing.
Common Mistakes
Mistake 1: Ignoring Video Chapters
Wrong: Always creating auto-generated 1-3 minute chapters even when video has chapter markers in description.
Right:
- •Check description for chapter timestamps first
- •If found, use those chapters and titles
- •Only generate chapters if none exist in description
Mistake 2: Missing Structure Elements
Wrong (no chapters):
<body>
<h1>Video Title</h1>
<p>All transcript text...</p>
</body>
Right:
<body>
<h1>Video Title</h1>
<h2>0:00 - Introduction</h2>
<p>Transcript text for this chapter...</p>
<h2>2:30 - Next Topic</h2>
<p>More content...</p>
</body>
Mistake 3: Wrong Timestamp Format
Wrong:
<h2>Introduction</h2> <h2>0:00 Introduction</h2> <h2>00:00:00.000 - Introduction</h2> <h2>0:00 - 2:30 - Introduction</h2>
Right:
<h2>0:00 - Introduction</h2>
Mistake 4: Using Wrong Tool
Wrong:
webfetch https://www.youtube.com/watch?v=...
Right:
yt-dlp --write-auto-sub --sub-format vtt --skip-download URL
Mistake 5: Poor Auto-Generated Chapter Titles
Wrong (using first sentence verbatim):
<h2>0:00 - Can you break down the difference between capitalist, socialist, commu</h2>
Right (concise summary):
<h2>0:00 - Economic Systems Overview</h2>
Guidelines for auto-generated titles:
- •Summarize topic, don't copy first sentence
- •3-8 words, max 60 characters
- •Be descriptive and informative
Red Flags Checklist
If you catch yourself thinking:
- • "Should I skip checking for video chapters?" → NO. Always check description first.
- • "Should I skip the title?" → NO. Always include
<h1>with video title. - • "Should I skip chapter headings?" → NO. Always create
<h2>chapters. - • "Let me try WebFetch" → STOP. Use yt-dlp.
- • "No transcript, let me try speech-to-text" → STOP. Abort cleanly.
- • "Timestamps optional in chapter headings?" → NO. Always
MM:SS - Title. - • "Should I create markdown?" → NO. Always HTML.
Workflow Summary
1. User requests YouTube transcript
↓
2. Check transcript availability (yt-dlp --list-subs)
↓
3a. If unavailable → Abort with message
3b. If available → Continue
↓
4. Get video title and description
↓
5. Check description for chapter markers
↓
6a. Chapters found → Use those timestamps/titles
6b. No chapters → Generate 1-3 min sections
↓
7. Download transcript (VTT format)
↓
8. Process VTT: remove tags, extract timestamps, clean text
↓
9. Group content by chapters
↓
10. Create transcript.html:
- <h1> with title
- <h2> chapters with MM:SS - Title
- <p> paragraphs under each chapter
↓
11. Clean up temporary files
Complete Example
Video with chapters in description:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Understanding Economic Systems</title>
</head>
<body>
<h1>Understanding Economic Systems</h1>
<h2>0:00 - Introduction</h2>
<p>Welcome to this video on economic systems. Today we'll explore the key differences...</p>
<h2>2:30 - Capitalism Explained</h2>
<p>Capitalism is a system where capital goes to those who can generate the best returns...</p>
<h2>5:45 - Socialist Approaches</h2>
<p>In socialist systems, the government plays a larger role in economic decisions...</p>
<h2>8:20 - Conclusion</h2>
<p>Each system has tradeoffs. Understanding these helps us make informed decisions...</p>
</body>
</html>
Key features:
- •UTF-8 charset for proper character encoding
- •H1 with video title
- •H2 chapters with
MM:SS - Titleformat (from video's chapter markers) - •Paragraphs grouped logically under chapters
- •Clean, semantic HTML structure
Additional Resources
- •yt-dlp documentation: https://github.com/yt-dlp/yt-dlp
- •VTT format specification: https://www.w3.org/TR/webvtt1/