Video Analysis Skill
This skill analyzes video files by extracting strategic frames using FFmpeg and providing comprehensive visual feedback using Claude's vision capabilities.
When to Use This Skill
Trigger this skill when the user wants to:
- •Get feedback on video quality, aesthetics, or content
- •Review UI/UX elements in a video
- •Analyze visual consistency and narrative flow
- •Check for technical issues or artifacts
- •Get actionable suggestions for video improvement
Core Workflow
1. Parse user request → Extract video path and parameters 2. Validate prerequisites → Check FFmpeg, video file exists 3. Analyze video metadata → Get duration, fps, resolution using ffprobe 4. Calculate frame extraction strategy → Based on mode (quick/standard/detailed/custom) 5. Extract frames → Use FFmpeg to save frames as PNG images 6. Analyze frames sequentially → Use Read tool with vision for each frame 7. Aggregate results → Compile all analyses into comprehensive report 8. Output report → Structured markdown with scores and recommendations 9. Cleanup → Remove temporary frames (unless --keep-frames)
Parameters
Required
- •video_path: Path to video file (relative or absolute)
Optional
- •--mode: Sampling mode
- •
quick- 5 frames (start, 25%, 50%, 75%, end) - •
standard- 10 frames evenly distributed (default) - •
detailed- 20 frames for thorough analysis - •
custom- User-specified timestamps via --frames
- •
- •--frames: Comma-separated timestamps in seconds (e.g., "0,5,10,15")
- •--output: Output directory for extracted frames (default: temp dir)
- •--keep-frames: Keep extracted frames after analysis (default: delete)
- •--focus: Analysis focus area
- •
ui- UI/UX elements and readability - •
aesthetics- Visual design, color, composition - •
technical- Technical quality, artifacts, encoding - •
storytelling- Narrative flow and pacing - •
all- Comprehensive analysis (default)
- •
Usage Examples
# Standard analysis (10 frames) /video-analysis examples/my-video/out/video.mp4 # Quick overview (5 frames) /video-analysis examples/my-video/out/video.mp4 --mode quick # Detailed analysis (20 frames) /video-analysis examples/my-video/out/video.mp4 --mode detailed # Custom timestamps /video-analysis examples/my-video/out/video.mp4 --frames 0,5,10,15,20,25,30 # Focus on UI/UX only /video-analysis examples/my-video/out/video.mp4 --focus ui # Keep frames for manual review /video-analysis examples/my-video/out/video.mp4 --keep-frames --output analysis-output/
Implementation Instructions
Step 1: Parse Arguments and Validate
- •Extract video_path from user request
- •Parse optional parameters (mode, frames, output, keep-frames, focus)
- •Set defaults: mode=standard, focus=all, keep-frames=false
- •Convert relative path to absolute path if needed
Validation:
# Check if video file exists if [ ! -f "$video_path" ]; then echo "Error: Video file not found: $video_path" exit 1 fi # Check if FFmpeg is installed if ! command -v ffmpeg &> /dev/null; then echo "Error: FFmpeg is required but not found." echo "Install from: https://ffmpeg.org/download.html" exit 1 fi # Check if ffprobe is installed if ! command -v ffprobe &> /dev/null; then echo "Error: ffprobe is required but not found." echo "ffprobe is part of FFmpeg. Install from: https://ffmpeg.org/download.html" exit 1 fi
Step 2: Analyze Video Metadata
Use ffprobe to get video information:
# Check if file has a video stream video_stream=$(ffprobe -v error -select_streams v:0 \ -show_entries stream=codec_type \ -of default=noprint_wrappers=1:nokey=1 "$video_path" 2>/dev/null) if [ "$video_stream" != "video" ]; then echo "Error: File does not contain a video stream." echo "This appears to be an audio-only file or invalid video." echo "Supported formats: MP4, MOV, AVI, WebM, MKV" exit 1 fi # Get duration in seconds duration=$(ffprobe -v error -show_entries format=duration \ -of default=noprint_wrappers=1:nokey=1 "$video_path") if [ -z "$duration" ] || [ "$(echo "$duration <= 0" | bc -l)" -eq 1 ]; then echo "Error: Unable to read video duration." echo "The file may be corrupted or in an unsupported format." exit 1 fi # Get fps (convert fraction to decimal) fps_fraction=$(ffprobe -v error -select_streams v:0 \ -show_entries stream=r_frame_rate \ -of default=noprint_wrappers=1:nokey=1 "$video_path") # Convert fraction (e.g., "30/1") to decimal using bc fps=$(echo "scale=6; $fps_fraction" | bc -l) if [ -z "$fps" ] || [ "$(echo "$fps <= 0" | bc -l)" -eq 1 ]; then echo "Error: Unable to read video frame rate." exit 1 fi # Get resolution resolution=$(ffprobe -v error -select_streams v:0 \ -show_entries stream=width,height \ -of csv=s=x:p=0 "$video_path") # Get total frame count (use scale=0 for integer result) total_frames=$(echo "scale=0; ($duration * $fps) / 1" | bc)
Step 3: Calculate Frame Extraction Timestamps
For quick mode (5 frames):
# Calculate timestamps using bc with proper scale timestamps=( "0" "$(echo "scale=3; $duration * 0.25" | bc)" "$(echo "scale=3; $duration * 0.5" | bc)" "$(echo "scale=3; $duration * 0.75" | bc)" "$(echo "scale=3; $duration * 0.99" | bc)" )
For standard mode (10 frames):
# Calculate interval (9 intervals = 10 frames)
interval=$(echo "scale=3; $duration / 9" | bc)
# Build array of timestamps
timestamps=()
for i in {0..9}; do
timestamp=$(echo "scale=3; $i * $interval" | bc)
timestamps+=("$timestamp")
done
For detailed mode (20 frames):
# Calculate interval (19 intervals = 20 frames)
interval=$(echo "scale=3; $duration / 19" | bc)
# Build array of timestamps
timestamps=()
for i in {0..19}; do
timestamp=$(echo "scale=3; $i * $interval" | bc)
timestamps+=("$timestamp")
done
For custom mode:
# Parse comma-separated timestamps from --frames argument
IFS=',' read -ra custom_timestamps <<< "$frames_arg"
# Validate each timestamp
timestamps=()
for ts in "${custom_timestamps[@]}"; do
# Remove whitespace
ts=$(echo "$ts" | xargs)
# Validate timestamp is numeric
if ! [[ "$ts" =~ ^[0-9]+\.?[0-9]*$ ]]; then
echo "Error: Invalid timestamp '$ts' - must be numeric"
exit 1
fi
# Check timestamp is within video duration
if [ "$(echo "$ts > $duration" | bc -l)" -eq 1 ]; then
echo "Error: Timestamp ${ts}s exceeds video duration of ${duration}s"
exit 1
fi
timestamps+=("$ts")
done
# Check max frame limit (100 frames)
if [ "${#timestamps[@]}" -gt 100 ]; then
echo "Error: Too many frames requested (${#timestamps[@]})"
echo "Maximum: 100 frames. Consider using standard or detailed mode."
exit 1
fi
# Check minimum frame count
if [ "${#timestamps[@]}" -eq 0 ]; then
echo "Error: No valid timestamps provided"
echo "Example: --frames 0,5,10,15,20"
exit 1
fi
Step 4: Extract Frames Using FFmpeg
Recommended approach (individual frame extraction):
# Create temp directory with error checking
temp_dir=$(mktemp -d) || {
echo "Error: Failed to create temporary directory"
exit 1
}
# Set up cleanup trap for interrupts
cleanup() {
if [ "$keep_frames" = false ] && [ -d "$temp_dir" ]; then
rm -rf "$temp_dir"
fi
}
trap cleanup EXIT INT TERM
# Extract each frame
for i in "${!timestamps[@]}"; do
timestamp=${timestamps[$i]}
# Convert timestamp to HH:MM:SS.MS format using bc for all calculations
hours=$(echo "scale=0; $timestamp / 3600" | bc)
minutes=$(echo "scale=0; ($timestamp % 3600) / 60" | bc)
seconds=$(echo "scale=3; $timestamp - ($hours * 3600) - ($minutes * 60)" | bc)
time_formatted=$(printf '%02d:%02d:%06.3f' "$hours" "$minutes" "$seconds")
# Extract frame
ffmpeg -ss "$time_formatted" -i "$video_path" \
-vframes 1 -q:v 2 \
"$temp_dir/frame_$(printf '%03d' $i).png" \
-loglevel error
# Check if extraction succeeded
if [ ! -f "$temp_dir/frame_$(printf '%03d' $i).png" ]; then
echo "Error: Failed to extract frame at timestamp $timestamp"
exit 1
fi
done
echo "Successfully extracted ${#timestamps[@]} frames"
Step 5: Analyze Frames Sequentially
For each extracted frame:
- •
Load frame using Read tool
codeRead tool: $temp_dir/frame_XXX.png
- •
Provide context to Claude vision
- •Frame number
- •Timestamp in video (timestamp_seconds, formatted as MM:SS)
- •Total video duration for context
- •Analysis focus (from --focus parameter)
- •
Request specific feedback based on focus:
For focus=all (default):
codeAnalyze this frame from a video at timestamp MM:SS (X.Xs / Y.Ys total). Evaluate: 1. Visual Quality: Clarity, sharpness, color balance, lighting 2. UI/UX Elements: Readability, contrast, layout, usability 3. Aesthetic Coherence: Design consistency, color palette, styling 4. Composition: Scene layout, visual hierarchy, balance 5. Technical Issues: Artifacts, glitches, encoding problems Provide: - What works well in this frame - Specific suggestions for improvement - A score out of 10
For focus=ui:
codeAnalyze the UI/UX elements in this frame at timestamp MM:SS. Focus on: - Text readability and contrast - Button and interactive element visibility - Layout and spacing effectiveness - Visual hierarchy and information architecture - Accessibility considerations
For focus=aesthetics:
codeAnalyze the visual aesthetics of this frame at timestamp MM:SS. Focus on: - Color palette and harmony - Visual style consistency - Design cohesion - Typography choices - Overall visual appeal
For focus=technical:
codeAnalyze the technical quality of this frame at timestamp MM:SS. Focus on: - Image sharpness and clarity - Compression artifacts - Color banding or posterization - Motion blur appropriateness - Encoding quality issues
For focus=storytelling:
codeAnalyze the narrative elements of this frame at timestamp MM:SS. Focus on: - Visual storytelling effectiveness - Pacing and timing - Scene composition for narrative - Emotional impact - Continuity with surrounding context
- •
Store analysis result
- •Save structured feedback for aggregation
- •Track individual frame scores
- •Note recurring themes (positive and negative)
Step 6: Aggregate Results
After analyzing all frames:
- •
Compile scene-by-scene analysis
- •Organize by timestamp
- •Include individual frame scores and feedback
- •
Identify patterns and themes
- •Strengths that appear consistently
- •Issues that recur across frames
- •Overall visual consistency
- •
Calculate overall score
- •Average frame scores
- •Weight by importance (critical issues reduce score more)
- •Consider consistency across frames
- •
Generate recommendations
- •Prioritize actionable suggestions
- •Focus on highest-impact improvements
- •Group by category (visual, technical, UX, etc.)
Step 7: Output Comprehensive Report
Generate structured markdown report:
# Video Analysis Report **Video:** [file path] **Duration:** X seconds (Y frames @ Z fps) **Resolution:** WxH **Analyzed:** YYYY-MM-DD HH:MM:SS **Frames Analyzed:** N (mode: standard/quick/detailed/custom) **Analysis Focus:** all/ui/aesthetics/technical/storytelling --- ## Overall Assessment: X.X/10 [2-3 sentence high-level summary of video quality, noting key strengths and primary areas for improvement] --- ## Detailed Scene Analysis ### Frame 1 (0:00 - 0.0s) **Scene:** [Brief description of what's visible] **What Works:** - [Specific positive observation] - [Another strength] **Suggestions:** - [Specific improvement recommendation] **Score:** X/10 --- ### Frame 2 (0:XX - X.Xs) [Repeat structure] --- [Continue for all frames...] --- ## Summary ### Strengths 1. [Recurring positive pattern or consistent strength] 2. [Another overall strength] 3. [Additional positive observation] ### Areas for Improvement 1. [Key issue or weakness observed across frames] 2. [Another improvement opportunity] 3. [Additional suggestion] ### Technical Quality Breakdown - **Visual Consistency:** X/10 - [Brief note] - **Readability:** X/10 - [Brief note] - **Aesthetic Coherence:** X/10 - [Brief note] - **Technical Execution:** X/10 - [Brief note] --- ## Recommendations ### High Priority 1. [Most impactful actionable suggestion] 2. [Second most important recommendation] ### Medium Priority 1. [Helpful but less critical suggestion] 2. [Another medium priority item] ### Nice to Have 1. [Polish or enhancement suggestion] 2. [Optional improvement] --- ## Appendix **Extracted Frames:** [temp_dir path if --keep-frames, otherwise "Deleted after analysis"] **Analysis Duration:** X minutes Y seconds **Sampling Strategy:** [Description of frame selection approach] **Focus Area:** [all/ui/aesthetics/technical/storytelling]
Step 8: Cleanup
# Cleanup is handled by trap set in Step 4
# The trap ensures cleanup happens even on interrupt/error
# If --keep-frames IS set and --output specified, move frames before exit
if [ "$keep_frames" = true ] && [ -n "$output_dir" ]; then
mkdir -p "$output_dir" || {
echo "Error: Cannot create output directory: $output_dir"
exit 1
}
mv "$temp_dir"/* "$output_dir/" || {
echo "Warning: Some frames may not have been moved to $output_dir"
}
echo "Frames saved to: $output_dir"
else
echo "Temporary frames cleaned up"
fi
Error Handling
Required Error Checks
1. FFmpeg not installed:
Error: FFmpeg is required but not found. FFmpeg is needed to extract frames from video files. Install from: https://ffmpeg.org/download.html Installation quick start: - macOS: brew install ffmpeg - Ubuntu/Debian: sudo apt-get install ffmpeg - Windows: Download from ffmpeg.org
2. Video file not found:
Error: Video file not found: [path] Please check: - File path is correct - File exists at the specified location - You have read permissions for the file Tip: Use absolute paths or paths relative to current directory.
3. Audio-only file (no video stream):
Error: File does not contain a video stream. This appears to be an audio-only file or invalid video. Supported formats: MP4, MOV, AVI, WebM, MKV If this is an audio file, this tool only analyzes video content. Try: - Providing a video file instead - Converting audio to video format with visualizations - Using an audio analysis tool
4. Invalid video format or corrupted file:
Error: Unable to read video duration. The file may be corrupted or in an unsupported format. This usually indicates: - Corrupted or incomplete video file - Unsupported codec or container - File is not actually a video file Try: - Playing the video in VLC or another media player to verify it works - Re-encoding to a standard format: ffmpeg -i input.mov -c:v libx264 -c:a aac output.mp4 - Using a different video file
5. Temporary directory creation failed:
Error: Failed to create temporary directory This usually indicates: - Insufficient disk space - Permission issues with /tmp directory - System resource limitations Try: - Freeing up disk space - Checking /tmp directory permissions - Using --output to specify an alternative directory
6. Frame extraction failed:
Error: Failed to extract frame at timestamp X.XXs Possible causes: - Video file is corrupted at that timestamp - Insufficient disk space during extraction - Invalid timestamp (beyond video duration) - FFmpeg encoding error Try: - Using different timestamps with --frames - Checking video file integrity - Freeing up disk space - Re-encoding video: ffmpeg -i input.mp4 -c copy output.mp4
7. Invalid custom timestamps (non-numeric):
Error: Invalid timestamp 'abc' - must be numeric Requirements for --frames parameter: - Comma-separated numeric values: "0,10,20,30" - Timestamps in seconds - Can include decimals: "0,5.5,10,15.25,20" - Must be within video duration Example: --frames 0,5,10,15,20
8. Timestamp exceeds video duration:
Error: Timestamp 65s exceeds video duration of 45s All timestamps must be within the video duration. To fix: - Use timestamps <= 45s - Remove the invalid timestamp - Use automatic modes (quick/standard/detailed) instead Valid example for this video: --frames 0,10,20,30,40
9. Too many frames requested:
Error: Too many frames requested (150) Maximum: 100 frames. Consider using standard or detailed mode. Why the limit? - Processing 100+ frames takes hours - Storage requirements become excessive - Analysis quality plateaus beyond ~20 frames Recommended: - Quick mode (5 frames): Fast overview - Standard mode (10 frames): Balanced analysis - Detailed mode (20 frames): Thorough review - Custom mode: Target specific moments (max 100)
10. Output directory creation failed:
Error: Cannot create output directory: [path] Possible causes: - Parent directory doesn't exist - No write permissions - Invalid path characters Try: - Creating parent directories first: mkdir -p /path/to/parent - Using a different output path - Checking directory permissions
Performance Expectations
Analysis Duration (Standard Mode - 10 frames)
| Video Length | Frame Extraction | Analysis Time | Total Time |
|---|---|---|---|
| 15-30 sec | ~3-5 sec | ~15-20 min | ~15-20 min |
| 30-60 sec | ~5-10 sec | ~15-20 min | ~15-20 min |
| 1-3 min | ~5-10 sec | ~15-25 min | ~15-25 min |
| 3-5 min | ~5-15 sec | ~15-25 min | ~15-25 min |
Analysis time is primarily determined by number of frames, not video length
Frame Extraction Performance
- •Quick mode (5 frames): ~2-5 seconds
- •Standard mode (10 frames): ~5-10 seconds
- •Detailed mode (20 frames): ~10-20 seconds
Storage Requirements
Per frame: ~2-5 MB (PNG format, depends on resolution)
- •Quick mode: ~10-25 MB temporary storage
- •Standard mode: ~20-50 MB temporary storage
- •Detailed mode: ~40-100 MB temporary storage
Best Practices
When to Use Each Mode
Quick mode (5 frames):
- •Short videos (<30 seconds)
- •Initial rough assessment
- •When time is limited
- •Checking specific issue or aspect
Standard mode (10 frames) - Recommended:
- •Most use cases
- •Comprehensive analysis
- •Balanced speed/thoroughness
- •Videos 30 seconds to 5 minutes
Detailed mode (20 frames):
- •Long videos (>3 minutes)
- •Detailed quality review
- •Animation-heavy content
- •When thoroughness is critical
Custom mode:
- •Analyzing specific moments
- •Comparing before/after changes
- •Focus on known problem areas
- •Following up on specific feedback
Frame Sampling Strategy
The skill automatically distributes frames evenly across the video duration:
- •First frame: Always at 0s (opening)
- •Last frame: At 99% of duration (ending without black frames)
- •Middle frames: Evenly spaced intervals
This ensures comprehensive coverage of the entire video narrative.
Focus Area Selection
Use focus=ui when:
- •Creating UI/UX demos
- •Building tutorial videos
- •Reviewing interface designs
- •Checking text readability
Use focus=aesthetics when:
- •Reviewing visual design
- •Evaluating brand consistency
- •Checking color grading
- •Assessing overall visual appeal
Use focus=technical when:
- •Debugging encoding issues
- •Checking export quality
- •Identifying compression artifacts
- •Validating technical specifications
Use focus=storytelling when:
- •Reviewing narrative videos
- •Checking pacing and flow
- •Evaluating emotional impact
- •Assessing scene transitions
Use focus=all when:
- •Comprehensive review needed
- •First-time analysis
- •Not sure what to focus on
- •Want complete feedback
Tips for Better Results
- •
Use high-quality source videos:
- •Higher resolution = better analysis
- •Clean encoding = more accurate feedback
- •
Choose appropriate mode:
- •Don't over-sample (more frames ≠ better feedback)
- •Standard mode (10 frames) is usually optimal
- •
Keep frames for iteration:
- •Use --keep-frames when you plan to make changes
- •Compare before/after analyses
- •
Focus your analysis:
- •Use --focus to get targeted feedback
- •More specific = more actionable suggestions
- •
Analyze at key milestones:
- •First draft: Quick mode for initial feedback
- •Revision: Standard mode for comprehensive review
- •Final polish: Detailed mode for quality assurance
Known Limitations
- •No audio analysis: This skill only analyzes visual content
- •No motion analysis: Each frame analyzed independently (no motion tracking)
- •Sampling gaps: Even detailed mode only sees ~0.5-3% of total frames
- •Context limitations: Individual frames lack full context of surrounding motion
- •Processing time: Analysis of 10+ frames takes 15-30 minutes
Future Enhancements
Potential improvements for future versions:
- •Scene detection for intelligent frame selection
- •Motion analysis comparing consecutive frames
- •Audio transcription and sync analysis
- •Batch processing for multiple videos
- •Comparative analysis (before/after iterations)
- •Export to JSON/HTML formats
- •Integration with video MCP servers
Related Documentation
- •Specification: docs/context/video-analysis-skill-specification.md
- •Prototype research: docs/research/2026-02-06-video-analysis-implementation-research.md
- •AI video analysis capabilities: docs/research/2026-02-06-ai-video-analysis-capabilities.md
- •FFmpeg documentation: https://ffmpeg.org/documentation.html
Quick Reference
Minimal invocation:
/video-analysis path/to/video.mp4
Most common options:
# Quick check (5 frames) /video-analysis video.mp4 --mode quick # Standard analysis (10 frames) /video-analysis video.mp4 # Detailed review (20 frames) /video-analysis video.mp4 --mode detailed # UI focus /video-analysis video.mp4 --focus ui
Full example:
/video-analysis examples/my-project/out/final-video.mp4 \ --mode detailed \ --focus all \ --keep-frames \ --output analysis-2026-02-06/