Skill Quality Validation
Ensures Claude Code skills follow best practices for discoverability, structure, content quality, and effectiveness. This skill provides checklists, patterns, and validation criteria for creating high-quality skills.
When to Use This Skill
Use this skill when you see these patterns:
✅ Yes, use this skill for:
- •"Create a new skill for [topic]"
- •"Review this skill for quality"
- •"Why isn't my skill being invoked?"
- •"Improve this skill's structure"
- •"Prepare this skill for sharing"
- •"Debug skill invocation issues"
- •"Make this skill more effective"
❌ No, use different skills for:
- •Writing skill content (use topic-specific skills)
- •Testing specific functionality (use testing skills)
- •Code review (use code-review skills)
Quick Reference
Core Principles
Every skill must have:
- •✅ Specific description with trigger keywords (aim for < 100 chars; the hard limit is 1024)
- •✅ Under 500 lines (split into directory if longer)
- •✅ Concrete examples (not abstract)
- •✅ Consistent terminology
- •✅ Progressive disclosure (most important first)
Red flags:
- •❌ Vague description like "Help with Python"
- •❌ Single file over 500 lines
- •❌ Abstract guidance without examples
- •❌ Mixing terminology (e.g., "commit" and "change" without explanation)
- •❌ Time-sensitive info (e.g., "new tool just released")
Quality Checklist Workflow
When creating or reviewing a skill, copy this checklist and follow the steps:
Skill Quality Review Progress:
- [ ] Step 1: Verify description and metadata
- [ ] Step 2: Check structure and organization
- [ ] Step 3: Validate content quality
- [ ] Step 4: Review code and scripts (if applicable)
- [ ] Step 5: Test across models
- [ ] Step 6: Perform real usage testing
Step 1: Verify Description and Metadata
Check the YAML frontmatter:
- • Description includes specific trigger keywords (what users will say)
- • Description explains WHAT the skill does and WHEN to use it
- • Description is in third person ("Validates...", not "Apply...")
- • Description under 1024 characters
- • Priority is set appropriately (5-7 for most skills)
- • Name uses lowercase, hyphens, no reserved words
If checks fail: Update frontmatter before proceeding.
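Some of the Step 1 checks can be automated. The following is a minimal sketch, not an official validator: `parse_frontmatter` and `check_frontmatter` are hypothetical helpers, the parser handles only simple one-line `key: value` pairs (no YAML library), and the reserved-word list is an assumption to adjust for your platform.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # "under 1024 characters" from the Step 1 checklist
RESERVED_WORDS = ("anthropic", "claude")  # assumption: adjust to your platform's list
NAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")  # lowercase words joined by hyphens


def parse_frontmatter(text):
    """Return the key/value pairs found between the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("\"'")
    return fields


def check_frontmatter(text):
    """Return a list of human-readable problems (empty if none were found)."""
    fields = parse_frontmatter(text)
    problems = []
    description = fields.get("description", "")
    if not description:
        problems.append("description is missing or empty")
    elif len(description) >= MAX_DESCRIPTION_CHARS:
        problems.append("description must be under 1024 characters")
    name = fields.get("name", "")
    if name:  # the name field is only checked when present
        if not NAME_PATTERN.fullmatch(name):
            problems.append("name must use lowercase letters, digits, and hyphens")
        if any(word in name for word in RESERVED_WORDS):
            problems.append("name contains a reserved word")
    return problems
```

Calling `check_frontmatter(open("SKILL.md").read())` returns an empty list when none of these checks fail. Whether the description explains WHAT and WHEN still needs a human read.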
Step 2: Check Structure and Organization
Review file organization:
- • SKILL.md is under 500 lines
- • Uses directory structure if over 500 lines
- • "When to Use This Skill" section exists and is clear
- • Progressive disclosure: most important content first
- • Headers are descriptive and scannable
- • File references are one level deep maximum
If checks fail: Reorganize content or split into supporting files.
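The 500-line limit is easy to check mechanically. A small sketch, assuming the skill lives in a UTF-8 text file; `check_skill_length` is a hypothetical helper name:

```python
from pathlib import Path

MAX_SKILL_LINES = 500  # limit stated in the checklist above


def check_skill_length(skill_path):
    """Return (line_count, ok), where ok means the file is under the limit."""
    text = Path(skill_path).read_text(encoding="utf-8")
    line_count = len(text.splitlines())
    return line_count, line_count < MAX_SKILL_LINES
```

A result such as `(612, False)` means it is time to split the skill into a directory with supporting files.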
Step 3: Validate Content Quality
Review the skill content:
- • Examples are concrete and copy-pasteable
- • All code examples are runnable
- • Terminology is consistent throughout
- • No time-sensitive information (or properly isolated)
- • Workflows have clear numbered steps
- • Decision trees for complex choices
- • All placeholders are explained or replaced
If checks fail: Add missing examples or clarify instructions.
Step 4: Review Code and Scripts
If skill includes executable code:
- • Scripts solve problems (don't punt to Claude)
- • Error handling is explicit with helpful messages
- • All constants are justified (no "voodoo constants")
- • Dependencies are listed with install instructions
- • Paths use forward slashes (not backslashes)
- • Validation/feedback loops for critical operations
If checks fail: Improve error handling and documentation.
Step 5: Test Across Models
Test with all Claude models:
- • Tested with Haiku (simple case works)
- • Tested with Sonnet (moderate complexity works)
- • Tested with Opus (complex case works)
- • Skill invoked correctly in all cases
- • Responses follow skill guidance consistently
If checks fail: Adjust description or add more explicit guidance.
Step 6: Perform Real Usage Testing
Test in actual workflows:
- • Fresh start test (new project, no external docs)
- • Colleague test (someone else uses it)
- • Different project test (verify it's project-agnostic)
- • Error path test (intentionally trigger failures)
If checks fail: Update skill based on observed issues.
File Structure
For skills under 500 lines:
my-skill.md # Single file
For skills over 500 lines:
my-skill/
├── SKILL.md        # Main instructions (< 500 lines)
├── examples.md     # Detailed examples
├── reference.md    # API/command reference (optional)
└── scripts/        # Helper scripts (optional)
    └── validate.py
Key principles:
- •SKILL.md always under 500 lines
- •Related files use UPPERCASE for visibility (FORMS.md, EXAMPLES.md)
- •Scripts in subdirectory, executed not loaded as context
- •Each file has single, clear purpose
Example from real skill:
pdf/
├── SKILL.md            # Core PDF guidance
├── FORMS.md            # Form-filling specific guidance
├── examples.md         # Extended examples
└── scripts/
    ├── analyze_form.py # Utility script
    └── fill_form.py    # Form processor
Core Quality Standards
1. Description Quality
Format: Frontmatter YAML at top of SKILL.md
---
description: "Specific action + key terms + when to use"
priority: 5
---
Requirements:
- •Include key terms that trigger the skill
- •Explain both WHAT and WHEN
- •Keep under 100 characters where possible (the hard limit is 1024)
- •Use terms users naturally say
📖 See EXAMPLES.md for good/bad examples
2. Content Structure
SKILL.md must be:
- •Under 500 lines total
- •Well-organized with clear sections
- •Using progressive disclosure
- •Focused on one coherent topic
If exceeding 500 lines:
- •Split into directory structure
- •Keep core guidance in SKILL.md
- •Move detailed examples to examples.md
- •Move reference material to reference.md
- •Move scripts to scripts/ subdirectory
Progressive disclosure pattern:
# Skill Name
Brief intro (1-2 sentences)

## When to Use
Quick bullet list

## Quick Reference
Most common cases with examples

## Detailed Guidance
(Or link to examples.md)

## Advanced Patterns
(Or link to patterns.md)
3. Terminology Consistency
Rules:
- •Use consistent terms throughout all files
- •Establish vocabulary early
- •Explain synonyms when first used
- •Don't mix related terms without explanation
📖 See EXAMPLES.md for patterns
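Mixed terminology can be spotted with a simple word count. The sketch below is illustrative only: `count_terms` and `mixed_terms` are hypothetical helpers, and the synonym groups are something you supply yourself. Matching is whole-word and case-insensitive, so plurals such as "commits" are not counted.

```python
import re
from collections import Counter


def count_terms(text, terms):
    """Count whole-word, case-insensitive occurrences of each term."""
    counts = Counter()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def mixed_terms(text, synonym_groups):
    """Return the groups in which more than one synonym is actually used."""
    mixed = []
    for group in synonym_groups:
        counts = count_terms(text, group)
        used = [term for term in group if counts[term] > 0]
        if len(used) > 1:
            mixed.append(used)
    return mixed
```

A non-empty result is a prompt for review, not an automatic failure: using two terms is fine if the skill explains how they relate.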
4. Concrete Examples
Every pattern needs a real, runnable example.
Examples must:
- •Be copy-pasteable
- •Show actual code/commands
- •Include expected output
- •Demonstrate the principle
📖 See EXAMPLES.md for good/bad examples
5. File Reference Depth
Keep references one level deep:
See examples.md for detailed patterns # ✅ Good
See examples.md which references patterns.md which has code in scripts/ # ❌ Bad - too deep
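Reference depth can be checked roughly by scanning for `.md` file names. This is a sketch under stated assumptions: references are detected with a plain regular expression rather than a markdown parser, all files sit relative to the skill directory, and `referenced_files` and `deep_references` are hypothetical helper names.

```python
import re
from pathlib import Path

MD_REFERENCE = re.compile(r"\b[\w./-]+\.md\b")  # anything that looks like a .md file name


def referenced_files(text):
    """Return the set of .md file names mentioned in the text."""
    return set(MD_REFERENCE.findall(text))


def deep_references(skill_dir):
    """Map each file referenced by SKILL.md to the .md files it references in turn."""
    root = Path(skill_dir)
    main_refs = referenced_files((root / "SKILL.md").read_text(encoding="utf-8"))
    too_deep = {}
    for name in sorted(main_refs):
        target = root / name
        if not target.is_file():
            continue  # a missing file is a separate problem; skip it here
        nested = referenced_files(target.read_text(encoding="utf-8")) - {name, "SKILL.md"}
        if nested:
            too_deep[name] = sorted(nested)
    return too_deep
```

An empty dictionary means every supporting file is a leaf, which is the one-level-deep structure described above.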
6. Time-Sensitive Information
Isolate or avoid time-sensitive content:
## Current Best Practice (as of 2024)
Use ast-grep for syntax-aware searches

## Legacy Patterns
Previously, ripgrep was used...
📖 See EXAMPLES.md for deprecation patterns
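A quick scan can flag wording that tends to date quickly. The phrase list below is an assumption, illustrative and not exhaustive, and `find_time_sensitive` is a hypothetical helper name.

```python
import re

# Assumption: an illustrative, non-exhaustive list of phrases that age badly.
TIME_SENSITIVE = re.compile(
    r"\b(just released|recently|brand[- ]new|latest version|as of \d{4}|coming soon)\b",
    re.IGNORECASE,
)


def find_time_sensitive(text):
    """Return (line_number, phrase) pairs for each match, with 1-indexed lines."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in TIME_SENSITIVE.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

Hits inside a clearly dated "Current Best Practice" heading are expected. The reviewer decides whether each match is properly isolated or needs rewording.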
Code and Script Quality
Scripts Should Solve Problems
Don't punt to Claude - solve the problem in the script:
- •✅ Validate and return specific errors
- •✅ Handle edge cases explicitly
- •✅ Provide actionable error messages
- •❌ Leave TODOs for Claude to figure out
- •❌ Generic "check this" functions
Error Handling
Every error path needs helpful messages:
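try:
    subprocess.run(["jj", "--version"], check=True)  # illustrative call; needs `import subprocess, sys`
except FileNotFoundError:
    print("Error: jj not found. Install with: brew install jj")
    sys.exit(1)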
No Voodoo Constants
Justify all magic numbers:
TIMEOUT_SECONDS = 30 # API requests take 5-10s, allow 3x buffer
Package Verification
List all dependencies with install instructions:
## Dependencies
Required:
- `ast-grep` - Install: `brew install ast-grep`
  Verify: `which ast-grep`
📖 See EXAMPLES.md for detailed patterns
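A script can verify its own dependencies before doing any work. A minimal sketch using the standard library's `shutil.which`: the tool table mirrors the install commands used as examples in this document, and `missing_tools` and `report_missing` are hypothetical helper names.

```python
import shutil
import sys

# Tool name -> install hint, mirroring the examples in this document.
REQUIRED_TOOLS = {
    "ast-grep": "brew install ast-grep",
    "jj": "brew install jj",
}


def missing_tools(required, which=shutil.which):
    """Return {tool: install_hint} for every tool not found on PATH."""
    return {tool: hint for tool, hint in required.items() if which(tool) is None}


def report_missing(required=REQUIRED_TOOLS):
    """Print one actionable error per missing tool; return a process exit code."""
    missing = missing_tools(required)
    for tool, hint in missing.items():
        print(f"Error: {tool} not found. Install with: {hint}", file=sys.stderr)
    return 1 if missing else 0
```

Calling `sys.exit(report_missing())` at the top of a helper script makes it fail early with a message that names the missing tool and how to install it.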
Workflow Quality
Clear Steps
Use numbered steps with verification:
1. Create directory: `mkdir my-dir`
   Verify: `ls my-dir`
2. Create file: ...
Decision Trees
Complex workflows need decision points:
**Need X?** → Use tool A
**Need Y?** → Use tool B
**Need both?** → Use A then B
📖 See EXAMPLES.md for patterns
Testing
Every skill needs testing across:
- •Models: Haiku, Sonnet, Opus
- •Scenarios: Simple, edge case, complex
- •Real usage: New project, no external help
📖 See TESTING.md for detailed testing guidelines
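The model and scenario lists above combine into a small test matrix. The sketch below generates one checklist line per combination; `build_test_matrix` is a hypothetical helper, and the checklist format is borrowed from the review checklist earlier in this document.

```python
from itertools import product

MODELS = ("Haiku", "Sonnet", "Opus")
SCENARIOS = ("simple", "edge case", "complex")


def build_test_matrix(models=MODELS, scenarios=SCENARIOS):
    """Return one unchecked checklist line per (model, scenario) pair."""
    return [f"- [ ] {model}: {scenario}" for model, scenario in product(models, scenarios)]
```

Printing `"\n".join(build_test_matrix())` gives nine lines to tick off, which makes untested combinations easy to see.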
Troubleshooting
Common issues:
- •Skill not being invoked → Check description keywords
- •Too broad → Split into focused skills
- •Too abstract → Add concrete examples
📖 See TROUBLESHOOTING.md for complete guide
Quality Self-Check
Before considering a skill complete, copy this checklist and verify each item:
Skill Quality Verification:
- [ ] Can someone use this without follow-up questions?
- [ ] Would this work in 6 months?
- [ ] Are examples copy-pasteable and runnable?
- [ ] Can you find guidance in < 30 seconds?
- [ ] Are error messages helpful enough?
- [ ] Does the description include key trigger terms?
- [ ] Is SKILL.md under 500 lines?
- [ ] Are file references one level deep?
- [ ] Is terminology consistent throughout?
If any check fails:
- •Can't use without follow-up questions → Add more concrete examples
- •Won't work in 6 months → Isolate time-sensitive info in "Current Best Practice" sections
- •Examples not copy-pasteable → Complete all placeholders and add setup steps
- •Can't find guidance quickly → Improve headers and add table of contents
- •Error messages unclear → Add context, hints, and recovery steps
- •Description lacks triggers → Add specific terms users naturally say
- •SKILL.md too long → Split into directory with reference files
- •Deep file references → Consolidate or flatten structure
- •Inconsistent terminology → Choose one term and use everywhere
Evaluation Scenarios
Test this skill with these scenarios to ensure it works effectively:
Scenario 1: Simple Case - New Skill Creation
Input: "Help me create a new skill for managing Docker containers"
Expected behavior:
- •Skill is invoked and recognized
- •Provides description template with trigger keywords
- •Suggests file structure (single file vs directory)
- •Offers checklist for required sections
- •Reminds about concrete examples requirement
Verify:
- •Skill invocation happens automatically
- •Response includes specific checklist items
- •Guidance is actionable and clear
Scenario 2: Edge Case - Skill Not Being Invoked
Input: "My skill exists but Claude never uses it"
Expected behavior:
- •Skill is invoked and recognized
- •Diagnoses common invocation issues
- •Checks description for trigger keywords
- •Verifies file location and frontmatter format
- •Suggests testing phrases
Verify:
- •Troubleshooting steps are provided
- •Specific fixes offered for each issue
- •Testing methodology explained
Scenario 3: Complex Case - Comprehensive Skill Review
Input: "Review my python-scripts skill for quality and best practices"
Expected behavior:
- •Skill is invoked and recognized
- •Provides complete quality checklist
- •Reviews description, structure, examples, and testing
- •Identifies specific gaps or issues
- •Suggests prioritized improvements
- •References relevant sections of examples.md
Verify:
- •All quality dimensions covered
- •Specific, actionable feedback provided
- •Prioritization of issues clear
- •References to supporting documentation included
Additional Resources
- •EXAMPLES.md - Detailed good/bad examples for all principles
- •TESTING.md - Complete testing guidelines
- •TROUBLESHOOTING.md - Common issues and fixes