Reviewing Skills
Overview
Review a target skill folder and report whether it follows best practices. Combine deterministic checks from scripts/check_skill_practices.py with manual review using review-checklist.
Prerequisites
- •
uvis the default runner for executing scripts. - •
python3can be used as a fallback whenuvis unavailable. - •
PyYAMLis optional;scripts/check_skill_practices.pyhas a built-in fallback parser when the package is not installed.
Inputs
Collect these inputs before starting:
- •Target skill directory path
- •Scope of review: full review or metadata-only quick pass
- •Optional constraints from the user (for example, strict 500-line cap or strict naming conventions)
Review Workflow
Copy this checklist and mark progress:
Review progress: - [ ] Step 1: Inventory skill files - [ ] Step 2: Run automated checks - [ ] Step 3: Perform manual quality review - [ ] Step 4: Produce findings-first report - [ ] Step 5: Validate fixes and re-run checks (required)
Step 1: Inventory skill files
Inspect the target folder and confirm presence of:
- •
SKILL.md(required) - •
agents/openai.yaml(recommended for UI metadata) - •
scripts/,references/,assets/(optional, if relevant)
If SKILL.md is missing, stop and report a blocking issue.
Step 2: Run automated checks
Run these commands from the reviewing-skills directory.
Run (recommended):
uv run scripts/check_skill_practices.py <target-skill-path>
Fallback:
python3 scripts/check_skill_practices.py <target-skill-path>
The script checks objective rules such as frontmatter constraints, line budget, Windows-style paths, nested reference depth, and missing table of contents in long reference files.
Step 3: Perform manual quality review
Use review-checklist for qualitative review:
- •Conciseness and signal-to-noise quality
- •Appropriate degrees of freedom
- •Workflow clarity and validation loops
- •Clarity of triggers in
description - •Examples, terminology consistency, and anti-patterns
- •Coverage against representative scenarios in evals
Mark each issue with severity:
- •
high: likely to break triggering, safety, or reliability - •
medium: quality regression or confusion risk - •
low: polish or maintainability improvement
Step 4: Produce findings-first report
Return findings first, ordered by severity, with file references.
For Japanese review output with checklist tracking, use:
Use this format:
Findings 1. [high] <short title> - Evidence: <path:line> - Why it matters: <impact> - Fix: <concrete edit> 2. [medium] <short title> - Evidence: <path:line> - Why it matters: <impact> - Fix: <concrete edit> Open Questions 1. <question if information is missing> Summary - Passed checks: <count> - Failed checks: <count> - Manual risks: <count>
If no issues are found, state that explicitly and include residual risks (for example, "multi-model testing evidence not provided").
Step 5: Validate fixes and re-run checks (required)
If you propose any concrete fix:
- •Apply the fix (or provide exact patch-ready edits).
- •Re-run automated checks on the updated target.
- •Re-review affected checklist items.
- •Include a diff summary in the report with changed files and key line-level edits.
When no fix is applied, explicitly state that no re-run diff is available.