Review Skill

Target Skill

The target skill to review is: $ARGUMENTS

If $ARGUMENTS is empty, ask the user which skill to review.

Mode Selection

| Mode | Trigger | Action |
|------|---------|--------|
| Review + Auto-Fix (default) | User says "review", "check", "grade", or gives no mode | Run full deep review, then auto-fix all findings |
| Review Only | User says "report only", "no fix", "read-only" | Run full deep review, report only, no changes |
| Auto-Fix Only | User says "fix", "improve", "refactor", "auto-fix" | Skip report, apply fixes directly |
| External Review | User says "external", target is a GitHub URL | Clone to /tmp/, full deep review, report only (read-only) |
| Auto-PR | User says "PR", "contribute", "auto-pr" | Fork, full deep review, fix, submit PR |

When no mode keyword is present, default to Review + Auto-Fix. The deep review runs in every mode. Auto-fix follows the deep review unless the mode is read-only (Review Only or External Review).
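
The routing in the table above can be sketched as an ordered keyword lookup. This is a minimal sketch, not part of the skill itself: `select_mode`, `has_keyword`, and the precedence order are assumptions made for illustration (for example, "no fix" is tested before "fix", and a request that says both "review" and "fix" resolves to Review + Auto-Fix because that mode does both).

```python
import re

# (mode, keywords), checked in order; earlier entries win. The precedence is
# an assumption made for this sketch, e.g. "no fix" must beat "fix".
MODE_KEYWORDS = [
    ("Review Only", ["report only", "no fix", "read-only"]),
    ("Auto-PR", ["auto-pr", "pr", "contribute"]),
    ("External Review", ["external"]),
    ("Review + Auto-Fix", ["review", "check", "grade"]),
    ("Auto-Fix Only", ["auto-fix", "fix", "improve", "refactor"]),
]
DEFAULT_MODE = "Review + Auto-Fix"


def has_keyword(text: str, keyword: str) -> bool:
    """Match a keyword as a whole word, so "pr" does not match "improve"."""
    pattern = r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])"
    return re.search(pattern, text) is not None


def select_mode(request: str) -> str:
    """Return the mode name for a user request (hypothetical helper)."""
    lowered = request.lower()
    is_github_target = "github.com/" in lowered
    # Drop URLs so words inside a repository path are not read as keywords.
    text = re.sub(r"https?://\S+", " ", lowered)
    for mode, keywords in MODE_KEYWORDS:
        if any(has_keyword(text, kw) for kw in keywords):
            return mode
        if mode == "External Review" and is_github_target:
            return mode
    return DEFAULT_MODE
```

For example, `select_mode("check pdf, report only")` resolves to Review Only, and a bare skill name with no keyword falls through to the default.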

Setup (Optional)

Install create-skill for automated validation: see references/setup.md

All modes work without it by falling back to manual evaluation.

<instructions>

Mode 1: Review + Auto-Fix (Default)

Run a full deep review across every evaluation dimension, then automatically fix all findings.

Step 1: Run automated validation (if create-skill is installed):

```bash
python3 "$CREATE_SKILL"/scripts/quick_validate.py <target-skill>
python3 "$CREATE_SKILL"/scripts/security_scan.py <target-skill> --verbose
```
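
The "if installed" condition in Step 1 can be sketched as a small helper that either builds the two commands or signals a fallback to manual evaluation. This is a sketch under assumptions: `validation_commands` is a hypothetical name, and treating a set `CREATE_SKILL` environment variable as proof of installation is an assumption. Only the script names and the `--verbose` flag come from the commands above.

```python
import os
from pathlib import Path


def validation_commands(target_skill, create_skill_dir=None):
    """Build the Step 1 commands, or return [] to signal manual evaluation."""
    # Assumption: create-skill's location comes from the CREATE_SKILL variable.
    root = create_skill_dir or os.environ.get("CREATE_SKILL")
    if not root:
        return []
    scripts = Path(root) / "scripts"
    return [
        ["python3", str(scripts / "quick_validate.py"), target_skill],
        ["python3", str(scripts / "security_scan.py"), target_skill, "--verbose"],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=False)`. An empty list means Steps 2-4 proceed without automated validation.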

Step 2: Structural evaluation -- Read references/evaluation_checklist.md and check every item against the target skill. Record pass/fail for each item with the file path and line number of the finding.

Step 3: Content quality evaluation -- Read references/content-quality-checklist.md and evaluate all 8 dimensions (degrees of freedom, conciseness, actionability, options overload, script quality, feedback loops, consistency, time-sensitive content). Record findings per dimension.

Step 4: Deep review -- Read references/research-backed-criteria.md and check all 6 criteria. Record a pass/fail verdict for each:

•XML tag usage
•Example quality (3-5 diverse examples)
•Defect taxonomy (specification, input, structure, context, performance, maintainability)
•Anti-patterns (OWASP, vendor docs, academic)
•Formatting effectiveness
•HELM-inspired metrics (clarity, actionability, robustness, maintainability, safety)

Step 5: Generate report as markdown with:

•Executive summary table (aspect, grade, notes)
•Section-by-section findings with file paths and line numbers
•Deep review results table (criterion, verdict, evidence)
•Combined grade using the unified rubric from references/evaluation_checklist.md
•Recommended fixes ranked by severity (major first, then minor)
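
The severity ranking in the last bullet can be sketched as a stable sort. The `Finding` record and `rank_fixes` helper are hypothetical names introduced for illustration; the skill does not prescribe a data structure.

```python
from dataclasses import dataclass

# Lower number sorts first: major before minor.
SEVERITY_ORDER = {"major": 0, "minor": 1}


@dataclass
class Finding:
    title: str
    severity: str  # "major" or "minor"
    file: str
    line: int
    fix: str


def rank_fixes(findings):
    """Major findings first, then minor; ties keep their original order."""
    # sorted() is stable, so findings of equal severity stay in report order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
```

Keeping ties in report order means the recommended-fix list reads in the same sequence as the section-by-section findings within each severity level.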

Step 6: Verify report before presenting:

• Every finding has a file path and line number
• Grade matches rubric criteria
• Fixes are actionable (no "consider" or "ensure")
• Deep review covers all 6 criteria from references/research-backed-criteria.md
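
Three of the four checks above are mechanical and can be sketched in code. This is a minimal sketch: `verify_report` and the dictionary shapes are assumptions, and the grade-versus-rubric check is left out because the rubric lives in references/evaluation_checklist.md and is not reproduced here.

```python
import re

REQUIRED_CRITERIA = {
    "XML tag usage",
    "Example quality",
    "Defect taxonomy",
    "Anti-patterns",
    "Formatting effectiveness",
    "HELM-inspired metrics",
}
VAGUE_WORDS = re.compile(r"\b(consider|ensure)\b", re.IGNORECASE)


def verify_report(findings, deep_review):
    """Return a list of problems; an empty list means the report is ready.

    findings:    list of dicts with "file", "line", and "fix" keys (assumed shape)
    deep_review: dict mapping criterion name to "Pass" or "Fail"
    """
    problems = []
    for i, finding in enumerate(findings, start=1):
        if not finding.get("file") or finding.get("line") is None:
            problems.append(f"finding {i}: missing file path or line number")
        if VAGUE_WORDS.search(finding.get("fix", "")):
            problems.append(f"finding {i}: fix uses a vague verb")
    for criterion in sorted(REQUIRED_CRITERIA - set(deep_review)):
        problems.append(f"deep review: no verdict for {criterion}")
    return problems
```

Present the report only when the returned list is empty; otherwise repair the listed items first.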

Step 7: Present report, then proceed to auto-fix. After showing the full review report, automatically apply all recommended fixes using the Auto-Fix procedure (Mode 2). Do not wait for user confirmation. The review informs the fix -- every finding from Steps 2-4 becomes a fix target.

<example> **Review + Auto-Fix Report Format:**

Skill Review: pdf

Executive Summary

| Aspect | Grade | Notes |
|--------|-------|-------|
| Frontmatter | A | Third-person description with triggers |
| Structure | B | 487 lines -- close to 500-line limit |
| Content Quality | B | One decision point missing a default |
| Deep Review | B | Only 2 examples (3-5 required); no defects in other criteria |
| Scripts | A | Proper error handling throughout |
| Combined | B | Two minor issues (one structural, one content) |

Deep Review Results

| Criterion | Verdict | Evidence |
|-----------|---------|----------|
| XML tag usage | Pass | `<instructions>` and `<example>` tags present |
| Example quality | Fail | Only 2 examples, need 3-5 diverse cases |
| Defect taxonomy | Pass | No specification, input, structure, context, performance, or maintainability defects |
| Anti-patterns | Pass | No OWASP, vendor, or academic anti-patterns |
| Formatting | Pass | Consistent Markdown + XML structure |
| HELM metrics | Pass | Clarity 5/5, Actionability pass, Robustness pass, Maintainability pass, Safety pass |

Findings

1. Line count approaching limit (Minor)

File: SKILL.md, lines 320-410 (487 lines total)

Fix: Move the "Advanced Extraction" section (lines 320-410) to references/advanced-extraction.md.

2. Missing default for output format (Minor)

File: SKILL.md, line 145

Finding: Lists JSON, CSV, and Markdown output without recommending a default.

Fix: Add "Default to Markdown. Use JSON when the user needs machine-readable output."

Recommended Fixes (by severity)

•Extract advanced section to references (structural)
•Add default output format recommendation (content)

Auto-Fix Applied

Proceeding to fix all findings above...

Changes summary: 2 issues fixed, 1 file reorganised, line count reduced from 487 to 395. </example>

</instructions> <instructions>

Mode 2: Auto-Fix

Automatically refactor a skill to meet best practices. When triggered by Mode 1 (Review + Auto-Fix), use the review findings as the fix list. When triggered standalone, run Steps 1-2 below to identify issues first.

```
Auto-Fix Progress:
- [ ] Step 1: Read SKILL.md and all loose files
- [ ] Step 2: Run evaluation, identify issues
- [ ] Step 3: Fix frontmatter (description, context: fork)
- [ ] Step 4: Create references/ folder if needed
- [ ] Step 5: Move content over 500 lines to references/
- [ ] Step 6: Move loose files to references/ with clear names
- [ ] Step 7: Update SKILL.md references section
- [ ] Step 8: Verify final line count under 500
- [ ] Step 9: Generate summary of changes (files modified, issues fixed, before/after line counts)
```

Auto-Fix Actions:

| Issue | Automatic Fix |
|-------|---------------|
| Description not third-person | Rewrite: "Processes...", "Extracts..." |
| Missing trigger conditions | Add "Use when..." clause |
| Missing `context: fork` (task-based skill) | Check for task-based signals (`<instructions>` tags, script references, `agent` field, 3+ numbered steps). Add to frontmatter only when signals are present. |
| SKILL.md over 500 lines | Extract sections to `references/` |
| Loose files in root | Move to `references/` with descriptive names |
| Duplicate reference files | Merge and deduplicate |
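
The `context: fork` check in the table above can be sketched as a signal detector. This is a sketch under assumptions: the function names are hypothetical, the regular expressions are one plausible reading of each signal, and treating any single signal as sufficient is an assumption, since the table says only "when signals are present".

```python
import re


def task_based_signals(skill_text: str) -> list[str]:
    """List the task-based signals found in a SKILL.md text (sketch)."""
    signals = []
    if "<instructions>" in skill_text:
        signals.append("instructions tags")
    # Assumption: a script reference looks like scripts/<name>.py or .sh
    if re.search(r"\bscripts/[\w.-]+\.(py|sh)\b", skill_text):
        signals.append("script references")
    if re.search(r"^agent:[ \t]*\S", skill_text, re.MULTILINE):
        signals.append("agent field")
    # Counts lines such as "1. Read the file" or "Step 2: Parse".
    numbered = re.findall(r"^[ \t]*(?:Step[ \t]+)?\d+[.:)][ \t]", skill_text, re.MULTILINE)
    if len(numbered) >= 3:
        signals.append("3+ numbered steps")
    return signals


def needs_context_fork(skill_text: str) -> bool:
    """True when signals are present and the frontmatter lacks context: fork."""
    has_fork = re.search(r"^context:[ \t]*fork[ \t]*$", skill_text, re.MULTILINE)
    return has_fork is None and bool(task_based_signals(skill_text))
```

A skill that is pure reference material with no signals returns `False`, so the frontmatter is left alone.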

Content Quality Fixes:

| Issue | Automatic Fix |
|-------|---------------|
| Vague instructions ("consider", "ensure") | Rewrite with strong verbs ("check", "verify", "run") |
| Too many options without default | Add recommended default + escape hatch pattern |
| Missing feedback loop | Add validation checkpoint before destructive actions |
| Verbose explanations of what Claude already knows | Flag for condensing (manual review) |
| Time-sensitive content | Flag for removal or add deprecation notice |
| Scripts with bare `except:` | Add specific error handling with recovery actions |
| No examples provided | Add 3-5 diverse `<example>` blocks |
| Plain text structure (no delimiters) | Add XML tags (`<instructions>`, `<context>`) |
| Over-specification ("MUST", "CRITICAL") | Rewrite in natural language; Claude follows clear instructions |
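
Three of the issues in this table can be located by a line scan before any rewriting happens. This is a minimal sketch: `scan_lines` and the `PATTERNS` table are hypothetical, and the word lists contain only the examples named in the table, not an exhaustive set.

```python
import re

PATTERNS = {
    "vague instruction": re.compile(r"\b(consider|ensure)\b", re.IGNORECASE),
    # Case-sensitive on purpose: only the shouted forms count here.
    "over-specification": re.compile(r"\b(MUST|CRITICAL)\b"),
    "bare except": re.compile(r"^\s*except\s*:"),
}


def scan_lines(text: str):
    """Return (line_number, issue, line) tuples for each match, 1-indexed."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for issue, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, issue, line.strip()))
    return hits
```

The line numbers feed straight into the review findings, which require a file path and line number for every item.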

<example> **Before/After: Auto-Fix on a bloated skill**

Before (SKILL.md, 580 lines):

```yaml
---
name: data-export
description: "Export data from databases"
license: MIT
---
```

•No trigger conditions in description
•No context: fork despite script usage
•580 lines with inline SQL reference (lines 310-520)
•Vague step: "Ensure the export format is correct"
•3 loose files in root: formats.md, sql-ref.md, tips.md

After (SKILL.md, 340 lines):

```yaml
---
name: data-export
description: "Exports data from SQL and NoSQL databases to CSV, JSON, or Parquet. Use when extracting datasets, scheduling recurring exports, or migrating between storage systems."
license: MIT
context: fork
agent: general-purpose
---
```

•Description rewritten: third-person verb + three trigger conditions
•context: fork added (scripts and <instructions> tags present)
•SQL reference extracted to references/sql-syntax.md (210 lines saved)
•Vague step rewritten: "Run python3 scripts/validate_schema.py against the output file"
•Loose files moved: formats.md → references/export-formats.md, sql-ref.md merged into references/sql-syntax.md, tips.md → references/troubleshooting.md

Changes summary: 6 issues fixed, 3 files reorganised, line count reduced from 580 to 340. </example>

<example> **File Naming When Moving**

- `learn.md` → `references/learning-guide.md`
- `reference.md` → `references/[descriptive-name].md`
- `ui-reference.md` + `official-ui-reference.md` → `references/cli-reference.md` (merge)

</example>

</instructions> <instructions>

Mode 3: External Review

Read references/mode-external-review.md for the full procedure. Clone the target to /tmp/review-target, run the same three evaluation checks as Mode 1 Steps 2-4, generate a read-only improvement report, then clean up.
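
The clone, review, clean-up sequence can be sketched with a `try`/`finally` so the clone is removed even when the review fails. This is only a sketch: references/mode-external-review.md remains the authoritative procedure, `external_review` and `clone_repo` are hypothetical names, and the shallow clone (`--depth 1`) is an assumption, since the text says only "clone".

```python
import shutil
import subprocess
from pathlib import Path

REVIEW_DIR = Path("/tmp/review-target")


def clone_repo(url: str, dest: Path) -> None:
    # A shallow clone is an assumption; the skill only says "clone".
    subprocess.run(["git", "clone", "--depth", "1", url, str(dest)], check=True)


def external_review(url, review, clone=clone_repo, dest=REVIEW_DIR):
    """Clone the target, run the read-only review callable, always clean up."""
    shutil.rmtree(dest, ignore_errors=True)  # clear leftovers from earlier runs
    try:
        clone(url, dest)
        return review(dest)
    finally:
        shutil.rmtree(dest, ignore_errors=True)
```

The `review` argument stands in for Mode 1 Steps 2-4. It receives the clone path and must not write to it, which keeps this mode read-only.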

</instructions> <instructions>

Mode 4: Auto-PR

Read references/mode-auto-pr.md for the full procedure. Fork the repository, run a full deep review, apply auto-fix, pass the self-review respect check, then submit a PR using references/pr_template.md.

</instructions>

References

| File | Purpose | Used By |
|------|---------|---------|
| `references/evaluation_checklist.md` | Structural validation + unified grading rubric | Review, Auto-Fix |
| `references/content-quality-checklist.md` | Content effectiveness (8 dimensions) | Review, Auto-Fix |
| `references/research-backed-criteria.md` | Deep review with academic citations | All modes (always runs) |
| `references/script-quality.md` | Script error handling, constants | Review, Auto-Fix |
| `references/feedback-loops.md` | Multi-step workflow validation | Review, Auto-Fix |
| `references/mode-external-review.md` | Full External Review procedure | External Review |
| `references/mode-auto-pr.md` | Full Auto-PR procedure with respect checks | Auto-PR |
| `references/pr_template.md` | PR description template | Auto-PR |
| `references/marketplace_template.json` | marketplace.json template | Auto-PR |
| `references/sources.md` | Bibliography | Review (deep) |
| `references/setup.md` | create-skill installation | Setup |

Official Best Practices