Skill Create Flow
Build a new skill using a repeatable, high-signal workflow that separates methodology discovery from packaging, without requiring any other skills to be installed or invoked.
Best for
This workflow excels at creating procedural agent skills where the value comes from following a structured methodology rather than raw knowledge retrieval.
Ideal skill types:
- •Methodology-based: Debugging, code reviews, writing specs, systematic problem-solving
- •Decision-heavy: Architecture decisions, UX design, business strategy, negotiation
- •Multi-step workflows: Defined sequence of operations with clear deliverables
- •Quality-focused: Where "excellent vs mediocre" output is distinguishable
Less ideal for:
- •Pure reference/lookup skills (use knowledge retrieval instead)
- •Simple one-shot commands (use direct tool invocation instead)
- •Skills requiring external dependencies (this flow produces standalone skills)
Scope & Outputs
You are producing a new skill folder (with its own SKILL.md and optional bundled resources). This flow helps you:
- •Narrow the intent until “excellent vs mediocre” is judgeable.
- •Extract expert frameworks (when needed).
- •Generate a production-ready, testable skill spec and artifacts.
Target artifacts (recommended)
- •SKILL.md
- •examples/ (1 tiny example)
- •tests/ (3–5 eval prompts or scenarios)
- •index_entry.json (if you maintain a skill index)
- •CHANGELOG.md (optional but helpful for iteration)
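The layout above can be scaffolded with a few lines of Python. This is a minimal sketch, not part of the flow itself: the function name `scaffold_skill` and its flags are hypothetical, and the example and eval file names follow the naming used in Step 5.

```python
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, name: str, with_index: bool = False,
                   with_changelog: bool = False) -> list[Path]:
    """Create empty placeholder files for the recommended skill layout."""
    skill_dir = root / name
    files = [
        skill_dir / "SKILL.md",
        skill_dir / "examples" / f"{name}-example.txt",
        skill_dir / "tests" / f"evals_{name}.yaml",
    ]
    # Optional artifacts are only created on request.
    if with_index:
        files.append(skill_dir / "index_entry.json")
    if with_changelog:
        files.append(skill_dir / "CHANGELOG.md")
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
    return files


if __name__ == "__main__":
    # Try it in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for created in scaffold_skill(Path(tmp), "demo-skill", with_index=True):
            print(created.relative_to(tmp))
```

The files are created empty on purpose: the later steps fill them in.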
Decision Rules (Pick the Track)
Track A — Non-technical / judgment-heavy (default if unsure)
Examples: writing, sales, hiring, product decisions, strategy, negotiation.
Use the full flow below (steps 1–6).
Track B — Technical / objective correctness dominates
Examples: code generation patterns, CLI automation, infrastructure scripts, data pipelines.
Usually shorten Step 2 (“expert frameworks”) and spend more time on Steps 5–6 (evals and validation).
Workflow (Standalone)
Step 1 — Lock the intent (narrowing)
Converge from broad → specific using this 5-layer funnel:
- •Domain: what domain is the skill for?
- •Context (5W1H): who/what/where/when/why/how constraints.
- •Comparative choice: pick the closest of 2–3 similar scenarios.
- •Boundaries (via negativa): confirm what is explicitly excluded.
- •Concrete anchor: one real recent case with inputs + desired output.
Deliverable from Step 1
- •A 5–10 line “skill brief” covering:
  - •target user + context
  - •inputs the user will provide
  - •outputs the skill must produce
  - •what’s explicitly out of scope
  - •quality bar (“excellent output looks like…”)
Step 2 — Extract expert frameworks (masters-level)
Produce:
- •1–3 core frameworks/checklists experts use in this exact scenario.
- •Common failure modes + counterexamples.
- •Minimal decision rules (when to do A vs B; when to stop; when to ask for more info).
Stop condition
- •You can explain the method in a way that is: specific, testable, and not generic advice.
Step 3 — Draft the trigger contract (frontmatter)
Write a frontmatter description that triggers reliably:
- •Include 3–6 concrete “when to use” phrases users will actually type.
- •Include exclusions if you keep getting false triggers.
- •Avoid “does everything”.
Template
---
name: <skill-name>
description: <What it does>. Use when <trigger 1>, <trigger 2>, or <trigger 3>. Avoid when <non-goal>.
---
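A quick lint can catch a description that has drifted away from the template. The sketch below is a rough heuristic under stated assumptions: it handles only flat `key: value` frontmatter like the template, and it counts triggers by splitting the “Use when …” clause on commas and “or”, so a trigger phrase that itself contains “or” is over-counted. The function names are hypothetical.

```python
import re


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat 'key: value' lines between two '---' delimiter lines."""
    lines = text.strip().splitlines()
    if len(lines) < 2 or lines[0].strip() != "---" or lines[-1].strip() != "---":
        raise ValueError("frontmatter must be delimited by '---' lines")
    fields: dict[str, str] = {}
    for line in lines[1:-1]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def count_triggers(description: str) -> int:
    """Count the phrases listed in the 'Use when ...' clause (heuristic)."""
    match = re.search(r"Use when (.*?)(?:\. Avoid when|\.?$)", description)
    if not match:
        return 0
    parts = re.split(r",\s*(?:or\s+)?|\s+or\s+", match.group(1))
    return len([p for p in parts if p.strip()])


if __name__ == "__main__":
    sample = """---
name: release-notes
description: Drafts release notes. Use when writing a changelog, summarizing merged PRs, or preparing a release announcement. Avoid when editing marketing copy.
---"""
    fields = parse_frontmatter(sample)
    n = count_triggers(fields["description"])
    print(fields["name"], n, "ok" if 3 <= n <= 6 else "adjust trigger count")
```

A count outside the 3–6 range is a prompt to revisit the description, not proof that it will trigger badly.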
Step 4 — Write SKILL.md (lean + procedural)
Keep SKILL.md short and procedural:
- •Decision rules first.
- •Output contract (what files/sections are produced) is explicit.
- •Put long templates into references/ if they exceed ~100 lines.
Minimum recommended sections:
- •Scope & boundaries
- •Inputs (what the user must provide)
- •Procedure (step-by-step)
- •Decision rules & failure handling
- •Output contract
- •Test prompts (what you will validate with)
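The section list above is easy to check mechanically. This sketch assumes SKILL.md uses Markdown `#` headings and that each heading starts with the section name as written above; `missing_sections` is a hypothetical helper.

```python
REQUIRED_SECTIONS = [
    "Scope & boundaries",
    "Inputs",
    "Procedure",
    "Decision rules & failure handling",
    "Output contract",
    "Test prompts",
]


def missing_sections(skill_md: str) -> list[str]:
    """Return the recommended sections with no matching heading."""
    # Collect heading text, lowercased, with the leading '#' markers removed.
    headings = [
        line.lstrip("#").strip().lower()
        for line in skill_md.splitlines()
        if line.startswith("#")
    ]
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(h.startswith(section.lower()) for h in headings)
    ]


if __name__ == "__main__":
    draft = (
        "# Demo skill\n"
        "## Scope & boundaries\n"
        "## Inputs (what the user must provide)\n"
        "## Procedure\n"
        "## Output contract\n"
    )
    print(missing_sections(draft))
```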
Step 5 — Create supporting artifacts (examples/evals/index/changelog)
A) Tiny example (examples/<name>-example.txt)
- •≤30 lines; show the most common invocation and expected output shape.
B) Evals / test prompts (tests/evals_<name>.yaml)
Include 3–5 scenarios:
- •happy path
- •missing input
- •edge case
- •out-of-scope
- •safety/constraint case (if relevant)
Minimal template:
cases:
  - name: happy_path
    prompt: |
      <a realistic user request that should trigger the skill>
    expect:
      - <must-have output property 1>
      - <must-have output property 2>
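Before running the evals, it can help to confirm the file itself is well formed. The sketch below works on already-parsed data (a dict shaped like the template above); loading the YAML is left out because it needs a third-party parser such as PyYAML. `validate_cases` is a hypothetical helper, and the 3–5 bound comes from the scenario count recommended above.

```python
def validate_cases(doc: dict) -> list[str]:
    """Return a list of problems found in a parsed evals document."""
    problems = []
    cases = doc.get("cases", [])
    if not 3 <= len(cases) <= 5:
        problems.append(f"expected 3-5 cases, found {len(cases)}")
    for i, case in enumerate(cases):
        label = case.get("name", f"case #{i}")
        if not case.get("prompt", "").strip():
            problems.append(f"{label}: empty prompt")
        if not case.get("expect"):
            problems.append(f"{label}: no expected properties")
    # Duplicate names make failures hard to attribute to a scenario.
    names = [case.get("name") for case in cases]
    if len(set(names)) != len(names):
        problems.append("duplicate case names")
    return problems


if __name__ == "__main__":
    doc = {
        "cases": [
            {"name": "happy_path", "prompt": "Draft notes for v2.1",
             "expect": ["has a summary section"]},
            {"name": "missing_input", "prompt": "Draft notes",
             "expect": ["asks which release"]},
            {"name": "out_of_scope", "prompt": "Write a poem",
             "expect": ["declines or redirects"]},
        ]
    }
    print(validate_cases(doc))
```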
C) Index entry (index_entry.json) (only if you keep an index)
{
  "slug": "<skill-name>",
  "name": "<Human title>",
  "summary": "<=160 chars>",
  "entry": "skills/<skill-name>/SKILL.md"
}
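If you do keep an index, a small check keeps entries consistent. This sketch assumes the four fields shown above are all required and that `entry` always follows the `skills/<slug>/SKILL.md` pattern from the template; your index may use different conventions. `validate_index_entry` is a hypothetical helper.

```python
import json


def validate_index_entry(raw: str) -> list[str]:
    """Return a list of problems found in an index_entry.json document."""
    entry = json.loads(raw)
    problems = []
    for key in ("slug", "name", "summary", "entry"):
        if not entry.get(key):
            problems.append(f"missing field: {key}")
    if len(entry.get("summary", "")) > 160:
        problems.append("summary longer than 160 characters")
    slug = entry.get("slug", "")
    # Assumed convention: the entry path is derived from the slug.
    if slug and entry.get("entry") != f"skills/{slug}/SKILL.md":
        problems.append("entry path does not match slug")
    return problems


if __name__ == "__main__":
    raw = json.dumps({
        "slug": "release-notes",
        "name": "Release Notes",
        "summary": "Drafts release notes from merged changes.",
        "entry": "skills/release-notes/SKILL.md",
    })
    print(validate_index_entry(raw))
```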
D) CHANGELOG.md (optional)
Use it if you expect iteration; otherwise skip.
Step 6 — Validate via simulations
Dry-run the new skill against your test prompts:
- •Does it trigger when it should?
- •Does it avoid false triggers?
- •Does it produce the promised output contract?
If it fails, iterate:
- •tighten frontmatter description
- •move long content into references
- •add/adjust decision rules
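The first two questions can be tallied with a small harness. In the sketch below, `triggers` stands for whatever produced the trigger decision in your dry run; the keyword matcher is only a stand-in for illustration, not how a real agent decides. All names here are hypothetical.

```python
from typing import Callable


def trigger_report(cases: list[tuple[str, bool]],
                   triggers: Callable[[str], bool]) -> dict[str, list[str]]:
    """Sort prompts into missed triggers and false triggers."""
    report: dict[str, list[str]] = {"missed": [], "false_trigger": []}
    for prompt, should_fire in cases:
        fired = triggers(prompt)
        if should_fire and not fired:
            report["missed"].append(prompt)
        elif fired and not should_fire:
            report["false_trigger"].append(prompt)
    return report


def keyword_trigger(prompt: str) -> bool:
    """Stand-in trigger: fires on a couple of hard-coded phrases."""
    return any(k in prompt.lower() for k in ("changelog", "release notes"))


if __name__ == "__main__":
    cases = [
        ("Write a changelog for v2", True),
        ("Summarize merged PRs", True),
        ("Write release notes for my wedding", False),
    ]
    print(trigger_report(cases, keyword_trigger))
```

A prompt under `missed` suggests adding a trigger phrase to the description; one under `false_trigger` suggests adding an exclusion.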
Prompt Templates
Template 1 — Step 1 (narrowing)
“I want a skill for: <broad goal>. Help me narrow it using this 5-layer funnel: domain → 5W1H → pick closest scenario → boundaries → one real recent case. Ask only the minimum questions (1–3 at a time).”
Template 2 — Step 2 (expert frameworks)
“For the exact scenario we narrowed to, give me 1–3 expert-level frameworks/checklists, decision rules, and common failure modes. Keep it specific and testable (no generic advice).”
Template 3 — Step 4/5 (write artifacts)
“Using the skill brief + frameworks, draft:
- •SKILL.md (lean + procedural),
- •1 tiny example,
- •3–5 eval prompts,
- •optional index entry + changelog.
Keep templates short; push long templates into references/.”
Notes (Keep It Lean)
- •Prefer decision rules over long explanation.
- •Prefer one tiny example over many medium ones.
- •If you feel tempted to add a long “how skills work” section, don’t—keep the flow operational.