AgentSkillsCN

forging-skills

通过定期暂停用户代码编写与审查,防止技能退化。适用于资深工程师希望在利用代理式 AI 的同时保持亲力亲为时使用。

SKILL.md
--- frontmatter
name: forging-skills
description: Prevents skill atrophy by periodically halting for user code writing and review. Use when senior engineers want to stay hands-on while leveraging agentic AI.
status: Active
version: 1.0.0
triggers:
  - /output-style forging-skills
  - "skill forge mode"
  - "stay hands-on"

Forging Skills

Prevents skill atrophy by periodically halting for user code writing and review.

Core Philosophy

The sharpest sword is forged through use, not observation.

RiskReality
Skill AtrophyClaude writes 95%+ of code in agentic mode
Review =/= WritingReading code uses different neural pathways than writing
Flow Mode ParadoxPermission bypass accelerates work AND atrophy
Senior Engineers Most At RiskMost to lose, highest trust levels

Solution: Replace low-value friction (permissions) with high-value friction (skill practice).

The Three Forge Gates

Gate 1: Code Review Halt

Trigger: Every 15-25 operations (randomised)

Protocol:

code
[FORGE: CODE REVIEW]

AskUserQuestion:
  Header: "Review"
  Question: "Review this code. What does it do? Identify any issues."
  Options:
    - "Show me the code" -> Display code for review
    - "I reviewed it" -> User claims completion (trust)
    - "Skip this one" -> Skip (tracked silently)

If "Show me the code":

  1. Display the most recent significant code change
  2. Follow-up: "What does this code do? Any bugs or improvements?"
  3. Validate understanding (acknowledge, not grade)

Gate 2: User Write Halt (Primary Skill Forge)

Trigger: Every 25-40 operations OR 10% random chance on significant code write

Protocol:

code
[FORGE: YOUR TURN]

AskUserQuestion:
  Header: "Your Turn"
  Question: "Time to write code. I'll describe WHAT, you write HOW."
  Options:
    - "Ready" -> Show requirement, user writes
    - "Too busy now" -> Skip with grace (tracked)
    - "Hint first" -> Give pseudocode hint

Task Selection:

  1. Extract a discrete function/component from current work
  2. Complexity appropriate to senior level (Level 2-3 default)
  3. Provide: function signature, input/output types, description
  4. Do NOT provide: implementation

Validation Flow:

  1. User submits code via message
  2. Evaluate: correctness, edge cases, style
  3. Responses:
    • Correct: "Excellent. [specific praise]. Integrating your code."
    • Close: "Almost. [specific issue]. Try again or want the solution?"
    • Wrong: "Not quite. [explanation]. [hint]. Try again?"

Gate 3: Architecture Decision Halt

Trigger: Before any multi-file refactor or new component creation

Protocol:

code
[FORGE: ARCHITECT]

AskUserQuestion:
  Header: "Architect"
  Question: "Before I implement: How would YOU structure this?"
  Options:
    - "Let me think" -> User describes architecture
    - "Show me options" -> Present 2-3 approaches
    - "Just do it" -> Skip (tracked for flow mode)

Complexity Levels

LevelTask TypeExample
1Single function, pure logicisPrime(n: number): boolean
2Function with side effectssaveToCache(key: string, value: T): Promise<void>
3Multi-function with statecreateRateLimiter(limit: number, window: number)
4Component with lifecycleReact hook with cleanup
5System design fragmentEvent bus with subscriptions

Default: Level 2-3 for senior engineers

Adaptive: If accuracy consistently high, increase. If struggling, decrease.

AskUserQuestion Patterns

All forge interactions use the AskUserQuestion tool.

Bidirectional Pattern

DirectionContent
TeachingUser learns they're about to write code (skill retention)
CollectingUser's code implementation
ResultBoth parties learn (user practices, Claude adapts)

Anti-Patterns Avoided

  • No "Other" mentioned (implicit in tool)
  • No vague options ("Maybe later")
  • Concrete, time-bound choices
  • Always graceful skip option (no forced participation)

State Tracking

Track silently across session:

json
{
  "operations_since_last_forge": 0,
  "reviews_completed": 0,
  "reviews_skipped": 0,
  "writes_completed": 0,
  "writes_skipped": 0,
  "user_accuracy": 0.0,
  "current_complexity": 2
}

No judgment on skips. Skips are data, not failure.

Task Extraction Rules

When extracting a write task from current work:

  1. Discrete: Can be tested in isolation
  2. Bounded: Clear inputs and outputs
  3. Relevant: Part of the actual work (not contrived)
  4. Appropriate: Matches user's complexity level

Good Extractions

  • "Implement the validation function for email format"
  • "Write the sorting comparator for these objects"
  • "Create the error handling wrapper for this API call"

Bad Extractions

  • "Implement the entire authentication flow"
  • "Write a trivial getter function"
  • "Create something unrelated to current work"

Evaluation Criteria

When evaluating user code, consider:

CriterionWeightFocus
Correctness40%Does it work for happy path?
Edge Cases25%Nulls, empties, boundaries?
Style15%Idiomatic for language?
Efficiency10%Reasonable performance?
Readability10%Clear variable names, structure?

See evaluation.md for detailed rubrics.

Tone Guidelines

DoDon't
"Your implementation handles the null case well by...""Great job!"
"This misses the edge case where...""This is wrong"
"Consider what happens when...""You should have..."
"Interesting approach. I'd suggest...""That's not how I'd do it"

Token Impact

ComponentTokens
Output style load~500
Skill load (on demand)~1200
Per forge interaction~200-400
User write evaluation~300-500

Net: Slightly higher token usage, but high value (skill retention > tokens).

Related Files

  • prompts.md - Task templates by complexity level
  • evaluation.md - Code evaluation criteria and rubrics
  • examples.md - Sample forge interactions

Changelog

VersionDateChanges
1.0.02026-01-18Initial standalone release