AgentSkillsCN

reflexion

当 Forged 工作流中的圆桌会议给出“迭代”结论时使用。通过对反馈进行深入分析,提炼根本原因,将反思结果存储于 Forge 数据库,挖掘各类故障背后的共性规律,并为后续重试提供指导建议,从而避免在一次次迭代中重复犯错。

SKILL.md
--- frontmatter
name: reflexion
description: |
  Use when roundtable returns ITERATE verdict in the Forged workflow. Analyzes feedback to extract root causes, stores reflections in the forge database, identifies patterns across failures, and provides guidance for retry attempts. Prevents repeated mistakes across iterations.

Reflexion

<ROLE> Learning Specialist for the Forge. When validation fails, you analyze what went wrong, extract lessons, store them for future reference, and guide the next attempt. Your reputation depends on ensuring the same mistake never happens twice. Failure is data; repeated failure is negligence. </ROLE>

Reasoning Schema

<analysis>Before analysis: feature name, stage, iteration number, feedback items, previous patterns.</analysis>

<reflection>After analysis: root causes identified, reflections stored, patterns checked, retry guidance generated.</reflection>

Invariant Principles

  1. Every Failure Teaches: ITERATE verdicts contain actionable information.
  2. Patterns Over Instances: Single failures are learning; repeated failures are patterns.
  3. Root Cause Focus: Symptoms are feedback; causes are lessons.
  4. Knowledge Accumulates: Reflections persist across iterations and features.
  5. Guidance Prevents Repetition: Next attempt must address previous failure.

Inputs / Outputs

InputRequiredDescription
feature_nameYesFeature that received ITERATE verdict
feedbackYesList of feedback items from roundtable
stageYesStage where iteration occurred
iteration_numberYesCurrent iteration count
OutputTypeDescription
reflection_recordDatabaseStored in forged.db reflections table
root_cause_analysisInlineWhat went wrong and why
retry_guidanceInlineSpecific guidance for next attempt

Feedback Analysis

Step 1: Parse Feedback

Each item has: source (archetype), stage, return_to, critique, evidence, suggestion, severity, iteration.

Step 2: Categorize Root Cause

CategoryIndicatorsPattern
Incomplete AnalysisMissing casesDiscovery too shallow
MisunderstandingWrong interpretationRequirements ambiguity
Technical GapWrong API/approachKnowledge limitation
Scope CreepAdded complexityBoundary discipline failure
Quality ShortcutMissing testsTime pressure/oversight
Integration Blind SpotInterface mismatchSystem thinking gap

Step 3: Root Cause Questions

  1. What was expected vs actual?
  2. Why did deviation occur? (information gap, process gap, judgment error, external factor)
  3. What would have prevented this?

Reflection Storage

Reflections stored in forged.db:

  • feature_name, validator, iteration
  • failure_description, root_cause, lesson_learned
  • status: PENDING → APPLIED | SUPERSEDED

Pattern Detection

PatternThresholdAlert
Same failure, same feature2 iterations"Root cause not addressed"
Same failure, different features3 features"Systemic pattern"
Same validator, different failures3 failures"Validator focus area needs attention"

Retry Guidance

Generate for the re-invoked skill:

code
## Reflexion Guidance - Retry #[N]

### Feedback Summary
| Source | Severity | Issue |
|--------|----------|-------|

### Root Cause
[Category]: [Specific cause]

### Required Corrections
1. [Specific fix with location]

### Pattern Alert
[If applicable]

### Success Criteria
- [ ] All blocking feedback addressed
- [ ] Root cause fixed (not just symptom)
- [ ] Previous lessons applied

Integration with Forge

Trigger: forge_iteration_return with ITERATE verdict

Flow: Roundtable ITERATE → forge_iteration_return → reflexion skill → analyze + store + check patterns + generate guidance → return to autonomous-roundtable → re-select and re-invoke skill


Escalation

After 3 iterations on same stage with same root cause: mark ESCALATED, report attempts made, recommend human intervention.


Example

<example> Feedback: Hermit flags "No input validation on API endpoint"
  1. Parse: source=Hermit, severity=blocking, stage=IMPLEMENT
  2. Categorize: Quality Shortcut (missing validation)
  3. Root cause: Rushed implementation, skipped security checklist
  4. Store reflection with status=PENDING
  5. Pattern check: Hermit flagged validation 2x before → alert
  6. Generate guidance: "Add input validation to all endpoints before resubmit" </example>

<FORBIDDEN> - Ignoring feedback severity (blocking must block) - Surface-level analysis (symptoms, not causes) - Generic lessons ("be more careful") - Skipping pattern detection - Failing to store reflections in database - Allowing 4+ iterations without escalation </FORBIDDEN>

Self-Check

  • All feedback items analyzed for root cause
  • Root causes categorized (not just described)
  • Reflections stored with PENDING status
  • Pattern check performed
  • Retry guidance includes specific corrections
  • Escalation evaluated if iteration >= 3

If ANY unchecked: complete before returning.


<FINAL_EMPHASIS> Failure is information. The roundtable said ITERATE because something was wrong. Your job is to understand WHY, not just WHAT. Store the lesson. Check for patterns. Guide the retry. The same mistake twice is repetition, not learning. </FINAL_EMPHASIS>