Learning from Experience

Iron Law

EVERY TASK IS A LEARNING OPPORTUNITY.

After completing any non-trivial task, evaluate what was learned and whether it should become a reusable skill.

Post-Task Reflection Process

Step 1: Assess the Task Outcome

After completing a task, ask:

•Was the task successful? → Provide "success" feedback to any skills used
•Did any skills fail? → Provide "failure" feedback
•Was a new approach discovered? → Candidate for process recording

Step 2: Provide Feedback on Used Skills

For EVERY skill that was loaded during the task:

code

skill_feedback("skill-name", "success")  # or "failure" or "partial"

This is critical for Q-value learning. Without feedback, a skill's Q-value stays frozen and the skill cannot improve.
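A minimal sketch of Step 2, assuming the agent keeps a list of the skills it loaded and a record of how each one performed. The names `report_feedback`, `outcomes`, and `feedback_fn` are hypothetical; `feedback_fn` stands in for the `skill_feedback` call shown above and is passed in so the sketch stays self-contained.

```python
from typing import Callable, Dict, List

# Outcome labels taken from the skill_feedback example above.
VALID_OUTCOMES = {"success", "failure", "partial"}


def report_feedback(
    loaded_skills: List[str],
    outcomes: Dict[str, str],
    feedback_fn: Callable[[str, str], None],
) -> List[str]:
    """Send feedback for every loaded skill.

    Returns the names of skills that had no valid outcome recorded,
    so the caller can resolve them instead of silently skipping them.
    """
    missing = []
    for name in loaded_skills:
        outcome = outcomes.get(name)
        if outcome not in VALID_OUTCOMES:
            # No (or invalid) outcome: surface it rather than guess.
            missing.append(name)
            continue
        feedback_fn(name, outcome)
    return missing
```

Returning the skills that were left without feedback makes the "I'll do it later" case visible at the end of the task rather than leaving it to memory.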

Step 3: Identify New Patterns

Check if the task revealed a reusable pattern:

code

Pattern Recognition Checklist:
├── Did I solve a problem in a new way?
│   └── Record as a Process
├── Did I apply the same approach I've used before?
│   └── Search for similar processes, consider skill creation
├── Did I learn something about a tool or system?
│   └── Record as a Fact/Attribute
├── Did I discover a general principle?
│   └── Record as a Principle
└── Did nothing new happen?
    └── That's fine, not every task produces learning
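The checklist above can be read as a mapping from observations to recording actions. The sketch below is one hypothetical way to encode it; the function name, flag names, and action labels are illustrative and not part of any existing API.

```python
from typing import List


def reflection_actions(
    *,
    solved_in_new_way: bool = False,
    reused_known_approach: bool = False,
    learned_about_tool: bool = False,
    found_general_principle: bool = False,
) -> List[str]:
    """Translate checklist answers into follow-up actions.

    The checklist items are not mutually exclusive, so several
    actions can be returned for a single task.
    """
    actions = []
    if solved_in_new_way:
        actions.append("record_process")
    if reused_known_approach:
        actions.append("search_similar_processes")
    if learned_about_tool:
        actions.append("record_fact")
    if found_general_principle:
        actions.append("record_principle")
    # An empty list means nothing new was learned, which is fine.
    return actions
```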

Step 4: Evaluate for Skill Induction

If you've identified a pattern:

•Search for similar processes:

code

processes = search_processes("trigger description")

•Count similar processes:
  - 0-1 similar: Too early. Record and wait.
  - 2 similar: Consider skill creation
  - 3+ similar: Strong candidate for skill creation
•Assess generalizability:
  - Can this pattern apply beyond the current project?
  - Would another agent benefit from knowing this?
  - Is it specific enough to be actionable?
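The counting rule in this step can be sketched as a small decision function. This assumes `search_processes` returns a sized collection of matches, which the source does not confirm; here it is injected as `search_fn`, and `induction_recommendation` and `evaluate_pattern` are hypothetical helper names.

```python
from typing import Callable, Sequence


def induction_recommendation(similar_count: int) -> str:
    """Map the number of similar recorded processes to a recommendation."""
    if similar_count < 0:
        raise ValueError("similar_count must be non-negative")
    if similar_count <= 1:
        return "record_and_wait"      # too early
    if similar_count == 2:
        return "consider_skill"       # worth considering
    return "strong_candidate"         # three or more


def evaluate_pattern(trigger: str, search_fn: Callable[[str], Sequence]) -> str:
    """Search for similar processes and turn the count into a recommendation.

    search_fn stands in for search_processes; its return type is assumed.
    """
    return induction_recommendation(len(search_fn(trigger)))
```

The count only says whether there is enough evidence. The generalizability questions still need a judgment call before a skill is actually created.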

Step 5: Create or Update Skills

Based on the evidence:

•New pattern with 2+ examples: Create a new skill
•Existing skill helped: Provide success feedback
•Existing skill was wrong: Provide failure feedback + edit if needed
•Existing skill was incomplete: Edit to add missing steps

Feedback Loop

The learning cycle is:

code

Task → Skills Used → Outcome → Feedback → Q-value Update → Better Skills
  │                                                              │
  └──────── New Patterns → Process Recording → Skill Induction ──┘

This creates a virtuous cycle where:

•Skills improve through feedback
•New skills emerge from accumulated processes
•Bad skills get suspended through low Q-values
•Good skills rise to the top through high Q-values
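The document does not say how Q-values are computed. One common choice, shown here purely as an illustration, is an incremental update that moves the value a fraction of the way toward the latest reward: Q ← Q + α(r − Q). The reward mapping, the learning rate `alpha`, and the suspension threshold and minimum-use count are all assumptions made for this sketch.

```python
# Assumed mapping from feedback labels to numeric rewards.
REWARDS = {"success": 1.0, "partial": 0.5, "failure": 0.0}


def update_q(q: float, outcome: str, alpha: float = 0.2) -> float:
    """Move q a fraction alpha of the way toward the reward for this outcome."""
    reward = REWARDS[outcome]
    return q + alpha * (reward - q)


def should_suspend(q: float, uses: int, threshold: float = 0.2, min_uses: int = 5) -> bool:
    """Suspend only after enough uses, so one early failure is not decisive."""
    return uses >= min_uses and q < threshold


# Usage: a skill starting at 0.5 that fails five times in a row.
q = 0.5
for _ in range(5):
    q = update_q(q, "failure")
suspended = should_suspend(q, uses=5)
```

Under these assumed settings, repeated failures pull the value down until the skill crosses the suspension threshold, while repeated successes push it toward 1.0. Without feedback the value never moves.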

Quality Signals

Signs the system is learning well:

•Q-values of good skills trending upward
•New skills being created from process clusters
•Low-quality skills being auto-suspended
•Agents finding relevant skills before tasks

Signs the system needs attention:

•No new skills being created despite many tasks
•All skills having similar Q-values (~0.5)
•No feedback being provided (Q-values frozen)
•Skills being created without source processes
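The warning signs above lend themselves to an automated check. The following is a sketch under stated assumptions: the `SkillStats` record, its fields, and the thresholds `q_tolerance` and `min_tasks` are hypothetical and would need to match whatever the real skill store exposes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SkillStats:
    """Hypothetical per-skill summary used only for this sketch."""
    name: str
    q_value: float
    feedback_count: int
    source_process_ids: List[str] = field(default_factory=list)


def health_warnings(
    skills: List[SkillStats],
    tasks_completed: int,
    skills_created: int,
    q_tolerance: float = 0.05,
    min_tasks: int = 20,
) -> List[str]:
    """Return a warning for each 'needs attention' sign that is present."""
    warnings = []
    if tasks_completed >= min_tasks and skills_created == 0:
        warnings.append("no new skills despite many tasks")
    if skills and all(abs(s.q_value - 0.5) <= q_tolerance for s in skills):
        warnings.append("all Q-values near 0.5")
    if skills and all(s.feedback_count == 0 for s in skills):
        warnings.append("no feedback recorded")
    orphans = [s.name for s in skills if not s.source_process_ids]
    if orphans:
        warnings.append("skills without source processes: " + ", ".join(orphans))
    return warnings
```

An empty result does not prove the system is learning well; it only means none of these four signs were detected.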

Anti-patterns

•Skipping feedback after tasks ("I'll do it later")
•Only providing positive feedback (reluctance to report failures)
•Creating skills from a single impressive instance
•Not searching for existing skills before creating new ones
•Ignoring auto-suspended skills (they need review, not abandonment)