Memory Evolution

Iron Law

EVOLVE ONLY WITH EVIDENCE.

Every evolution action (refine, deprecate, merge, archive, induce) must be justified by:

•Q-value data (usage + outcome signals)
•Usage history (success/failure patterns)
•Explicit feedback signals

Never evolve based on gut feeling or speculation.

Trigger Conditions

This skill is activated by any of the following:

•Scheduled: Every 6 hours (configurable)
•Q-update threshold: 100+ Q-value updates accumulated
•Process threshold: 20+ new Process nodes created
•Manual: User runs /evolve command
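A minimal sketch of the trigger check, assuming the counters are tracked since the last evolution run. `EvolutionState`, `should_evolve`, and the trigger labels are hypothetical names for illustration; only the thresholds come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EvolutionState:
    """Counters accumulated since the last evolution run (hypothetical structure)."""
    last_run: datetime
    q_updates: int = 0
    new_processes: int = 0
    manual_request: bool = False  # set when the user runs /evolve


def should_evolve(state: EvolutionState, now: datetime,
                  interval: timedelta = timedelta(hours=6),
                  q_threshold: int = 100,
                  process_threshold: int = 20) -> list[str]:
    """Return the names of all triggers that currently fire (empty list = do nothing)."""
    fired = []
    if now - state.last_run >= interval:
        fired.append("scheduled")
    if state.q_updates >= q_threshold:
        fired.append("q_update_threshold")
    if state.new_processes >= process_threshold:
        fired.append("process_threshold")
    if state.manual_request:
        fired.append("manual")
    return fired
```

Returning the list of fired triggers, rather than a bare boolean, lets the final report record why a run started.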

Evolution Workflow

Step 1: Load Supporting Skills

Before taking action, load these meta skills for guidance:

code

skill_search("skill-editing")   → How to edit/deprecate
skill_search("skill-creation")  → How to induce new skills
skill_search("learning-from-experience") → Feedback patterns

Step 2: Query Evolution Candidates

Use the query_evolution_candidates tool to find memories needing attention:

code

Candidates by Priority:
├── CRITICAL: Q < 0.2, usage >= 10
│   └── Action: Likely deprecation (consistent failure)
├── HIGH: Q < 0.3, usage >= 5
│   └── Action: Consider refinement (fixable issues)
├── MEDIUM: 90+ days stale
│   └── Action: Consider archival (no recent relevance)
└── LOW: Process clusters without skills
    └── Action: Consider skill induction

Query with different criteria:

•query_evolution_candidates(criteria="low_q") → Low Q-value nodes
•query_evolution_candidates(criteria="stale") → Unused for 90+ days
•query_evolution_candidates(criteria="unlinked_processes") → Processes without skills
•query_evolution_candidates(criteria="similar_pairs") → Potential merge candidates
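The priority tiers above can be sketched as a single classification function. This is an illustration under assumptions, not the tool's actual logic: `classify_priority` and its parameters are hypothetical, and checking the tiers from most to least urgent is an assumed precedence.

```python
from typing import Optional

STALE_DAYS = 90  # staleness cutoff from the MEDIUM tier


def classify_priority(q_value: float, usage_count: int, days_since_use: int,
                      is_unlinked_process: bool = False) -> Optional[str]:
    """Map a node's signals to a priority tier, or None if it needs no attention."""
    if q_value < 0.2 and usage_count >= 10:
        return "CRITICAL"  # likely deprecation
    if q_value < 0.3 and usage_count >= 5:
        return "HIGH"      # consider refinement
    if days_since_use >= STALE_DAYS:
        return "MEDIUM"    # consider archival
    if is_unlinked_process:
        return "LOW"       # consider skill induction
    return None
```

A node with a low Q-value but too few uses falls through to None: there is not yet enough evidence to act on it.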

Step 3: Evaluate Each Candidate

For each candidate, use get_usage_history to understand patterns:

code

get_usage_history(node_id, limit=10)
→ Returns: [{timestamp, outcome, context}, ...]
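To turn the returned records into a pattern, a small summary helper may be enough. The sketch below assumes each record's outcome field is the string "success" or "failure"; the source does not specify the outcome encoding, and `summarize_history` is a hypothetical helper.

```python
from collections import Counter


def summarize_history(history: list[dict]) -> dict:
    """Count outcomes in usage records and compute the failure rate."""
    outcomes = Counter(entry["outcome"] for entry in history)
    total = sum(outcomes.values())
    failures = outcomes["failure"]  # Counter returns 0 for a missing key
    return {
        "total": total,
        "failures": failures,
        "failure_rate": failures / total if total else 0.0,
    }


# Made-up records in the {timestamp, outcome, context} shape shown above.
sample = [
    {"timestamp": "2024-05-01T10:00:00Z", "outcome": "failure", "context": "..."},
    {"timestamp": "2024-05-02T10:00:00Z", "outcome": "failure", "context": "..."},
    {"timestamp": "2024-05-03T10:00:00Z", "outcome": "success", "context": "..."},
    {"timestamp": "2024-05-04T10:00:00Z", "outcome": "failure", "context": "..."},
]
summary = summarize_history(sample)
```

A high failure rate alone does not show whether a fix is identifiable; the context fields still need to be read before choosing between refine and deprecate.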

Decision matrix:

Q-value	Usage Count	Pattern	Action
< 0.2	>= 10	Consistent failure	deprecate
< 0.3	>= 5	Identifiable fix	refine
< 0.3	>= 5	No clear fix	Keep, monitor
any	>= 5	Similarity > 0.95	merge
any	0 for 90+ days	No recent queries	archive
N/A	3+ Processes	Similar triggers	induce skill
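The node-level rows of the matrix can be sketched as one function. `Candidate` and `choose_action` are hypothetical names, and the order in which rows are checked is an assumption, since the matrix does not state a precedence. Skill induction is left out because it operates on clusters of Processes rather than on a single node.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Signals gathered for one node (hypothetical structure)."""
    q_value: float
    usage_count: int
    days_unused: int = 0
    consistent_failure: bool = False  # judged from usage history
    fix_identified: bool = False      # judged from usage history
    max_similarity: float = 0.0       # similarity to the closest same-type node


def choose_action(c: Candidate) -> str:
    """Apply the decision matrix; assumed precedence is top row first."""
    if c.q_value < 0.2 and c.usage_count >= 10 and c.consistent_failure:
        return "deprecate"
    if c.q_value < 0.3 and c.usage_count >= 5:
        return "refine" if c.fix_identified else "monitor"
    if c.usage_count >= 5 and c.max_similarity > 0.95:
        return "merge"
    if c.days_unused >= 90:
        return "archive"
    return "keep"
```

Anything that matches no row is kept, which is the conservative default.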

Step 4: Execute Actions

Deprecate (mark as obsolete)

code

deprecate_node(
    node_id="fact_xxx",
    node_type="Fact",
    reason="Low Q-value (0.15) with 12 uses, consistent failure pattern"
)

•Node will no longer appear in retrieval
•Provenance chain preserved for auditing

Refine (create improved version)

code

refine_node(
    node_id="process_xxx",
    node_type="Process",
    improved_content="...",  # New trigger/action/outcome
    reason="Clarified trigger condition based on failure patterns"
)

•Creates new node with SUPERSEDES relationship
•Old node marked as superseded
•New node starts with Q=0.5

Merge (combine redundant nodes)

code

merge_nodes(
    node_ids=["fact_001", "fact_002"],
    node_type="Fact",
    merged_content="Combined fact with complete information"
)

•Keeps node with highest Q-value
•Merges provenance chains
•Others marked MERGED_INTO
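The three bullets above can be illustrated with a small planning function. This is a sketch of the described behavior, not the implementation of merge_nodes; `plan_merge` and the "id", "q", and "provenance" field names are assumptions.

```python
def plan_merge(nodes: list[dict]) -> dict:
    """Pick the surviving node and combine provenance for a proposed merge."""
    if len(nodes) < 2:
        raise ValueError("a merge needs at least two nodes")
    # Keep the node with the highest Q-value (first one wins on ties).
    survivor = max(nodes, key=lambda n: n["q"])
    others = [n for n in nodes if n is not survivor]
    # Merge provenance chains, survivor first, dropping duplicates in order.
    provenance = []
    for node in [survivor, *others]:
        for source in node["provenance"]:
            if source not in provenance:
                provenance.append(source)
    return {
        "survivor": survivor["id"],
        "merged_into": {n["id"]: survivor["id"] for n in others},
        "provenance": provenance,
    }
```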

Archive (mark as historical)

code

archive_node(
    node_id="fact_xxx",
    node_type="Fact"
)

•Node marked as archived with timestamp
•Still accessible but deprioritized in retrieval

Induce Skill (from Process cluster)

code

induce_skill(
    process_ids=["proc_001", "proc_002", "proc_003"],
    skill_name="debug-memory-leak",
    description="Use when encountering OOM errors",
    content="# Steps\n1. Capture heap dump\n..."
)

•Requires 3+ similar Processes (follow skill-creation guidelines)
•Creates INSTANCE_OF relationships
•Skill written to file system

Step 5: Report Results

Return a structured summary:

json

{
    "candidates_evaluated": 15,
    "actions_taken": {
        "deprecated": 2,
        "refined": 3,
        "merged": 1,
        "archived": 5,
        "induced_skills": 1
    },
    "details": [
        {"node_id": "fact_xxx", "action": "deprecated", "reason": "..."},
        {"node_id": "process_yyy", "action": "refined", "new_id": "process_zzz", "reason": "..."}
    ],
    "summary": "Deprecated 2 low-quality facts, refined 3 processes with fixable issues, induced 1 skill from OOM debugging patterns"
}
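One way to keep the counts consistent with the details list is to derive them from it. In this sketch `build_report` is a hypothetical helper, and the "merged", "archived", and "induced" action labels are assumptions: the example above only shows "deprecated" and "refined".

```python
from collections import Counter

# Maps each "action" label used in details to its key under "actions_taken".
ACTION_KEYS = {
    "deprecated": "deprecated",
    "refined": "refined",
    "merged": "merged",
    "archived": "archived",
    "induced": "induced_skills",  # assumed label for skill induction
}


def build_report(candidates_evaluated: int, details: list[dict], summary: str) -> dict:
    """Assemble the report, counting actions from the details entries."""
    counts = Counter(d["action"] for d in details)
    unknown = set(counts) - set(ACTION_KEYS)
    if unknown:
        raise ValueError(f"unknown action(s): {sorted(unknown)}")
    return {
        "candidates_evaluated": candidates_evaluated,
        "actions_taken": {key: counts[action] for action, key in ACTION_KEYS.items()},
        "details": details,
        "summary": summary,
    }
```

The result can be serialized with json.dumps. Note that deriving counts this way requires one details entry per action taken.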

Evidence Requirements

Action	Minimum Evidence
Deprecate	Q < 0.2, usage >= 10, failure pattern in history
Refine	Q < 0.3, usage >= 5, identifiable improvement path
Merge	Similarity > 0.95, same node type
Archive	No usage for 90+ days
Induce	3+ similar generalizable Processes
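These minimums can serve as a guard that runs immediately before any action executes. The sketch below is table-driven; `MIN_EVIDENCE`, `has_minimum_evidence`, and the evidence field names are hypothetical.

```python
from typing import Callable

# One predicate per action, mirroring the table above.
MIN_EVIDENCE: dict[str, Callable[[dict], bool]] = {
    "deprecate": lambda e: e["q"] < 0.2 and e["usage"] >= 10 and e["failure_pattern"],
    "refine": lambda e: e["q"] < 0.3 and e["usage"] >= 5 and e["improvement_path"],
    "merge": lambda e: e["similarity"] > 0.95 and e["same_node_type"],
    "archive": lambda e: e["days_unused"] >= 90,
    "induce": lambda e: e["similar_processes"] >= 3,
}


def has_minimum_evidence(action: str, evidence: dict) -> bool:
    """Return True only if every required signal is present and meets its threshold."""
    if action not in MIN_EVIDENCE:
        raise ValueError(f"unknown action: {action}")
    try:
        return bool(MIN_EVIDENCE[action](evidence))
    except KeyError:
        return False  # a missing signal counts as no evidence
```

Treating a missing signal as a failed check follows the iron law: an action without its evidence does not run.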

Anti-patterns

DO NOT:

•Deprecate without checking usage history
•Refine when the fundamental approach is wrong (deprecate instead)
•Merge semantically different memories just because their wording is similar
•Induce skills from fewer than 3 Processes (insufficient evidence)
•Archive frequently used memories
•Evolve protected meta skills without explicit justification

Conservative by Default

When in doubt:

•Keep > Deprecate: A marginally useful memory is better than none
•Monitor > Act: Wait for more evidence if uncertain
•Refine > Replace: Improve existing rather than start over

Integration with Q-Learning

Evolution actions feed back into the Q-learning cycle:

code

┌─────────────────────────────────────────────────────┐
│  Low Q-value detected                               │
│      ↓                                              │
│  Evolution evaluates                                │
│      ↓                                              │
│  Action taken (refine/deprecate)                    │
│      ↓                                              │
│  New/updated memory enters retrieval                │
│      ↓                                              │
│  Usage generates new feedback                       │
│      ↓                                              │
│  Q-value updates                                    │
│      ↓                                              │
│  (cycle continues)                                  │
└─────────────────────────────────────────────────────┘

Refined memories get a fresh start (Q=0.5) to prove their value.
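The source does not specify how Q-values are updated. As an illustration only, the sketch below assumes an exponential moving average, Q ← Q + α(r − Q), with reward 1 for success and 0 for failure and a learning rate α = 0.1. The update rule, `update_q`, `replay`, and the outcome labels are all assumptions.

```python
def update_q(q: float, reward: float, alpha: float = 0.1) -> float:
    """Move Q a fraction alpha toward the observed reward (assumed update rule)."""
    return q + alpha * (reward - q)


def replay(outcomes: list[str], q: float = 0.5, alpha: float = 0.1) -> float:
    """Apply a sequence of 'success'/'failure' outcomes, starting from the fresh Q=0.5."""
    for outcome in outcomes:
        reward = 1.0 if outcome == "success" else 0.0
        q = update_q(q, reward, alpha)
    return q


after_successes = replay(["success"] * 5)
after_failures = replay(["failure"] * 5)
```

Under this assumed rule, a refined memory rises above 0.5 only if it succeeds in use, and drifts back toward the low-Q candidate range if it keeps failing.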