ALM Reflection

Name: reflect
Rating: 92
Author: sfw

Analyze accumulated outcomes and corrections, update playbooks, and retrain the TF-IDF classifier.

Instructions

You are orchestrating a reflection cycle. Follow these steps precisely:

Step 1: Load State

Read ~/.claude/alm/state.json and ~/.claude/alm/confidence.json to understand current state.

Step 2: Find Task Types Needing Reflection

Read all .jsonl files in ~/.claude/alm/evaluations/. For each evaluation record, note its taskType. A task type needs reflection if:

•It has 5+ evaluations since the last reflection for that type (check confidence.json[taskType].lastReflection)
•OR it appears in ~/.claude/alm/reflect-queue.json

If $ARGUMENTS specifies a task type, only reflect on that type. If $ARGUMENTS is "all" or empty, reflect on all eligible types.
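
The eligibility rule above can be sketched in Python as follows. This is a minimal illustration, not the plugin's own implementation. It assumes each evaluation record carries a `timestamp` field (a hypothetical name) holding an ISO 8601 string in the same format as lastReflection, that the keys of reflect-queue.json are task types, and that a requested type must still be eligible. None of these is confirmed by the instructions.

```python
import json
from collections import Counter
from pathlib import Path


def load_evaluations(eval_dir):
    """Read every JSON object from the .jsonl files in eval_dir, skipping bad lines."""
    records = []
    for path in sorted(Path(eval_dir).glob("*.jsonl")):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue
                if isinstance(record, dict):
                    records.append(record)
    return records


def types_needing_reflection(records, confidence, queue, requested="", threshold=5):
    """Return the sorted task types that qualify for reflection.

    'timestamp' is an assumed field name; ISO 8601 strings in one consistent
    format compare correctly as plain strings.
    """
    counts = Counter()
    for record in records:
        task_type = record.get("taskType")
        if not task_type:
            continue
        last = confidence.get(task_type, {}).get("lastReflection") or ""
        # A type that was never reflected on counts all of its evaluations.
        if not last or record.get("timestamp", "") > last:
            counts[task_type] += 1
    eligible = {t for t, n in counts.items() if n >= threshold}
    eligible |= set(queue)  # queued types qualify regardless of count
    if requested and requested != "all":
        eligible &= {requested}  # assumption: a requested type must also be eligible
    return sorted(eligible)
```

A call such as `types_needing_reflection(load_evaluations(eval_dir), confidence, queue, requested)` would then give the list of types to process in Step 3.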

Step 3: For Each Task Type Needing Reflection

•Gather data: Collect all evaluations for this type. Separate corrections (correctionDetected: true) from successes. Pay special attention to correctionContext fields (before/after pairs) and learning fields.
•Load current playbook: Check ~/.claude/alm/playbooks/{task-type}.md first. If not found, check ${CLAUDE_PLUGIN_ROOT}/seed-playbooks/{task-type}.md for the seed version.
•Synthesize updated playbook: Using your analysis capabilities:
  - Identify recurring correction patterns (must appear 3+ times for promotion)
  - Extract concrete anti-patterns from before/after pairs
  - Preserve useful existing advice unless contradicted
  - Keep under 800 tokens
  - Write to ~/.claude/alm/playbooks/{task-type}.md
•Update confidence: After writing the playbook, update ~/.claude/alm/confidence.json for this type:
  - Set lastReflection to current ISO timestamp
  - Set source to "personalized"
  - Compute correctionRateAtLastReflect from recent evaluations
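
The data-gathering and confidence-update steps can be sketched as below. The sketch makes two assumptions the instructions do not settle: correctionRateAtLastReflect is stored as a fraction between 0 and 1, and the caller decides which evaluations count as "recent" and passes only those. The function names are illustrative, not part of the plugin.

```python
from datetime import datetime, timezone


def split_evaluations(records):
    """Separate corrections (correctionDetected is true) from successes."""
    corrections = [r for r in records if r.get("correctionDetected") is True]
    successes = [r for r in records if r.get("correctionDetected") is not True]
    return corrections, successes


def updated_confidence_entry(entry, recent_records, now=None):
    """Return a copy of one confidence.json entry after a reflection.

    Assumption: the correction rate is corrections / total over the records
    the caller considers recent, stored as a fraction rounded to 3 places.
    """
    corrections, _ = split_evaluations(recent_records)
    rate = len(corrections) / len(recent_records) if recent_records else 0.0
    now = now or datetime.now(timezone.utc)
    updated = dict(entry)  # keep any other fields already in the entry
    updated["lastReflection"] = now.isoformat()
    updated["source"] = "personalized"
    updated["correctionRateAtLastReflect"] = round(rate, 3)
    return updated
```

Returning a new entry instead of mutating the loaded one makes it easy to write confidence.json only after the playbook has been written successfully.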

Step 4: Cross-Type Pattern Detection

After processing individual types, scan ALL corrections across ALL types. If the same pattern appears in 3+ different task types (e.g., "user always wants terse commit messages"), extract it to ~/.claude/alm/playbooks/_global.md. The _global.md playbook is always injected regardless of task classification.
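
Once corrections have been reduced to short pattern labels, the 3+ task type rule amounts to counting distinct types per label. The sketch below assumes that labelling has already happened during analysis, so it receives hypothetical (task type, pattern label) pairs. Deciding that two corrections express the same pattern remains a judgment call that this code does not make.

```python
from collections import defaultdict


def global_patterns(labelled_corrections, min_types=3):
    """Return pattern labels seen in at least min_types distinct task types.

    labelled_corrections: iterable of (task_type, pattern_label) pairs.
    Repeats within one task type count once, since the rule is about
    the number of different types, not the number of corrections.
    """
    types_by_pattern = defaultdict(set)
    for task_type, pattern in labelled_corrections:
        types_by_pattern[pattern].add(task_type)
    return sorted(p for p, types in types_by_pattern.items() if len(types) >= min_types)
```

Patterns returned here are candidates for _global.md. Patterns confined to one or two types stay in their own playbooks.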

Step 5: Retrain TF-IDF Classifier

Run this command to retrain the classifier from all evaluations:

python3 -c "
import sys, os, json
sys.path.insert(0, os.path.expanduser('${CLAUDE_PLUGIN_ROOT}/scripts'))
from lib.tfidf import TFIDFClassifier

eval_dir = os.path.expanduser('~/.claude/alm/evaluations')
training_data = []
for fname in sorted(os.listdir(eval_dir)):
    if not fname.endswith('.jsonl'):
        continue
    with open(os.path.join(eval_dir, fname)) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
                # Fall back to promptText when promptFingerprint is missing or empty
                text = record.get('promptFingerprint') or record.get('promptText', '')
                task_type = record.get('taskType', '')
                if text and task_type and task_type not in ('trivial', 'unknown'):
                    training_data.append((text, task_type))
            except (json.JSONDecodeError, AttributeError):
                # Skip malformed lines and JSON values that are not objects
                continue

if len(training_data) >= 10:
    classifier = TFIDFClassifier()
    classifier.fit(training_data)
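    model_path = os.path.expanduser('~/.claude/alm/classifier/model.json')
    # Create the classifier directory first in case save() does not
    os.makedirs(os.path.dirname(model_path), exist_ok=True)
    classifier.save(model_path)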
    print(f'Trained TF-IDF model on {len(training_data)} examples across {len(classifier.task_types)} task types')
else:
    print(f'Only {len(training_data)} training examples; need 10+ to train classifier')
"

Step 6: Cleanup

•Clear ~/.claude/alm/reflect-queue.json (write {})
•Update ~/.claude/alm/state.json: set lastReflection to current ISO timestamp
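
The two cleanup writes can be sketched as a single function. It takes the ALM directory as a parameter so it can be tried against a temporary directory first. The function name and the choice to preserve other keys in state.json are assumptions, since the instructions only say to set lastReflection.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def cleanup(alm_dir, now=None):
    """Empty the reflect queue and stamp state.json with the reflection time."""
    alm_dir = Path(alm_dir)
    now = now or datetime.now(timezone.utc)
    # Clearing the queue means writing an empty JSON object.
    (alm_dir / "reflect-queue.json").write_text("{}\n", encoding="utf-8")
    state_path = alm_dir / "state.json"
    try:
        state = json.loads(state_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}
    if not isinstance(state, dict):
        state = {}
    state["lastReflection"] = now.isoformat()  # other keys are left untouched
    state_path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    return state


# Demo against a throwaway directory so the real ~/.claude/alm is not touched.
with tempfile.TemporaryDirectory() as tmp:
    print(cleanup(tmp))
```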

Step 7: Report

Summarize what changed:

•Which task types were reflected on
•How many evaluations and corrections were analyzed per type
•Key patterns discovered
•Whether the TF-IDF classifier was trained/updated
•Any cross-type patterns extracted to _global.md
•Suggested next actions