ALM Reflection
Analyze accumulated outcomes and corrections, update playbooks, and retrain the TF-IDF classifier.
Instructions
You are orchestrating a reflection cycle. Follow these steps precisely:
Step 1: Load State
Read ~/.claude/alm/state.json and ~/.claude/alm/confidence.json to understand current state.
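Loading the two state files can be sketched as below. This is a minimal illustration, not the plugin's own loader: the helper name load_json and the fall-back-to-empty-object behavior for missing or malformed files are assumptions.

```python
import json
from pathlib import Path

# Directory named in the step above.
ALM_DIR = Path.home() / ".claude" / "alm"


def load_json(path):
    """Return the parsed JSON object at `path`.

    ASSUMPTION: a missing or malformed file is treated as an empty
    object so that a first-ever reflection cycle can still proceed.
    """
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


if __name__ == "__main__":
    state = load_json(ALM_DIR / "state.json")
    confidence = load_json(ALM_DIR / "confidence.json")
    print(f"state keys: {sorted(state)}")
    print(f"task types with confidence data: {sorted(confidence)}")
```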
Step 2: Find Task Types Needing Reflection
Read all .jsonl files in ~/.claude/alm/evaluations/. For each evaluation record, note its taskType. A task type needs reflection if:
- It has 5+ evaluations since the last reflection for that type (check confidence.json[taskType].lastReflection)
- OR it appears in ~/.claude/alm/reflect-queue.json
If $ARGUMENTS specifies a task type, only reflect on that type. If "all" or empty, reflect on all eligible types.
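The eligibility rule above could be expressed roughly as follows. Several details are assumptions, not confirmed by this document: that each evaluation record carries an ISO 8601 timestamp field (hypothetically named timestamp), that reflect-queue.json is an object keyed by task type, and that a requested type is still filtered against the eligible set.

```python
from collections import Counter
from datetime import datetime


def parse_iso(ts):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def types_needing_reflection(evaluations, confidence, queue,
                             requested="all", threshold=5):
    """Return the sorted task types that qualify for reflection.

    ASSUMPTIONS: records have a "timestamp" field (hypothetical name),
    all timestamps use the same timezone convention, and `queue` is a
    dict whose keys are task types.
    """
    counts = Counter()
    for rec in evaluations:
        task_type = rec.get("taskType")
        if not task_type:
            continue
        last = confidence.get(task_type, {}).get("lastReflection")
        if last is None:
            counts[task_type] += 1  # never reflected: every record counts
        elif rec.get("timestamp") and parse_iso(rec["timestamp"]) > parse_iso(last):
            counts[task_type] += 1  # recorded after the last reflection
    eligible = {t for t, n in counts.items() if n >= threshold} | set(queue)
    if requested and requested != "all":
        eligible &= {requested}
    return sorted(eligible)
```

If an explicitly requested type should be reflected on even when it is not otherwise eligible, the final filter would return that type directly instead of intersecting.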
Step 3: For Each Task Type Needing Reflection
- Gather data: Collect all evaluations for this type. Separate corrections (correctionDetected: true) from successes. Pay special attention to correctionContext fields (before/after pairs) and learning fields.
- Load current playbook: Check ~/.claude/alm/playbooks/{task-type}.md first. If not found, check ${CLAUDE_PLUGIN_ROOT}/seed-playbooks/{task-type}.md for the seed version.
- Synthesize updated playbook: Using your analysis capabilities:
  - Identify recurring correction patterns (must appear 3+ times for promotion)
  - Extract concrete anti-patterns from before/after pairs
  - Preserve useful existing advice unless contradicted
  - Keep under 800 tokens
  - Write to ~/.claude/alm/playbooks/{task-type}.md
- Update confidence: After writing the playbook, update ~/.claude/alm/confidence.json for this type:
  - Set lastReflection to current ISO timestamp
  - Set source to "personalized"
  - Compute correctionRateAtLastReflect from recent evaluations
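The bookkeeping in this step (splitting corrections from successes, applying the 3+ promotion threshold, and computing a correction rate) might look like the sketch below. Pattern recognition itself is left to your own analysis; here, exact-match counting on a normalized learning string merely stands in for it. The 20-record window for "recent" is an assumed value, and treating learning as a short string is also an assumption.

```python
from collections import Counter


def split_evaluations(records):
    """Separate records flagged correctionDetected: true from the rest."""
    corrections = [r for r in records if r.get("correctionDetected") is True]
    successes = [r for r in records if r.get("correctionDetected") is not True]
    return corrections, successes


def promotable_patterns(corrections, min_count=3):
    """Return learnings seen at least `min_count` times.

    ASSUMPTION: `learning` is a short string; case and surrounding
    whitespace are normalized before counting.
    """
    counts = Counter(
        r["learning"].strip().lower()
        for r in corrections
        if isinstance(r.get("learning"), str) and r["learning"].strip()
    )
    return [p for p, n in counts.most_common() if n >= min_count]


def correction_rate(records, window=20):
    """Fraction of the last `window` records that were corrections.

    ASSUMPTION: the window size is hypothetical; the document only
    says "recent evaluations".
    """
    recent = records[-window:]
    if not recent:
        return 0.0
    return sum(1 for r in recent if r.get("correctionDetected") is True) / len(recent)
```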
Step 4: Cross-Type Pattern Detection
After processing individual types, scan ALL corrections across ALL types. If the same pattern appears in 3+ different task types (e.g., "user always wants terse commit messages"), extract it to ~/.claude/alm/playbooks/_global.md. The _global.md playbook is always injected regardless of task classification.
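The 3+ task type rule can be illustrated with a small grouping pass. As a sketch only, it assumes a pattern is identified by the normalized text of a record's learning field; in practice, deciding that two corrections express the same pattern is a judgment call that simple string matching will not capture.

```python
from collections import defaultdict


def global_patterns(corrections, min_types=3):
    """Return patterns that occur in at least `min_types` distinct task types.

    ASSUMPTION: a pattern is keyed by the lowercased, stripped
    `learning` string of each correction record.
    """
    types_by_pattern = defaultdict(set)
    for rec in corrections:
        learning = rec.get("learning")
        task_type = rec.get("taskType")
        if isinstance(learning, str) and learning.strip() and task_type:
            types_by_pattern[learning.strip().lower()].add(task_type)
    return sorted(
        pattern
        for pattern, types in types_by_pattern.items()
        if len(types) >= min_types
    )
```

Counting distinct task types (a set per pattern) instead of raw occurrences keeps a pattern that repeats many times within a single type from being promoted to _global.md.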
Step 5: Retrain TF-IDF Classifier
Run this command to retrain the classifier from all evaluations:
python3 -c "
import sys, os, json

# Make the plugin's bundled lib package importable.
sys.path.insert(0, os.path.expanduser('${CLAUDE_PLUGIN_ROOT}/scripts'))
from lib.tfidf import TFIDFClassifier

eval_dir = os.path.expanduser('~/.claude/alm/evaluations')
model_path = os.path.expanduser('~/.claude/alm/classifier/model.json')

# Collect (text, task_type) pairs from every evaluation log.
training_data = []
for fname in sorted(os.listdir(eval_dir)):
    if not fname.endswith('.jsonl'):
        continue
    with open(os.path.join(eval_dir, fname), encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            text = record.get('promptFingerprint', record.get('promptText', ''))
            task_type = record.get('taskType', '')
            # Skip labels that carry no training signal.
            if text and task_type and task_type not in ('trivial', 'unknown'):
                training_data.append((text, task_type))

if len(training_data) >= 10:
    classifier = TFIDFClassifier()
    classifier.fit(training_data)
    # Create the classifier directory if this is the first training run.
    os.makedirs(os.path.dirname(model_path), exist_ok=True)
    classifier.save(model_path)
    print(f'Trained TF-IDF model on {len(training_data)} examples across {len(classifier.task_types)} task types')
else:
    print(f'Only {len(training_data)} training examples; need 10+ to train classifier')
"
Step 6: Cleanup
- Clear ~/.claude/alm/reflect-queue.json (write {})
- Update ~/.claude/alm/state.json: set lastReflection to current ISO timestamp
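A minimal sketch of the cleanup, assuming state.json is a JSON object whose other keys should be preserved and that a UTC timestamp is acceptable. The function name finish_reflection is hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def finish_reflection(alm_dir, now=None):
    """Empty the reflect queue and stamp state.json with lastReflection.

    ASSUMPTION: state.json holds a JSON object; any other keys in it
    are kept untouched. Returns the updated state.
    """
    alm_dir = Path(alm_dir)
    now = now or datetime.now(timezone.utc)
    alm_dir.mkdir(parents=True, exist_ok=True)

    # Clear the queue by writing an empty object.
    (alm_dir / "reflect-queue.json").write_text("{}\n", encoding="utf-8")

    state_path = alm_dir / "state.json"
    try:
        state = json.loads(state_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}
    if not isinstance(state, dict):
        state = {}

    state["lastReflection"] = now.isoformat()
    state_path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    return state


if __name__ == "__main__":
    finish_reflection(Path.home() / ".claude" / "alm")
```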
Step 7: Report
Summarize what changed:
- Which task types were reflected on
- How many evaluations and corrections were analyzed per type
- Key patterns discovered
- Whether the TF-IDF classifier was trained/updated
- Any cross-type patterns extracted to _global.md
- Suggested next actions