Skill Improver
Process user feedback from skill retrospectives and update skill files to improve them over time.
When to Use
- User asks to "review skill feedback" or "improve skills based on usage"
- You notice feedback files in .claude/feedback/
- User mentions a skill didn't work well or missed something
- Periodic review (monthly) to incorporate learnings
How It Works
Step 1: Gather Feedback
Read all feedback files in .claude/feedback/:
ls -la .claude/feedback/retro-*.md
Look for patterns:
- Multiple users reporting same missing step → add to skill
- Benchmarks don't match user's context → add context-specific ranges
- Workflow confusing → restructure or add clarifications
- Skill incomplete → add missing sections
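One rough way to tell a pattern from an outlier is to count how many distinct feedback files mention the same phrase. The following is a minimal sketch, assuming the retro-*.md naming shown above; the function name `count_feedback_mentions` and the phrases you search for are hypothetical, not part of any existing tooling.

```shell
# Hypothetical helper: count feedback files that mention a phrase.
# Usage: count_feedback_mentions <phrase> [feedback_dir]
count_feedback_mentions() {
  local phrase="$1"
  local dir="${2:-.claude/feedback}"
  local file count=0
  for file in "$dir"/retro-*.md; do
    # Skip the literal pattern when no files match the glob.
    [ -e "$file" ] || continue
    # -q: quiet, -i: case-insensitive, -F: treat the phrase as a fixed string.
    if grep -qiF -- "$phrase" "$file"; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}
```

For example, `count_feedback_mentions "B2B"` prints the number of unarchived feedback files that mention B2B at least once. Each file counts once no matter how often the phrase appears, which is closer to "number of users reporting" than a raw match count would be.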
Step 2: Identify High-Impact Changes
Prioritize updates based on:
High Priority (do first):
- Missing critical steps that users had to figure out themselves
- Incorrect benchmarks or numbers
- Confusing workflow that requires clarification
- Safety issues or errors
Medium Priority:
- Additional examples or templates
- Better explanations of existing steps
- Alternative approaches for different contexts
Low Priority:
- Nice-to-have additions
- Stylistic improvements
- Minor clarifications
Step 3: Update Skill Files
For each skill needing updates:
3a. Add a "Learnings" Section
If the skill doesn't have one, add at the end:
## Learnings from Use

**[Date]**: [Brief description of what was learned]
- **Feedback**: [What users reported]
- **Update**: [What we changed]
- **Result**: [Expected improvement]
3b. Update Main Content
If feedback suggests core changes:
- Add missing steps to checklists
- Update benchmarks with ranges (e.g., "20-30% for B2C, 50-70% for B2B")
- Restructure workflow if confusing
- Add "Common Pitfalls" section if users make same mistakes
3c. Version the Change
At the top of the skill, track versions:
---
name: skill-name
version: 1.2.0
last_updated: 2026-01-22
changelog:
  - v1.2.0 (2026-01-22): Added missing step for X based on user feedback
  - v1.1.0 (2026-01-15): Updated benchmarks for Y context
  - v1.0.0 (2026-01-01): Initial release
---
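Computing the next version number by hand is easy to get wrong, so a small helper may be useful. This is a sketch only: `bump_version` is a hypothetical name, and it assumes a plain MAJOR.MINOR.PATCH string with no leading zeros and no pre-release suffix. In the changelog above, added steps and updated benchmarks are both recorded as minor bumps.

```shell
# Hypothetical helper: bump one part of a MAJOR.MINOR.PATCH version string.
# Usage: bump_version <version> <major|minor|patch>
bump_version() {
  local version="$1" part="$2"
  local major minor patch rest
  # Split "1.2.0" into its three numeric components.
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"
  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "unknown part: $part" >&2; return 1 ;;
  esac
  echo "$major.$minor.$patch"
}
```

For example, `bump_version 1.2.0 minor` prints `1.3.0`. The helper only computes the new string; editing the version, last_updated, and changelog fields in the skill file is still a manual step.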
Step 4: Archive Processed Feedback
Move processed feedback to archive:
mkdir -p .claude/feedback/archive
mv .claude/feedback/retro-2026-01-22-*.md .claude/feedback/archive/
Keep a summary of learnings in .claude/feedback/SUMMARY.md:
# Feedback Summary

## [Skill Name]

**Total feedback sessions**: 12
**Last updated**: 2026-01-22

**Key learnings**:
- Added step for X (reported by 3 users)
- Updated benchmarks for B2B context (reported by 5 users)
- Clarified workflow around Y (reported by 2 users)

**Patterns**:
- Users in enterprise context need higher benchmarks
- Early-stage startups need more examples
- Non-technical users need clearer explanations of jargon
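The archive step in Step 4 can be wrapped in a function that tolerates an empty match and refuses to overwrite a file that is already archived. This is a minimal sketch under the directory layout used in this document; `archive_feedback` and its date-prefix argument are hypothetical.

```shell
# Hypothetical helper: move processed feedback files into the archive.
# Usage: archive_feedback <date-prefix> [feedback_dir]
archive_feedback() {
  local prefix="$1"
  local dir="${2:-.claude/feedback}"
  local file name moved=0
  mkdir -p "$dir/archive"
  for file in "$dir"/retro-"$prefix"-*.md; do
    # Skip the literal pattern when no files match the glob.
    [ -e "$file" ] || continue
    name="$(basename "$file")"
    # Never overwrite a file that is already in the archive.
    if [ -e "$dir/archive/$name" ]; then
      echo "skipping $name: already archived" >&2
      continue
    fi
    mv "$file" "$dir/archive/$name"
    moved=$((moved + 1))
  done
  echo "Archived $moved file(s)"
}
```

Calling `archive_feedback 2026-01-22` moves that day's retro files and reports how many were moved. Files are moved rather than deleted, so the feedback history stays available for later review.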
Example Workflow
Scenario: product-market-fit skill needs improvement
Step 1: Review Feedback
Read .claude/feedback/retro-2026-01-22-143022.md:
## Feedback

**Missed important steps?** yes

**Improvements needed:**
The Sean Ellis test threshold of 40% seems high for B2B enterprise products. We're at 32% "very disappointed" but our retention is 85% D30 which is excellent. Should the skill mention that thresholds vary by product type?
Step 2: Identify Pattern
Check other feedback files → 3 more users report B2B context needs different benchmarks.
Step 3: Update Skill
Edit .claude/skills/product-market-fit/SKILL.md:
Before:
## Sean Ellis Test (40% Rule)

"How would you feel if you could no longer use [product]?"
- ≥40% "Very disappointed" = Strong PMF
After:
## Sean Ellis Test (Context-Dependent Thresholds)

"How would you feel if you could no longer use [product]?"

**Thresholds by product type:**
- **Consumer B2C**: ≥40% "Very disappointed" = Strong PMF
- **SMB B2B**: ≥35% "Very disappointed" = Strong PMF
- **Enterprise B2B**: ≥30% "Very disappointed" = Strong PMF (longer sales cycles, different buying psychology)

**Why the difference?**
- Enterprise buyers are more rational than emotional
- Switching costs are higher (contracts, integrations)
- Retention is a better PMF signal for B2B (see Step 2)
Add to Learnings section:
## Learnings from Use

**2026-01-22**: Refined Sean Ellis thresholds by product type
- **Feedback**: 4 users reported 40% threshold too high for B2B enterprise
- **Update**: Added context-specific thresholds (B2C 40%, SMB 35%, Enterprise 30%)
- **Result**: More accurate PMF diagnosis for different product types
Step 4: Archive & Track
mv .claude/feedback/retro-2026-01-22-*.md .claude/feedback/archive/
Update .claude/feedback/SUMMARY.md:
## product-market-fit

**Total feedback sessions**: 4
**Last updated**: 2026-01-22

**Key learnings**:
- Added context-specific Sean Ellis thresholds (reported by 4 users)
- B2B needs different benchmarks than B2C

**Next improvements to consider**:
- Add industry-specific retention benchmarks
- Include examples from different verticals
Quality Checklist
Before updating any skill, ensure:
- Feedback is from multiple users (pattern, not outlier)
- Change makes skill more accurate, not just more complex
- Benchmarks are sourced or validated (not anecdotal)
- Update is backward compatible (doesn't break existing workflows)
- Learnings section documents why we made the change
- Version number incremented appropriately (semver)
- Processed feedback archived, not deleted
Feedback Categories
Track feedback by type to identify systemic issues:
Category 1: Missing Steps
Example: "Skill forgot to mention we need to segment cohorts by acquisition channel"
Action: Add step to checklist
Category 2: Incorrect Benchmarks
Example: "40% D30 retention is not 'strong' for our B2B SaaS, it's average"
Action: Update benchmarks with context (B2C vs B2B vs Enterprise)
Category 3: Confusing Workflow
Example: "I didn't know whether to do cohort analysis before or after Sean Ellis test"
Action: Number steps clearly, add workflow diagram
Category 4: Missing Context
Example: "Skill assumes I have 1000+ users, what if I only have 50?"
Action: Add "Early Stage Adaptation" section
Category 5: Tool-Specific Issues
Example: "How do I calculate D30 retention in Google Analytics?"
Action: Add "Implementation in Common Tools" section
Best Practices
Do:
✅ Look for patterns across multiple feedback sessions
✅ Update skills incrementally (small, tested changes)
✅ Document why changes were made (Learnings section)
✅ Preserve feedback history (archive, don't delete)
✅ Version skills so users know what changed
Don't:
❌ Update based on a single piece of feedback (might be an outlier)
❌ Make skills overly complex trying to cover every edge case
❌ Remove content without understanding why it was there
❌ Leave feedback unreviewed for more than 30 days (patterns go unnoticed)
❌ Update without testing the new version
Automation Ideas
Weekly Digest (optional)
Create a script to summarize new feedback:
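#!/bin/bash
# .claude/hooks/learning/weekly-feedback-digest.sh
# Print the "Skills Used" and "Improvements needed:" sections of every
# feedback file modified in the last 7 days.
echo "📊 Feedback Digest (Last 7 Days)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# -maxdepth 1 skips the archive/ subdirectory (already-processed feedback).
find .claude/feedback -maxdepth 1 -name "retro-*.md" -mtime -7 | while IFS= read -r file; do
  echo ""
  echo "File: $(basename "$file")"
  grep -A 5 "Skills Used" "$file"
  grep -A 3 "Improvements needed:" "$file"
done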
Auto-Tag for Review
When feedback mentions specific issues, auto-tag:
- "missing step" → tag for immediate review
- "wrong number" → tag for fact-check
- "confusing" → tag for clarity rewrite
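The keyword rules above could be sketched as a small function that prints one tag per matching rule. The tag labels (`immediate-review`, `fact-check`, `clarity-rewrite`) and the function name `tag_feedback` are hypothetical, and plain substring matching is an assumption: it will miss paraphrases such as "a step was missing".

```shell
# Hypothetical helper: print review tags for one feedback file.
# Usage: tag_feedback <file>
tag_feedback() {
  local file="$1"
  # -q: quiet, -i: case-insensitive, -F: fixed-string match.
  if grep -qiF "missing step" "$file"; then
    echo "immediate-review"
  fi
  if grep -qiF "wrong number" "$file"; then
    echo "fact-check"
  fi
  if grep -qiF "confusing" "$file"; then
    echo "clarity-rewrite"
  fi
}
```

A file can receive several tags, or none. Running it over every file in .claude/feedback/ with a `for` loop gives a quick triage list before the manual review in Step 1.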
Success Metrics
Track improvement over time:
- Feedback frequency: A decreasing rate of reported issues suggests skills are getting better (as long as usage is not also dropping)
- Repeated issues: Should approach zero over time
- User satisfaction: Track "Did this skill help?" responses
- Skill usage: Updated skills should see increased usage
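Feedback frequency can be approximated from the filenames alone. This sketch assumes every feedback file is named like retro-2026-01-22-143022.md (retro-YYYY-MM-DD-HHMMSS.md), as in the example earlier in this document; `feedback_per_month` is a hypothetical name.

```shell
# Hypothetical helper: count feedback files per month from their filenames.
# Assumes names like retro-2026-01-22-143022.md (retro-YYYY-MM-DD-HHMMSS.md).
# Usage: feedback_per_month [feedback_dir]
feedback_per_month() {
  local dir="${1:-.claude/feedback}"
  # Search recursively so archived files are included in the history.
  find "$dir" -type f -name 'retro-*.md' \
    | sed -n 's|.*/retro-\([0-9]\{4\}-[0-9]\{2\}\)-.*|\1|p' \
    | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Each output line is a month followed by a file count, oldest month first. A falling count is only a good sign if the skills are still being used, so it is worth reading alongside the usage metric.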
Meta: This Skill Improves Itself
This skill should follow its own advice:
Learnings from Use:
[To be filled as this skill gets used and improved]
Version History:
- v1.0.0 (2026-01-22): Initial release - framework for skill improvement
Next Steps:
- Review feedback in .claude/feedback/
- Identify patterns and prioritize updates
- Update skill files with improvements
- Document learnings
- Archive processed feedback
- Commit changes with clear message
The more you use this system, the better your skills become. It's a continuous improvement loop.