Content Evaluation Framework
This skill provides a comprehensive, systematic rubric for evaluating educational book chapters and lessons with quantifiable quality standards.
Constitution Alignment: v4.0.1, emphasizing:
- •Principle 1: Specification Primacy ("Specs Are the New Syntax")
- •Section IIa: 4-Layer Teaching Method (Foundation → Application → Integration → Innovation)
- •Section IIb: AI Three Roles Framework (bidirectional co-learning)
- •8 Foundational Principles: Including Factual Accuracy, Coherent Structure, Progressive Complexity
- •Nine Pillars (Section I): AI CLI, Markdown, MCP, AI-First IDEs, Cross-Platform, TDD, SDD, Composable Skills, Cloud-Native
Purpose
Evaluate educational content across 6 categories (5 weighted, plus a pass/fail Constitution Compliance gate) to ensure:
- •Technical correctness and code quality
- •Effective pedagogical design and learning outcomes
- •Clear, accessible writing for target audience
- •Proper structure and organization
- •AI-augmented learning principles (learning WITH AI, not generating FROM AI)
- •Constitution compliance and standards adherence
When to Use This Skill
Invoke this evaluation framework at multiple checkpoints:
- •During Iterative Drafting - Mid-process quality checks to catch issues early
- •After Lesson/Chapter Completion - Comprehensive evaluation before moving to next content unit
- •On-Demand Review Requests - When user explicitly asks for quality assessment
- •Before Validation Phase - Part of the SDD Validate phase workflow for final sign-off
Evaluation Methodology
Scoring System
Multi-Tier Assessment:
- •Excellent (90-100%) - Exceeds standards, exemplary quality
- •Good (75-89%) - Meets all standards with minor improvements possible
- •Needs Work (50-74%) - Meets some standards but requires significant revision
- •Insufficient (<50%) - Does not meet minimum standards, requires major rework
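The tier boundaries above can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the function name `tier_for` is hypothetical, and it assumes that fractional scores falling between the listed ranges (for example 89.5) belong to the lower tier, which the rubric does not state explicitly.

```python
def tier_for(score: float) -> str:
    """Map a percentage score (0-100) to its quality tier.

    Assumption: a score between two listed ranges (e.g. 89.5)
    is placed in the lower tier.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Good"
    if score >= 50:
        return "Needs Work"
    return "Insufficient"


if __name__ == "__main__":
    for value in (92, 80, 62, 40):
        print(value, tier_for(value))
```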
Weighted Categories
The evaluation uses 6 categories: 5 carry a weight, and Constitution Compliance is a pass/fail gate:
| Category | Weight | Focus Area |
|---|---|---|
| Technical Accuracy | 30% | Code correctness, type hints, explanations, examples work as stated |
| Pedagogical Effectiveness | 25% | Show-then-explain pattern, progressive complexity, quality exercises |
| Writing Quality | 20% | Readability (Flesch-Kincaid 8-10), voice, clarity, grade-level appropriateness |
| Structure & Organization | 15% | Learning objectives met, logical flow, appropriate length, transitions |
| AI-First Teaching | 10% | Co-learning partnership demonstrated, Three Roles Framework shown, Nine Pillars aligned, Specs-As-Syntax emphasized |
| Constitution Compliance | Pass/Fail | Must pass all non-negotiable constitutional requirements including Nine Pillars alignment (gate) |
Total Weighted Score Calculation:
Final Score = (Technical × 0.30) + (Pedagogical × 0.25) + (Writing × 0.20) +
(Structure × 0.15) + (AI-First × 0.10)
Constitution Compliance: Must achieve "Pass" status. If "Fail," content cannot proceed regardless of weighted score.
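As a rough sketch of how the formula and the gate interact, the snippet below computes the weighted score and withholds it when the constitution check fails. The dictionary keys and the names `weighted_score` and `final_result` are illustrative assumptions; only the weights come from the table above.

```python
from typing import Optional

# Weights taken from the category table; key names are hypothetical.
WEIGHTS = {
    "technical": 0.30,
    "pedagogical": 0.25,
    "writing": 0.20,
    "structure": 0.15,
    "ai_first": 0.10,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine the five category percentages into one weighted percentage."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return round(total, 2)


def final_result(scores: dict[str, float], constitution_pass: bool) -> Optional[float]:
    """Return the weighted score, or None when the constitution gate fails."""
    if not constitution_pass:
        return None  # content cannot proceed regardless of the weighted score
    return weighted_score(scores)
```

Returning `None` on a failed gate is one possible design; it keeps a high weighted score from being mistaken for a passing result.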
How to Conduct an Evaluation
Step 1: Prepare Context
Before evaluation, gather:
- •Content being evaluated (lesson.md, chapter.md, or section file)
- •Relevant spec, plan, and tasks files from specs/<feature>/
- •Constitution file (.specify/memory/constitution.md)
- •Learning objectives and success criteria for the content unit
- •Output style template used (.claude/output-styles/lesson.md or similar)
Step 2: Load Detailed Rubric
Read the detailed tier criteria for each category:
Read: references/rubric-details.md
This file contains specific criteria defining Excellent/Good/Needs Work/Insufficient for each of the 6 categories.
Step 3: Evaluate Constitution Compliance First
Constitution compliance is a gate - if content fails constitutional requirements, it cannot proceed.
Use the constitution checklist:
Read: references/constitution-checklist.md
Assess all non-negotiable principles and requirements. Mark as Pass or Fail with specific violations noted.
If Constitution Compliance = Fail: Stop evaluation and report violations immediately. Content must be revised before proceeding.
If Constitution Compliance = Pass: Continue to weighted category evaluation.
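The gate-first ordering described in this step might look like the following in code. This is only a sketch: `run_evaluation` and its two callback parameters are hypothetical stand-ins for whatever performs the checklist review and the category scoring.

```python
from typing import Callable


def run_evaluation(
    content: str,
    check_constitution: Callable[[str], list[str]],
    score_categories: Callable[[str], dict[str, float]],
) -> dict:
    """Run the constitution gate first; score categories only on a Pass."""
    violations = check_constitution(content)
    if violations:
        # Gate failed: stop here and report violations, skipping scoring.
        return {"constitution": "Fail", "violations": violations, "scores": None}
    return {
        "constitution": "Pass",
        "violations": [],
        "scores": score_categories(content),
    }
```

Passing the checks in as callables keeps the ordering logic separate from the rubric details, so the early exit is easy to verify on its own.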
Step 4: Score Each Weighted Category
For each of the 5 weighted categories (Technical Accuracy, Pedagogical Effectiveness, Writing Quality, Structure & Organization, AI-First Teaching):
- •Review specific criteria from rubric-details.md for that category
- •Assess content against criteria for each tier
- •Assign tier (Excellent/Good/Needs Work/Insufficient) with score range
- •Record specific evidence - Quote examples, note line numbers, cite specific passages
- •Provide improvement recommendations - Concrete, actionable feedback
Step 5: Calculate Weighted Score
Apply the weighted formula:
Final Score = (Technical × 0.30) + (Pedagogical × 0.25) + (Writing × 0.20) +
(Structure × 0.15) + (AI-First × 0.10)
Convert tier scores to numeric values:
- •Excellent: 95%
- •Good: 82%
- •Needs Work: 62%
- •Insufficient: 40%
(Or use specific numeric score within tier range if warranted)
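One way to apply this conversion is sketched below: use the default value for a tier unless a specific score is supplied, and reject a specific score that falls outside the tier's range. The names `TIER_DEFAULTS`, `TIER_BOUNDS`, and `numeric_score` are hypothetical, and treating each upper bound as exclusive is an assumption made so that fractional scores have a single home.

```python
from typing import Optional

# Default numeric value per tier, as listed above.
TIER_DEFAULTS = {"Excellent": 95, "Good": 82, "Needs Work": 62, "Insufficient": 40}

# Assumed bounds as [low, high): the upper bound is exclusive.
# Excellent uses 101 so that a score of exactly 100 is accepted.
TIER_BOUNDS = {
    "Excellent": (90, 101),
    "Good": (75, 90),
    "Needs Work": (50, 75),
    "Insufficient": (0, 50),
}


def numeric_score(tier: str, specific: Optional[float] = None) -> float:
    """Return the tier default, or a specific score validated against the tier."""
    if tier not in TIER_DEFAULTS:
        raise ValueError(f"unknown tier: {tier!r}")
    if specific is None:
        return TIER_DEFAULTS[tier]
    low, high = TIER_BOUNDS[tier]
    if specific > 100 or not low <= specific < high:
        raise ValueError(f"{specific} is outside the {tier} range")
    return specific
```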
Step 6: Generate Evaluation Report
Use the structured evaluation template:
Read: references/evaluation-template.md
Complete all sections:
- •Executive Summary - Overall score, tier, pass/fail status
- •Category Scores - Table showing each category score, tier, and weight contribution
- •Detailed Findings - Evidence-based assessment for each category
- •Strengths - What the content does well (specific examples)
- •Areas for Improvement - Prioritized list of issues with recommendations
- •Constitution Compliance Status - Pass/Fail with specific principle checks
- •Actionable Next Steps - Concrete tasks to improve content
Step 7: Communicate Results
Present evaluation report with:
- •Clear verdict - Pass/Fail and overall quality tier
- •Evidence-based feedback - Specific quotes and line numbers
- •Prioritized improvements - Most critical issues first
- •Encouragement - Acknowledge strengths and effort
Evaluation Best Practices
Be Objective and Evidence-Based
- •Quote specific passages from content being evaluated
- •Reference line numbers or section headers
- •Compare against objective rubric criteria, not subjective preference
- •Use concrete metrics where possible (word count, readability scores, etc.)
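For the readability metric, the Writing Quality category targets a Flesch-Kincaid grade level of 8-10. The sketch below applies the standard Flesch-Kincaid grade formula to counts that are assumed to come from a separate text-analysis step; the function names are illustrative, and syllable counting itself (usually a heuristic) is deliberately left out.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw word, sentence, and syllable counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def in_target_band(grade: float, low: float = 8.0, high: float = 10.0) -> bool:
    """Check whether a grade level falls inside the 8-10 target band."""
    return low <= grade <= high


if __name__ == "__main__":
    grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
    print(f"grade level {grade:.2f}, in target band: {in_target_band(grade)}")
```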
Focus on Standards, Not Perfection
- •Content rated "Good" (75-89%) is publication-ready with minor polish
- •Content rated "Excellent" (90-100%) exceeds standards; reaching this tier is not required
- •Focus improvements on moving "Needs Work" → "Good" before "Good" → "Excellent"
Provide Actionable Feedback
- •Don't just say "improve clarity" - specify which sentences are unclear and suggest rewrites
- •Don't just say "add examples" - suggest specific example types that would help
- •Prioritize recommendations: critical (blocking issues) → important → nice-to-have
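The critical → important → nice-to-have ordering can be enforced mechanically when assembling feedback. The `Recommendation` fields and the `prioritize` helper below are hypothetical; they only illustrate sorting by the three priority levels named above.

```python
from dataclasses import dataclass

# Lower number sorts first: critical, then important, then nice-to-have.
PRIORITY_ORDER = {"critical": 0, "important": 1, "nice-to-have": 2}


@dataclass
class Recommendation:
    priority: str    # one of the PRIORITY_ORDER keys
    location: str    # where the issue occurs, e.g. a section header
    suggestion: str  # the concrete rewrite or addition being proposed


def prioritize(recommendations: list[Recommendation]) -> list[Recommendation]:
    """Sort recommendations by priority, keeping input order within each level."""
    return sorted(recommendations, key=lambda rec: PRIORITY_ORDER[rec.priority])
```

Because Python's `sorted` is stable, recommendations with the same priority stay in the order they were recorded.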
Respect the Learning Journey
- •Recognize iterative improvement - drafts evolve through multiple passes
- •Celebrate progress and strengths
- •Frame criticism constructively as opportunities for growth
- •Remember: the goal is helping create excellent educational content, not gatekeeping
Quality Gates and Thresholds
Minimum Acceptance Threshold
- •Constitution Compliance: MUST be Pass (gate)
- •Overall Weighted Score: MUST be ≥ 75% (Good or better)
- •No category below 50%: Each individual category must achieve at least "Needs Work" tier
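The three minimum-acceptance conditions can be checked together, as in this sketch. The function name `minimum_acceptance_failures` and the wording of the failure messages are illustrative assumptions; the thresholds are the ones listed above.

```python
def minimum_acceptance_failures(
    constitution_pass: bool,
    overall: float,
    categories: dict[str, float],
) -> list[str]:
    """Return the failed minimum-acceptance checks; an empty list means accepted."""
    failures = []
    if not constitution_pass:
        failures.append("Constitution Compliance is Fail")
    if overall < 75:
        failures.append(f"Overall weighted score {overall}% is below 75%")
    for name, score in categories.items():
        if score < 50:
            failures.append(f"{name} is below 50% ({score}%)")
    return failures
```

Returning the list of failures rather than a bare boolean gives the evaluator the specific evidence needed for the report.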
Recommended for Publication
- •Constitution Compliance: Pass
- •Overall Weighted Score: ≥ 82% (Good tier)
- •Technical Accuracy: ≥ 75% (Good tier) - Critical for credibility
- •Pedagogical Effectiveness: ≥ 75% (Good tier) - Critical for learning outcomes
Exemplary Content (Optional)
- •Overall Weighted Score: ≥ 90% (Excellent tier)
- •At least 3 categories at Excellent tier
- •No categories below Good tier
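The publication and exemplary thresholds can be layered on top of the gate, roughly as follows. This is a sketch under stated assumptions: the category keys mirror the table names, the labels "Blocked" and "Below publication recommendation" are invented for illustration, and the constitution gate is assumed to apply to the exemplary level as well.

```python
def publication_level(
    constitution_pass: bool,
    overall: float,
    categories: dict[str, float],
) -> str:
    """Classify content against the recommended and exemplary thresholds."""
    if not constitution_pass:
        return "Blocked"  # hypothetical label for a failed gate

    excellent_count = sum(1 for score in categories.values() if score >= 90)
    none_below_good = all(score >= 75 for score in categories.values())
    if overall >= 90 and excellent_count >= 3 and none_below_good:
        return "Exemplary"

    if (
        overall >= 82
        and categories["Technical Accuracy"] >= 75
        and categories["Pedagogical Effectiveness"] >= 75
    ):
        return "Recommended for Publication"

    return "Below publication recommendation"  # hypothetical label
```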
Common Evaluation Scenarios
Scenario 1: Mid-Draft Check (Iterative)
Context: Writer requests feedback on partial draft
Approach:
- •Focus on foundational issues (structure, learning objectives, concept scaffolding)
- •Flag critical issues early (technical errors, constitution violations)
- •Provide guidance for remaining sections
- •Don't expect polish - prioritize content completeness and correctness
Scenario 2: Completion Review
Context: Writer believes content is complete and ready for validation
Approach:
- •Conduct full evaluation across all 6 categories
- •Calculate final weighted score
- •Check all quality gates and thresholds
- •Provide comprehensive report with prioritized improvements
- •Determine if content meets publication standards
Scenario 3: Pre-Validation Quality Gate
Context: Content enters SDD Validate phase
Approach:
- •Verify constitution compliance (gate)
- •Confirm minimum acceptance threshold (≥75%)
- •Validate all category scores meet minimums
- •Generate pass/fail recommendation with evidence
- •If fails gate: return to implementation with specific revision tasks
Scenario 4: On-Demand Spot Check
Context: User asks "How's this looking?" for specific section
Approach:
- •Evaluate relevant categories for that section (may not be all 6)
- •Provide quick feedback on specific concerns
- •Highlight any critical issues
- •Suggest improvements without full formal report
- •Use judgment on depth based on context
Resources and References
This skill includes detailed reference materials:
- •references/rubric-details.md - Comprehensive tier criteria for all 6 categories with specific indicators
- •references/constitution-checklist.md - Pass/Fail checklist for constitutional compliance evaluation
- •references/evaluation-template.md - Structured template for consistent evaluation reports
Load these references as needed during evaluation to ensure consistency and thoroughness.
Example Evaluation Flow
User Request: "Please evaluate this lesson draft: book-source/docs/chapter-3/lesson-2.md"
Evaluation Process:
- •Read content: book-source/docs/chapter-3/lesson-2.md
- •Load context: spec, plan, constitution, learning objectives
- •Check constitution compliance: references/constitution-checklist.md
- •Result: Pass (all non-negotiables met)
- •Load detailed rubric: references/rubric-details.md
- •Evaluate each category:
- •Technical Accuracy: Good (80%) - Code works, minor type hint gaps
- •Pedagogical Effectiveness: Excellent (92%) - Strong scaffolding, great exercises
- •Writing Quality: Good (78%) - Clear writing, minor readability improvements
- •Structure & Organization: Good (85%) - Good flow, all LOs met
- •AI-First Teaching: Needs Work (65%) - AI exercises present but weak guidance
- •Calculate weighted score:
- •(80×0.30) + (92×0.25) + (78×0.20) + (85×0.15) + (65×0.10) = 81.85%
- •Final Tier: Good (81.85%)
- •Load template: references/evaluation-template.md
- •Generate report with findings, strengths, improvements, next steps
- •Communicate verdict: "Good (81.85%) - Ready for publication with minor improvements to AI-First Teaching section"
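The arithmetic in the flow above is easy to re-derive in a few lines, which is a cheap guard against transcription slips in the reported total. The variable names here are illustrative; the scores and weights are the ones from this example and from the category table.

```python
# Category scores from the example flow and weights from the category table.
scores = {
    "Technical Accuracy": 80,
    "Pedagogical Effectiveness": 92,
    "Writing Quality": 78,
    "Structure & Organization": 85,
    "AI-First Teaching": 65,
}
weights = {
    "Technical Accuracy": 0.30,
    "Pedagogical Effectiveness": 0.25,
    "Writing Quality": 0.20,
    "Structure & Organization": 0.15,
    "AI-First Teaching": 0.10,
}

# Per-category contribution to the final score, rounded for display.
contributions = {name: round(scores[name] * weights[name], 2) for name in scores}
total = round(sum(scores[name] * weights[name] for name in scores), 2)

for name, value in contributions.items():
    print(f"{name:<28}{value:>6.2f}")
print(f"{'Weighted total':<28}{total:>6.2f}")
```

Listing each contribution alongside the total also gives the "weight contribution" column that the Category Scores table in the report calls for.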
Use this skill to maintain consistent, objective, evidence-based quality standards for all educational content.