/review-tdd-plan Skill
Review TDD runbooks to ensure proper RED/GREEN discipline and non-prescriptive implementation guidance.
Purpose
Critical anti-pattern: Runbooks that prescribe exact implementation code violate TDD principles. Tests should drive implementation, not scripts.
What this skill does:
- •Detects prescriptive implementation code in GREEN phases
- •Identifies RED/GREEN sequencing violations
- •Checks for proper minimal implementation guidance
- •Validates stop conditions and error handling
Target users: T2 agents (sonnet for review)
Review Criteria
1. GREEN Phase Anti-Pattern (CRITICAL)
Violation: GREEN phase contains implementation code
**GREEN Phase:**
**Implementation**: Add load_config() function
```python
def load_config(config_path: Path) -> dict:
with open(config_path, 'r', encoding='utf-8') as f:
config = yaml.safe_load(f)
return config
**Why this is wrong:** - Prescribes exact code - Agent becomes code copier, not implementer - Test doesn't drive implementation - Violates TDD RED→GREEN discovery **Correct approach:** ```markdown **GREEN Phase:** **Implementation**: Minimal load_config() to pass tests **Behavior:** - Read YAML file from config_path - Return parsed dict - Must pass test assertions from RED phase **Hint**: Use yaml.safe_load(), Path.open()
2. RED/GREEN Sequencing
Check: Will test actually fail in RED phase before GREEN implementation?
Common violations:
- •Complete function signatures in first cycle (tests pass immediately)
- •All parameters/features added at once (no incremental RED→GREEN)
- •Error handling in "minimal" implementation (should be separate cycle)
Correct pattern:
- •Cycle X.1: Minimal happy path only
- •Cycle X.2: Add error handling
- •Cycle X.3: Add optional parameters
- •Cycle X.4: Add validation modes
3. Implementation Hints vs Prescription
Acceptable: Implementation hints for sequencing
**Implementation Hint**: Happy path only - no error handling, no validation
Violation: Prescriptive code blocks
# This tells agent EXACTLY what to write
def compose(fragments, output):
# Implementation here
4. Test Specifications
Must have:
- •Specific test name
- •Expected failure message
- •File location
- •Why it will fail
Good example:
**Expected failure:**
ImportError: cannot import name 'compose' from 'claudeutils.compose'
**Why it fails**: Function doesn't exist yet
5. Weak RED Phase Assertions (CRITICAL)
Violation: RED test only verifies structure, not behavior
Indicators:
- •Test only checks
exit_code == 0orexit_code != 0 - •Test only checks key existence (
"KEY" in dict) without value verification - •Test only checks class/method existence (would pass with
passbody) - •Test has no mocking for I/O-dependent behavior
Check: For each RED phase, ask: "Would a stub that returns a constant/empty value pass this test?" If yes → VIOLATION: weak assertion
Correct pattern:
- •Assert on output content, mock interactions, or computed values
- •Mock external dependencies and verify interaction
- •Use fixtures for filesystem state
6. Metadata Accuracy
Check: Total Steps in Weak Orchestrator Metadata matches actual cycle count
- •Count all
## Cycle X.Y:or### Cycle X.Y:headers - •Compare to metadata value
- •If mismatch → VIOLATION: metadata inaccurate
7. Empty-First Cycle Ordering
Warning: First cycle in a phase tests empty/degenerate case
Check: Does Cycle X.1 test empty input, no-op, or missing data?
- •If empty case requires special handling → acceptable
- •If empty case arises naturally from list processing → WARNING: reorder to test simplest happy path first
8. File Reference Validation (CRITICAL)
Violation: Runbook references file paths that don't exist in the codebase
Check: Extract all file paths from:
- •Common Context "Project Paths" section
- •RED phase "Verify RED" commands (e.g.,
pytest tests/test_foo.py) - •GREEN phase "Changes" file references
- •GREEN phase "Verify GREEN" and "Verify no regression" commands
For each path: Use Glob to verify the file exists. If not found, try fuzzy match (e.g., tests/test_account*.py when tests/test_account.py is referenced).
Classification:
- •File doesn't exist and no similar file found → CRITICAL: path fabricated
- •File doesn't exist but similar files found → CRITICAL: wrong path (suggest correct files)
- •Test function referenced doesn't exist in target file → CRITICAL: function not found (use Grep to locate)
Why this matters: Runbooks with wrong file paths fail immediately at execution. This is a complete blocker, not a quality issue.
Review Process
Phase 1: Scan for Code Blocks
Use Grep tool to find code blocks in GREEN phases:
Pattern: ^\*\*GREEN Phase:\*\*
Context: Use -A 20 to see 20 lines after each match
Secondary check: Look for ```python markers in context
For each code block found:
- •Check if it's implementation code (not test code)
- •Check if it's in GREEN phase
- •Mark as VIOLATION
Phase 2: Validate File References
Extract all file paths referenced in the runbook (Common Context, RED/GREEN phases, verify commands). For each path:
- •Use Glob to check existence
- •If not found, search for similar files (e.g.,
tests/test_account*.py) - •Use Grep to verify referenced test functions exist in their target files
- •Mark missing paths as CRITICAL violations with suggested corrections
This phase catches runbooks generated with stale or assumed file paths.
Phase 3: Analyze Cycles
For each cycle (X.Y):
- •Extract RED phase: What test assertions?
- •Extract GREEN phase: What implementation guidance?
- •Check sequence: Will RED fail before GREEN?
Sequencing check:
- •If cycle X.1 includes complete signature + error handling → VIOLATION
- •If cycle X.1 includes all features → VIOLATION
- •If "minimal implementation" has 10+ lines → WARNING
Phase 4: Generate Report
Structure:
# TDD Runbook Review: {name}
## Summary
- Total cycles: N
- Violations found: N critical, N warnings
- Overall assessment: PASS | NEEDS REVISION
## Critical Issues
### Issue 1: GREEN phase contains implementation code
**Location**: Cycle 2.1, line 490-528
**Problem**: Complete load_config() implementation prescribed
**Recommendation**: Replace with behavior description
### Issue 2: RED/GREEN sequencing violation
**Location**: Cycle 3.1
**Problem**: "Minimal" implementation includes all features
**Recommendation**: Split into incremental cycles
## Warnings
### Warning 1: Large "minimal" implementation
**Location**: Cycle 4.1
**Details**: 50 lines in first cycle suggests not minimal
## Recommendations
1. **Remove implementation code from GREEN phases**
- Describe behavior, not code
- Provide hints for approach, not exact solution
2. **Restructure cycles for proper RED→GREEN**
- Cycle X.1: Happy path only
- Cycle X.2: Error handling
- Cycle X.3: Optional features
3. **Add implementation hints where sequencing matters**
- "Happy path only - no error handling"
- "Hardcoded separator '\n---\n\n'"
Output Format
Report file: plans/<feature-name>/reports/runbook-review.md
Return to caller: Summary + path to report
Review complete: 6 violations found **Critical:** - 4 cycles contain implementation code - 2 cycles violate RED/GREEN sequence **Report**: plans/unification/consolidation/reports/runbook-review.md **Recommendation**: Apply fixes before execution
Invocation
Automatic: /plan-tdd Phase 5 delegates to tdd-plan-reviewer agent Manual: Delegate to tdd-plan-reviewer agent with runbook path
Integration
Workflow:
/design (TDD mode) → /plan-tdd → tdd-plan-reviewer agent → [apply fixes] → prepare-runbook.py → /orchestrate
Automatic review: /plan-tdd Phase 5 triggers tdd-plan-reviewer agent automatically
Key Principles
- •Tests drive implementation - Not scripts
- •Minimal means minimal - One behavior per cycle
- •RED must fail - Before GREEN can pass
- •Describe behavior - Not code
- •Provide hints - Not solutions
Created: 2026-01-26 Purpose: Prevent prescriptive TDD runbooks, ensure proper RED/GREEN discipline