TDD: Test-Driven Development Methodology

Execute tasks using strict TDD (RED → GREEN → REFACTOR). Outcome: Tasks completed with Happy/Failure tests passing, minimal code shipped.

Iron Law

code

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

Wrote code before the test? Delete it. Start over. Don't keep it as "reference." Don't "adapt" it. Delete means delete. Implement fresh from tests.

Rules

•2 tests per Test Opportunity (TO): 1 Happy path, 1 Failure path — then stop
•Scoped execution: Never run repo-wide tests; use --testPathPattern, --findRelatedTests, or per-file lint
•YAGNI: No abstractions unless test forces it or ≥2 call sites exist
•Anti-flake: Use fake timers, stubs, seeded RNG

Step 1 - Generate TDD TODO List

•
Action — ParseTaskList: Extract tasks from ARGUMENTS or thread context
- •If no clear tasks → stop and ask for guidance
•Action — IdentifyTestOpportunities: Derive TOs (smallest behavior unit: function, route, bug fix, acceptance criterion)
•
Action — TransformToTDD: Convert each TO to cycle using TodoWrite:
- •RED: Happy — {test} → RED: Failure — {test} → GREEN: Minimal impl → REFACTOR: Tidy → COMMIT
•Action — VerifyScope: Confirm TODO contains ONLY assigned tasks

Step 2 - RED Phase: Write Failing Tests

•
Action — WriteHappyTest: Write first failing test (happy path)
- •Execute only this test/file, not entire suite
•Action — WriteFailureTest: Write second failing test (primary failure mode)
•
Action — VerifyRed: MANDATORY — Confirm each test:
- •Fails (not errors)
- •Fails for expected reason (feature missing, not typo)
- •If passes → you're testing existing behavior; fix test

Step 3 - GREEN Phase: Minimal Implementation

•
Action — ImplementMinimal: Write least code to pass tests
- •No extra branches, params, or dependencies unless test forces them
•
Action — VerifyGreen: MANDATORY — Run tests (narrowest scope)
- •If fail → fix code, not test
- •Remove any speculative code not forced by tests

Step 4 - REFACTOR Phase: Clean Code

•
Action — RefactorSafely: Improve only if duplication ≥3 OR readability materially improves
- •Keep tests green; If tests fail → revert
•
Action — HandleLintFailures: Apply in order until clear:
1. •Guard clauses, split compound expressions
2. •Extract tiny private helpers (same file)
3. •Hoist literals to file constants
4. •Split into orchestrator + helpers
5. •Only if still failing: same-directory helper module

Step 5 - Loop or Complete

•If more TOs → return to Step 2
•Else → proceed to Step 6

Step 6 - Commit & Report

•Action — CommitCode: Conventional format (feat({task}): description)
•
Action — GenerateReport:
- •Summary: Tasks completed, test status (✅ Happy ✅ Failure), files modified
- •Artifacts: Test helpers, mocks, fixtures created
- •API Surface: New/modified exports with signatures
- •Patterns: Code/testing patterns to follow
- •Deferred: Coverage gaps for follow-up

Red Flags — STOP and Restart

If any of these occur, delete code and start over with TDD:

Red Flag	Why It's Wrong
Code written before test	Violates Iron Law
Test passes immediately	Testing existing behavior, not new
Can't explain why test failed	Don't understand what you're testing
"Just this once" thinking	Rationalization — TDD has no exceptions
Keeping code "as reference"	You'll adapt it; that's tests-after

When Stuck

Problem	Solution
Don't know how to test	Write wished-for API first, then assert on it
Test too complicated	Design too complicated — simplify interface
Must mock everything	Code too coupled — use dependency injection
Test setup huge	Extract helpers; still complex? Simplify design

Pre-Completion Checklist

Before marking complete, verify:

• Every new function has a test
• Watched each test fail before implementing
• Each failure was for expected reason
• Wrote minimal code to pass
• All tests pass, output clean
• Mocks used only when unavoidable