TDD: Test-Driven Development Methodology
Execute tasks using strict TDD (RED → GREEN → REFACTOR). Outcome: Tasks completed with Happy/Failure tests passing, minimal code shipped.
Iron Law
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Wrote code before the test? Delete it. Start over. Don't keep it as "reference." Don't "adapt" it. Delete means delete. Implement fresh from tests.
Rules
- •2 tests per Test Opportunity (TO): 1 Happy path, 1 Failure path — then stop
- •Scoped execution: Never run repo-wide tests; use `--testPathPattern`, `--findRelatedTests`, or per-file lint
- •YAGNI: No abstractions unless a test forces it or ≥2 call sites exist
- •Anti-flake: Use fake timers, stubs, seeded RNG
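As a rough illustration of the anti-flake rule, the sketch below passes a seeded RNG into the code under test instead of letting it call `Math.random`. The names `mulberry32` (a small public-domain PRNG) and `pickOne` are hypothetical helpers chosen for this example, not part of any task.

```typescript
// Hypothetical helper: a small seeded PRNG (mulberry32) so "random" test data is repeatable.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical code under test: the RNG is a parameter, so tests control it.
function pickOne<T>(items: readonly T[], rng: () => number): T {
  if (items.length === 0) throw new Error("items must not be empty");
  return items[Math.floor(rng() * items.length)];
}

// Two generators built from the same seed produce the same pick.
const phases = ["red", "green", "refactor"];
const firstPick = pickOne(phases, mulberry32(42));
const secondPick = pickOne(phases, mulberry32(42));
console.log(firstPick === secondPick); // true: same seed, same pick
```

A test can also pass a stub such as `() => 0` to force a specific branch without any randomness.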
Step 1 - Generate TDD TODO List
- •Action — ParseTaskList: Extract tasks from ARGUMENTS or thread context
- •If no clear tasks → stop and ask for guidance
- •Action — IdentifyTestOpportunities: Derive TOs (smallest behavior unit: function, route, bug fix, acceptance criterion)
- •Action — TransformToTDD: Convert each TO to cycle using TodoWrite:
- •RED: Happy — {test} → RED: Failure — {test} → GREEN: Minimal impl → REFACTOR: Tidy → COMMIT
- •Action — VerifyScope: Confirm TODO contains ONLY assigned tasks
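One possible way to picture the TransformToTDD action: each Test Opportunity expands into exactly five TODO entries. The `TestOpportunity` shape, its field names, and the entry wording are assumptions made for this sketch; the actual TodoWrite payload is not specified here.

```typescript
// Hypothetical shape of a Test Opportunity; field names are illustrative only.
interface TestOpportunity {
  behavior: string; // smallest behavior unit, e.g. a function or route
  happy: string;    // description of the happy-path test
  failure: string;  // description of the primary failure-path test
}

// Expand one TO into the RED, RED, GREEN, REFACTOR, COMMIT cycle.
function toTddTodos(to: TestOpportunity): string[] {
  return [
    `RED: Happy: ${to.happy}`,
    `RED: Failure: ${to.failure}`,
    `GREEN: Minimal impl for ${to.behavior}`,
    `REFACTOR: Tidy ${to.behavior}`,
    `COMMIT: ${to.behavior}`,
  ];
}

// Flatten all TOs into a single ordered TODO list.
function buildTodoList(tos: TestOpportunity[]): string[] {
  return tos.reduce<string[]>((all, to) => all.concat(toTddTodos(to)), []);
}
```

Because every TO yields the same five entries, a TODO list whose length is not a multiple of five is a quick hint that scope has drifted.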
Step 2 - RED Phase: Write Failing Tests
- •Action — WriteHappyTest: Write first failing test (happy path)
- •Execute only this test/file, not entire suite
- •Action — WriteFailureTest: Write second failing test (primary failure mode)
- •Action — VerifyRed: MANDATORY — Confirm each test:
- •Fails on an assertion (does not error out before asserting)
- •Fails for expected reason (feature missing, not typo)
- •If passes → you're testing existing behavior; fix test
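The VerifyRed distinction between a test that fails and one that errors can be sketched without a test runner. Everything below (`assertionFailure`, `expectEqual`, `classify`, `slugifyStub`) is a hypothetical stand-in; a real runner such as Jest reports the same distinction in its own output.

```typescript
// Tag assertion failures so they can be told apart from any other thrown error.
function assertionFailure(message: string): Error {
  const err = new Error(message);
  err.name = "AssertionFailure";
  return err;
}

function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw assertionFailure(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

type Outcome = "pass" | "fail" | "error";

// "fail" = an assertion did not hold; "error" = the test blew up for another reason.
function classify(testBody: () => void): Outcome {
  try {
    testBody();
    return "pass";
  } catch (e) {
    return e instanceof Error && e.name === "AssertionFailure" ? "fail" : "error";
  }
}

// Feature is missing: the stub returns nothing useful, so the assertion fails (good RED).
const slugifyStub = (_input: string): string => "";
const missingFeature = classify(() => expectEqual(slugifyStub("Hello World"), "hello-world"));

// Broken test setup: throws before any assertion runs (not a valid RED).
const brokenSetup = classify(() => { JSON.parse("{not json"); });

console.log(missingFeature, brokenSetup); // fail error
```

Only the `"fail"` outcome counts as RED; `"error"` means the test itself needs fixing, and `"pass"` means it exercises behavior that already exists.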
Step 3 - GREEN Phase: Minimal Implementation
- •Action — ImplementMinimal: Write least code to pass tests
- •No extra branches, params, or dependencies unless test forces them
- •Action — VerifyGreen: MANDATORY — Run tests (narrowest scope)
- •If fail → fix code, not test
- •Remove any speculative code not forced by tests
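A minimal sketch of what "least code to pass" can look like, assuming a hypothetical `parsePort` task with one happy test and one failure test. Plain throws stand in for a test runner so the block runs on its own.

```typescript
// The two tests that force the implementation.
function happyTest(): void {
  if (parsePort("8080") !== 8080) throw new Error("expected 8080");
}

function failureTest(): void {
  let threw = false;
  try {
    parsePort("abc");
  } catch {
    threw = true;
  }
  if (!threw) throw new Error("expected non-numeric input to be rejected");
}

// GREEN: only what the two tests demand.
// Range checks and empty-string handling are deliberately absent:
// no test forces them yet, so they belong in the Deferred list, not in the code.
function parsePort(input: string): number {
  const port = Number(input);
  if (!Number.isInteger(port)) {
    throw new Error(`invalid port: ${input}`);
  }
  return port;
}

happyTest();
failureTest();
```

If a later TO requires rejecting ports outside 1 to 65535, that behavior gets its own RED tests first.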
Step 4 - REFACTOR Phase: Clean Code
- •Action — RefactorSafely: Improve only if the same code is duplicated ≥3 times OR readability materially improves
- •Keep tests green; if tests fail → revert the refactor
- •Action — HandleLintFailures: Apply in order until clear:
- •Guard clauses, split compound expressions
- •Extract tiny private helpers (same file)
- •Hoist literals to file constants
- •Split into orchestrator + helpers
- •Only if still failing: same-directory helper module
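The first three lint remedies (guard clauses, splitting compound expressions, hoisting literals) might look like the before/after below. `shippingCost` and its rates are invented for illustration; the point is that behavior stays identical, so the existing tests stay green.

```typescript
// Before: nested branches and magic numbers.
function shippingCostBefore(weightKg: number, express: boolean): number {
  if (weightKg > 0) {
    if (express) {
      return weightKg * 2.5 + 10;
    } else {
      return weightKg * 2.5;
    }
  } else {
    throw new Error("weight must be positive");
  }
}

// After: literals hoisted to file constants.
const RATE_PER_KG = 2.5;
const EXPRESS_SURCHARGE = 10;

function shippingCost(weightKg: number, express: boolean): number {
  // Guard clause; written as !(x > 0) so NaN is rejected exactly as before.
  if (!(weightKg > 0)) throw new Error("weight must be positive");
  // Compound expression split into a named intermediate.
  const base = weightKg * RATE_PER_KG;
  return express ? base + EXPRESS_SURCHARGE : base;
}
```

Note the guard condition: rewriting `weightKg > 0` as `weightKg <= 0` would silently change the result for `NaN`, which is the kind of regression the "revert if tests fail" rule is meant to catch.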
Step 5 - Loop or Complete
- •If more TOs → return to Step 2
- •Else → proceed to Step 6
Step 6 - Commit & Report
- •Action — CommitCode: Conventional format (`feat({task}): description`)
- •Action — GenerateReport:
- •Summary: Tasks completed, test status (✅ Happy ✅ Failure), files modified
- •Artifacts: Test helpers, mocks, fixtures created
- •API Surface: New/modified exports with signatures
- •Patterns: Code/testing patterns to follow
- •Deferred: Coverage gaps for follow-up
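For a concrete picture of Step 6, the sketch below formats a commit subject and renders the five report sections. The `TddReport` type and the one-line-per-section layout are assumptions; only the `feat({task}): description` pattern and the section names come from the steps above.

```typescript
// Hypothetical report shape mirroring the five sections listed above.
interface TddReport {
  summary: string;
  artifacts: string[];
  apiSurface: string[];
  patterns: string[];
  deferred: string[];
}

// Conventional commit subject: feat({task}): description
function formatCommit(task: string, description: string): string {
  return `feat(${task}): ${description}`;
}

// Render one line per section; empty sections are shown as "none".
function renderReport(report: TddReport): string {
  const section = (title: string, items: string[]): string =>
    `${title}: ${items.length > 0 ? items.join("; ") : "none"}`;
  return [
    `Summary: ${report.summary}`,
    section("Artifacts", report.artifacts),
    section("API Surface", report.apiSurface),
    section("Patterns", report.patterns),
    section("Deferred", report.deferred),
  ].join("\n");
}

console.log(formatCommit("parse-port", "reject non-numeric input"));
// feat(parse-port): reject non-numeric input
```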
Red Flags — STOP and Restart
If any of these occur, delete code and start over with TDD:
| Red Flag | Why It's Wrong |
|---|---|
| Code written before test | Violates Iron Law |
| Test passes immediately | Testing existing behavior, not new |
| Can't explain why test failed | Don't understand what you're testing |
| "Just this once" thinking | Rationalization — TDD has no exceptions |
| Keeping code "as reference" | You'll adapt it; that's tests-after |
When Stuck
| Problem | Solution |
|---|---|
| Don't know how to test | Write wished-for API first, then assert on it |
| Test too complicated | Design too complicated — simplify interface |
| Must mock everything | Code too coupled — use dependency injection |
| Test setup huge | Extract helpers; still complex? Simplify design |
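The "Must mock everything" row can be illustrated with a small dependency-injection sketch: the time source is passed in as a parameter, so a test supplies a plain function and no mocking library is involved. `Clock` and `isExpired` are hypothetical names.

```typescript
// A clock is just a function returning epoch milliseconds.
type Clock = () => number;

// Production callers omit the second argument and get the real clock;
// tests pass a fixed clock instead of patching globals.
function isExpired(expiresAtMs: number, now: Clock = Date.now): boolean {
  return now() >= expiresAtMs;
}

// Happy path: one millisecond before expiry.
console.log(isExpired(1000, () => 999)); // false
// Failure path: exactly at expiry.
console.log(isExpired(1000, () => 1000)); // true
```

The same shape works for other hard-to-control collaborators (RNG, network, file system): accept them as parameters with a sensible default, and the test setup shrinks to a one-line stub.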
Pre-Completion Checklist
Before marking complete, verify:
- • Every new function has a test
- • Watched each test fail before implementing
- • Each failure was for expected reason
- • Wrote minimal code to pass
- • All tests pass, output clean
- • Mocks used only when unavoidable