TDD Workflow
This workflow guides AI agents through implementing features using Uncle Bob's Test-Driven Development principles from the Clean Code video series. Follow these steps rigorously when implementing any feature.
Workflow Steps
Step 1: Understand the Feature
- •Analyze what behavior needs to exist
- •Identify the acceptance criteria
- •Break down into testable increments
Step 2: List Test Cases
- •Enumerate all scenarios that need testing
- •Start with degenerate cases (null, empty, single element)
- •Progress to more complex scenarios
- •Order from simplest to most complex
Step 3: RED Phase - Write Failing Test
- •Write a small test for the next increment
- •Run the test to see it fail
- •The failure message should clearly indicate what's missing
- •Do not write production code yet
- •Stop as soon as any test fails (including compilation errors)
Step 4: GREEN Phase - Write Minimum Code
- •Write ONLY enough production code to make the test pass
- •Use "fake it till you make it" - return constants, use trivial implementations
- •Don't worry about elegance yet - just make it pass
- •Run the test to see it pass
- •Do not add extra functionality
Step 5: REFACTOR Phase - Clean Up
- •Clean up both production code AND test code
- •Apply /naming workflow for clear, intention-revealing names
- •Apply /functions workflow for proper function structure
- •Remove duplication
- •Run tests to ensure they still pass
- •Only restructure, don't add features
Step 6: Repeat Cycle
- •Return to Step 3 for the next test case
- •Continue until all scenarios are covered
- •Cycle time should be seconds to minutes, not hours
Step 7: Final Review
- •Apply /professional workflow for quality standards
- •Ensure all tests pass
- •Verify code meets Clean Code principles
The Three Laws of TDD
The three laws are disciplines that must be followed like a doctor washing hands before surgery:
First Law
"You are not allowed to write any production code until you have first written a failing unit test."
- •Before writing ANY production code, you must write a test for that code
- •The test will fail because the production code doesn't exist yet
- •This is where "test first" comes from
- •Ask yourself: "What test must I write that will force me to write the code I know I want to write?"
Second Law
"You are not allowed to write more of a unit test than is sufficient to fail, and not compiling is failing."
- •As soon as a test fails for ANY reason (including compilation errors), STOP writing the test
- •Start writing production code to make it pass
- •Even a single compilation error means you must switch to production code
Third Law
"You are not allowed to write more production code than is sufficient to pass the currently failing test."
- •Once a test fails, write ONLY enough production code to make it pass
- •Then go back to writing more test code
- •This creates a very short cycle, roughly 20 to 30 seconds long
The Red-Green-Refactor Cycle
RED Phase
- •Write a failing test
- •The test should fail for a specific, expected reason
- •See the test fail before making it pass (validates the test works)
- •"I want to see this fail"
GREEN Phase
- •Write the minimum production code to make the test pass
- •Use "fake it till you make it" - return constants, use trivial implementations
- •Don't worry about elegance yet - just make it pass
- •"Make it work by hook or crook"
REFACTOR Phase
- •Clean up both production code AND test code
- •Remove duplication
- •Improve names
- •Extract methods
- •"Refactoring is the reward for making tests pass"
- •Tests are part of your system - treat them with the same care as production code
- •"Refactoring never appears on a schedule - you do it all the time like washing your hands"
TDD Cycle Timing
Uncle Bob's guidance:
- •The cycle should be seconds to minutes, not hours
- •If you're spending more than a few minutes in any phase, the step is too big
- •Write the simplest failing test, write the simplest passing code
- •"If you follow these three laws, you'll be stuck in a cycle that is perhaps 20 seconds long."
- •"Little test, little code, little test, little code."
The Fundamental TDD Principle
"As the tests get more specific, the code gets more generic."
- •Each new test case makes the tests more specific
- •To pass each test, you generalize the production code
- •This is how algorithms emerge incrementally
- •The production code becomes more general with each passing test
Clean Test Principles
F.I.R.S.T Properties
Tests should be:
- •Fast: Tests should run quickly (FitNesse runs 2000 tests in about a minute)
- •Independent: Tests should not depend on each other
- •Repeatable: Tests should produce the same results every time
- •Self-validating: Tests should have a Boolean output - pass or fail
- •Timely: Tests should be written before the production code
The Single Assert Rule (AAA/Triple-A)
Every test should have ONE logical assertion following ONE logical action:
- •Arrange: Set up the test fixture (the state needed to run the test)
- •Act: Call the function or perform the action being tested
- •Assert: Verify the result is correct
- •Annihilate (optional 4th A): Clean up any persistent state
The rule constrains act-assert pairs, not physical assert statements:
- •Multiple physical asserts forming ONE logical assertion is OK
- •act-assert-act-assert-act-assert is NOT OK
- •"A test is a Boolean operation - true or false, pass or fail, red or green"
Also Known As
- •Given-When-Then (BDD style)
- •Build-Operate-Check
- •Arrange = Given, Act = When, Assert = Then
Composed Assertions
When you have multiple physical assertions, compose them into a single well-named assertion function:
// Instead of:
assertTrue(hvac.isHeaterOn());
assertTrue(hvac.isFanOn());
assertFalse(hvac.isCoolerOn());
// Use:
assertHvacState("HFc"); // H=heater on, F=fan on, c=cooler off
Test Structure and Organization
Test Fixtures
Three approaches to managing test fixtures:
- •Transient Fresh (preferred): Created and destroyed around each test
  - •JUnit creates new instance for each test method
  - •No teardown needed
  - •Tests cannot communicate with each other
- •Persistent Fresh: Persistent parts reset after each test
  - •Uses teardown functions
  - •For files, sockets, database connections
- •Persistent Shared: State accumulates across tests
  - •Uses @BeforeClass/@AfterClass (suite setup/teardown)
  - •Use sparingly - only for expensive resources like database connections
"The fresher the better. Transient is tops."
Hierarchical Tests
Use test hierarchies to manage growing setups:
- •Group tests by context
- •Each level has its own setup
- •Inner setups execute after outer setups
- •Prevents one giant setup method
- •Ruby's RSpec does this elegantly with nested describe/context blocks
- •JUnit: Use @RunWith(Enclosed.class) with nested static classes
Test Naming Conventions
Name tests for requirements, not implementation:
- •Use Given-When-Then pattern
- •Setup functions describe the "Given" part
- •Test name describes the "When-Then" part
- •Avoid magic numbers - use named constants that reflect requirements
- •Example: whenPayrollIsRun_thenPaymasterWillHoldCheckForHoursWorkedTimesHourlyRate
Physical Test Structure
- •One test file per production class (generally)
- •Interfaces don't have tests
- •Inner classes usually tested through outer class tests
- •Tests test BEHAVIOR, not implementation details
- •Put tests in IDE-designated test directory
Test Design and SOLID Principles
Single Responsibility Principle
- •Each test function tests one responsibility
- •If testing multiple responsibilities, split into separate test classes
Open-Closed Principle
- •Production code should be open for extension
- •Tests should be closed for modification
- •Hide implementation details from tests
- •Tests should not know about internal class structure
Liskov Substitution Principle
- •Test-specific subclasses must conform to LSP
- •Don't cheat by making tests pass in derivatives
Interface Segregation Principle
- •Tests shouldn't use interfaces with methods they don't call
- •Create test-specific interfaces if needed (not polluting - just an interface)
Dependency Inversion Principle
- •Test code depends on production code (never the reverse)
- •Tests are lower-level than production code
- •Use polymorphism to invert runtime dependencies while keeping source dependencies correct
Test Doubles (Mocking)
The Five Types of Test Doubles
1. Dummy
- •Implements an interface where all functions do nothing
- •Returns null or zero for everything
- •Used when you need to pass an object but don't care about it
- •"Like a scarecrow - just sits there, doesn't do nothing"
2. Stub
- •A dummy that returns specific fixed values
- •Used to drive production code through specific pathways
- •Functions do nothing but return values useful to the test
- •"Stubs drive the code through the pathway you're trying to test"
3. Spy
- •A stub that remembers facts about how it was called
- •Records which functions were called, how many times, with what arguments
- •Test can later query the spy to verify behavior
- •"Spies watch and remember"
4. True Mock
- •A spy that knows what should happen
- •Has a verify() method that checks if everything went as expected
- •The test trusts the mock to validate behavior
- •"The test doesn't check what the mock spied on - it asks the mock if everything went right"
5. Fake
- •A simulator with actual logic
- •Responds differently to different inputs
- •Can become complex maintenance nightmares
- •Use sparingly - mostly for integration tests
- •"Fakes grow and get complicated - avoid when you can"
- •Always write tests for your fakes using TDD
When to Mock
- •Mock across dependency inversion boundaries
- •Mocking is useful when testing things that cross architectural boundaries
- •The stuff on the other side of the boundary is what you mock out
The Uncertainty Principle of TDD
Mockist vs Statist approaches:
- •Mockist (London School): Emphasizes spying on algorithm implementations
  - •Higher coupling to implementation
  - •Greater assurance algorithm works correctly
  - •Tests may break when algorithm changes
- •Statist (Chicago/Detroit/Cleveland School): Emphasizes testing return values
  - •Lower coupling to implementation
  - •Can refactor algorithms freely
  - •Cannot guarantee all input combinations work
"Neither a statist nor a mockist be" - Use the right tool for the situation:
- •Use mockist style when testing across boundaries
- •Use statist style when not crossing boundaries
Mocking Patterns
Test-Specific Subclass
- •Derive from the class being tested
- •Override methods you want to control
- •Useful for bypassing certain behaviors during tests
- •Methods must be protected (not private) to override
Self-Shunt
- •The test class itself implements service interfaces
- •Pass this to the class being tested
- •Test becomes its own spy
- •Powerful when testing classes with many external connections
Humble Object
- •Separate testable logic from hard-to-test boundary code
- •Make boundary objects "humble" - so simple they don't need testing
- •Move interesting logic to testable classes
- •Use at I/O boundaries, GUI boundaries, hardware interfaces
- •Example: Separate GUI formatting from business logic with a Presenter
On Mocking Frameworks
Uncle Bob's perspective:
- •"I usually don't use mocking frameworks"
- •Mocks are easy to write by hand
- •Hand-written mocks have nice names that make tests readable
- •Hand-written mocks can be reused across tests
- •Framework syntax can be "a jumble of dots and parentheses"
- •Use frameworks when you need power: sealed/final classes, private access
- •Useful in legacy environments
TDD Techniques
Fake It Till You Make It
- •Make a test pass by returning a constant (faking it)
- •Then make the next test pass by faking it a little less
- •Continue until you're not faking anymore
- •"Making incremental progress towards a goal is not wasting time - it's how you save time"
Stair-Step Tests
- •Write a test just to enable writing the next test
- •Delete it once it's served its purpose
- •Like digging stairs to reach the bottom of a hole
- •Example: First test creates the class, second test calls a method, third test actually tests behavior
Assert First
- •Write the test backwards, starting with the assert
- •Let compiler errors guide you to create what's needed
- •Forces you to think about the outcome first
- •Creates test from back to front
Triangulation
- •Write two tests that force you to generalize
- •First test can be passed with a constant
- •Second test requires actual logic
- •Used to drive generalization of production code
One-to-Many
- •Implement operations on collections by first implementing for one object
- •Make it work in singular case first
- •Then generalize to plural case
- •"Deal with one object first without even the array or the list"
Getting Stuck
Signs you're stuck:
- •Nothing incremental you can do to pass the test
- •Have to write the whole algorithm at once
Causes:
- •Wrote the wrong test (went for the gold too early)
- •Making production code too specific
- •Not approaching from outside in
Solutions:
- •Start with degenerate test cases (null, empty, single element)
- •Test peripheral issues first (validation, simple queries)
- •Avoid complicated test cases until simple ones are exhausted
- •"Engage as few brain cells as possible at any given moment"
Test Anti-Patterns
Fragile Tests
- •Tests that break when making small production code changes
- •Caused by coupling tests to implementation details
- •Violates Open-Closed Principle
- •"If you make a small change to production code and a bunch of tests break, you have fragile tests"
Tests That Know Too Much
- •Testing internal structure instead of behavior
- •Knowing about classes/methods that are implementation details
- •Testing private functions directly
- •Solution: Test through public interfaces only
Dirty Tests
Story: A client had an "anything goes in tests" policy
- •Tests became fragile and brittle
- •Simple changes broke many tests
- •Tests were thrown away bit by bit
- •Production code rotted because they couldn't refactor
- •"If you start throwing away tests, you can't trust your test suite"
Giant Setups
- •One massive setup for all tests
- •Contains code only some tests need
- •Solution: Use hierarchical tests with targeted setups
Testing Private Functions
- •Generally don't test private functions
- •They're tested through public interface
- •Private functions come from refactoring - tests already cover them
- •If you MUST test private: "Testing trumps encapsulation" - promote to package/protected
Magic Numbers in Tests
- •Numbers whose significance is unclear
- •Solution: Use named constants that reflect requirements
- •Example: MAX_REGULAR_HOURS_PER_DAY instead of 8
The Value of Tests
Tests Enable Refactoring
- •"We can't clean code until we eliminate the fear of change"
- •Tests eliminate fear by providing a safety net
- •With tests, you can clean ugly code confidently
- •"The only way to go fast is to keep the code clean"
Tests as Documentation
- •Tests are code examples for the whole system
- •Show how to create objects, call APIs
- •Written in a language you understand
- •Utterly unambiguous
- •Cannot get out of sync with code (they execute)
- •"The perfect kind of low-level design document"
Tests as Specification
- •"If the tests pass, we ship"
- •Tests ARE the requirements
- •QA trusts the tests
- •"Write the test that you'd want to read"
The Parable of the Two Disks
If you had production code on one disk and tests on another, and one crashed:
- •Lose tests: Cannot recreate them from production code (trapdoor function)
- •Lose production code: Can recreate from tests, probably better (second system effect)
- •"The tests are more critical than the production code"
- •Production code without tests WILL rot
TDD is Like Double-Entry Bookkeeping
- •Everything is said twice: once in tests, once in production code
- •They follow separate but complementary execution pathways
- •They meet at successful execution (green bar)
- •Accounting does this for sensitive financial data
- •Software is equally sensitive to single-bit errors
Rules of Simple Design (for Tests)
Kent Beck's rules (in priority order):
- •Pass all tests - Make it work
- •No duplication - Focus on structure
- •Express intent - Make it communicate
- •Minimize classes/methods - Optimize
For tests, the order changes:
- •Make tests expressive first (since you write tests first)
- •Then make them pass
- •Then clean up
"When we say test first, it means the tests come first - in writing, refactoring, cleaning, maintaining"
Process for Feature
$ARGUMENTS
Example Cycle
// RED: Write failing test
test('should return empty string for 0', () => {
expect(fizzBuzz(0)).toBe('');
});
// Run: FAIL - fizzBuzz is not defined
// GREEN: Minimal passing code
function fizzBuzz(n) {
return '';
}
// Run: PASS
// REFACTOR: (nothing to refactor yet)
// RED: Next test
test('should return "1" for 1', () => {
expect(fizzBuzz(1)).toBe('1');
});
// Run: FAIL - expected '1', got ''
// GREEN: Make it pass
function fizzBuzz(n) {
if (n === 0) return '';
return String(n);
}
// Run: PASS
// Continue the cycle...
Review Checklist
When reviewing tests, check:
- •Three Laws Compliance
  - •Was production code written only after failing test?
  - •Are tests written in small increments?
  - •Is only enough code written to pass current test?
- •Test Cleanliness (F.I.R.S.T)
  - •Are tests fast enough for rapid feedback?
  - •Are tests independent of each other?
  - •Are tests repeatable in any environment?
  - •Do tests have clear pass/fail outcomes?
  - •Were tests written before production code?
- •Structure (AAA/Given-When-Then)
  - •Clear separation of Arrange, Act, Assert?
  - •Single logical assertion per test?
  - •No act-assert-act-assert patterns?
- •Mock Usage
  - •Mocking only across boundaries?
  - •Using appropriate test double type?
  - •Not over-mocking (testing implementation)?
- •Design Principles
  - •Tests decoupled from implementation details?
  - •Tests won't break from internal refactoring?
  - •No knowledge of private/internal structure?
- •Naming and Readability
  - •Test names describe behavior, not implementation?
  - •Magic numbers replaced with meaningful constants?
  - •Tests readable as specifications?
Memorable Quotes
- •"Code rots because we're afraid to clean it."
- •"Tests eliminate the fear of change."
- •"If you follow these three laws, you'll be stuck in a cycle that is perhaps 20 seconds long."
- •"Little test, little code, little test, little code."
- •"Everything always worked a minute or so ago."
- •"Debugging is a skill not to be desired."
- •"If these tests pass, we ship."
- •"As the tests get more specific, the code gets more generic."
- •"Refactoring is the reward for making tests pass."
- •"Tests are part of your system - not extra, not ancillary, not inferior, not disposable."
- •"The tests are the requirements."
- •"Write the test that you'd want to read."
- •"Neither a statist nor a mockist be."
- •"Testing trumps encapsulation."
- •"The only way to go fast is to keep the code clean."
- •"TDD is double-entry bookkeeping for software."
Common Pitfalls
- •Writing too much test - Test only what's needed for the next step
- •Writing too much code - Only make the current test pass
- •Skipping refactoring - Technical debt accumulates
- •Testing implementation - Test behavior, not implementation details
- •Slow tests - Keep tests fast to maintain rapid cycles
Self-Review (Back Pressure)
After completing a TDD cycle, ALWAYS perform this self-review before presenting code as done:
Self-Review Steps
- •Naming Check: Review all names against /naming principles
  - •Do names reveal intent?
  - •Are they pronounceable and searchable?
  - •Correct parts of speech?
- •Function Check: Review all functions against /functions principles
  - •Are functions small (4-6 lines ideal)?
  - •Does each function do ONE thing?
  - •Minimal arguments (0-2 ideal)?
- •Professional Check: Review against /professional standards
  - •Do I know what this code does?
  - •Do I know that it works (tests prove it)?
  - •Would I be proud of this code?
- •If Violations Found:
  - •Fix the violations
  - •Re-run tests
  - •Re-run self-review
- •Only present as "done" when self-review passes
Mandatory Quality Gate
Code produced by TDD is NOT complete until:
- •All tests pass
- •Self-review completes with no violations
- •/clean-code-review can be run without finding critical issues
Related Skills
During the REFACTOR phase, leverage these complementary workflows:
- •/professional - Apply professional coding standards and quality checks
- •/naming - Use intention-revealing names for variables, functions, and classes
- •/functions - Structure functions properly with single responsibility and clean arguments
- •/clean-code-review - Run comprehensive review after TDD cycle completes