Test Review - Quality Over Quantity

Evaluate tests for coverage quality, redundancy, and value. Every test should justify its existence.

Philosophy

The goal is confidence, not coverage percentage.

•A test that exercises the same code path as another test adds maintenance burden without value
•Tests should document behavior, not implementation details
•Fewer high-quality tests beat many low-quality tests

When to Use This Skill

•Before writing new tests for a function/module
•When reviewing a PR that adds tests
•When tests are failing and you're unsure if they're worth fixing
•When refactoring and deciding which tests to keep

Test Value Criteria

A test provides value when it:

•Covers a distinct execution path - Different branch, error condition, or state
•Documents expected behavior - Someone can read it to understand requirements
•Catches real bugs - Would fail if the code breaks in a meaningful way
•Is maintainable - Doesn't break on unrelated changes

A test does NOT provide value when it:

•Duplicates another test's path - Same branches, same assertions, different inputs
•Tests implementation details - Breaks when refactoring without behavior change
•Has flaky results - Sometimes passes, sometimes fails
•Tests framework/library code - Verifying that Go's append works

Evaluation Process

When reviewing tests, analyze each test case:

Step 1: Map Execution Paths

code

Function: ProcessOrder(order Order) error
Paths:
  1. order.Items is empty → return ErrEmptyOrder
  2. order.Total < 0 → return ErrInvalidTotal
  3. order.Customer is nil → return ErrNoCustomer
  4. happy path → process and return nil

Step 2: Map Test Cases to Paths

code

TestProcessOrder_EmptyItems     → Path 1 ✓
TestProcessOrder_NoItems        → Path 1 ✗ REDUNDANT
TestProcessOrder_ZeroItems      → Path 1 ✗ REDUNDANT
TestProcessOrder_NegativeTotal  → Path 2 ✓
TestProcessOrder_NilCustomer    → Path 3 ✓
TestProcessOrder_Success        → Path 4 ✓

Step 3: Identify Gaps and Redundancy

•Redundant: Multiple tests covering Path 1 with trivially different inputs
•Gap: No test for Path 2 with exactly zero total (boundary)
•Action: Remove redundant tests, add boundary test if meaningful

Table Tests: When and How

Use table tests when:

•Testing the same function with multiple valid inputs
•Each case exercises a DIFFERENT path or boundary
•Cases are truly independent

Structure table tests by path:

func TestProcessOrder(t *testing.T) {
    tests := []struct {
        name    string
        order   Order
        wantErr error
    }{
        // Error paths - one per distinct error condition
        {"empty items", Order{Items: nil}, ErrEmptyOrder},
        {"negative total", Order{Items: items, Total: -1}, ErrInvalidTotal},
        {"nil customer", Order{Items: items, Customer: nil}, ErrNoCustomer},

        // Happy path - minimal case that succeeds
        {"valid order", validOrder(), nil},
    }
    // ...
}

Anti-pattern: Redundant table entries

// BAD: These all test the same path (empty items)
{"nil items", Order{Items: nil}, ErrEmptyOrder},
{"empty slice", Order{Items: []Item{}}, ErrEmptyOrder},
{"zero length", Order{Items: make([]Item, 0)}, ErrEmptyOrder},

Boundary Testing

Test boundaries that matter:

Boundary	Test if...
Zero/Empty	Code has explicit zero handling
One element	Code has single-element special case
Max value	Code has upper limit logic
Nil vs Empty	Code distinguishes between them

Don't test boundaries that don't exist in the code.

Integration vs Unit Tests

Unit tests should:

•Test a single function/method in isolation
•Mock external dependencies
•Run fast (< 100ms)
•Be deterministic

Integration tests should:

•Test component interactions
•Use real dependencies where practical
•Cover critical user flows
•Be clearly marked (separate file, build tag)

One integration test covering a flow > many unit tests mocking everything

Red Flags in Test Code

1. Testing the mock

// BAD: This tests that mockDB.Get returns what we told it to
mockDB.On("Get", "key").Return("value", nil)
result, _ := mockDB.Get("key")
assert.Equal(t, "value", result) // Useless!

2. Assertion-heavy tests

// BAD: Testing implementation details
assert.Equal(t, 3, len(result.calls))
assert.Equal(t, "init", result.calls[0])
assert.Equal(t, "process", result.calls[1])
assert.Equal(t, "cleanup", result.calls[2])

// GOOD: Testing behavior
assert.NoError(t, result.Error())
assert.True(t, result.Completed())

3. Fragile setup

// BAD: Test breaks if any field is added to Config
config := Config{Field1: "a", Field2: "b", Field3: "c", ...}

// GOOD: Only set what matters for this test
config := Config{RelevantField: "value"}

4. Time-dependent tests

// BAD: Flaky
time.Sleep(100 * time.Millisecond)
assert.True(t, completed)

// GOOD: Synchronize properly
<-done
assert.True(t, completed)

Review Checklist

When /test-review is invoked, evaluate:

•Path coverage: Does each test cover a distinct execution path?
•Redundancy: Are there multiple tests covering the same path?
•Boundaries: Are tested boundaries actually present in the code?
•Maintainability: Will these tests break on refactors?
•Clarity: Can someone understand the requirements from the tests?
•Speed: Are unit tests fast? Are slow tests justified?

Output Format

When reviewing tests, provide:

markdown

## Test Review: [function/file]

### Execution Paths Identified
1. [path description]
2. [path description]
...

### Coverage Analysis
| Test Case | Path | Status |
|-----------|------|--------|
| TestX | 1 | ✓ Covers |
| TestY | 1 | ✗ Redundant with TestX |
| - | 3 | ✗ Gap - no coverage |

### Recommendations
- Remove: [TestY] - duplicates TestX
- Add: Test for [uncovered path]
- Refactor: [TestZ] tests implementation, not behavior

### Summary
[N] tests → [M] recommended (removed X redundant, added Y for gaps)

Guidelines

•One test per path - Additional tests for the same path rarely add value
•Test behavior, not implementation - Tests should survive refactors
•Boundaries only when meaningful - Don't test boundaries the code doesn't have
•Integration over excessive mocking - Real interactions catch real bugs
•Delete flaky tests - A flaky test is worse than no test
•Readability matters - Tests are documentation