Correctness Validation

Name: validate-correctness
Rating: 92
Author: blt

When optimization hunting discovers a bug instead of an optimization opportunity, this skill validates the finding through tests. No bug fix should be merged without a test that would have caught it.

When This Skill Is Called

Other skills invoke /validate-correctness when:

•/hunt-optimization discovers a bug instead of an optimization
•/rescue-optimization finds broken code during salvage
•/review-optimization identifies correctness issues during review

code

Optimization hunt discovers bug
            │
            ▼
    /validate-correctness
            │
            ├── Create reproducing test (proves bug exists)
            ├── Validate fix (proves fix works)
            ├── Add fuzz test (catches variants)
            └── Record in validations.yaml
            │
            ▼
    Return to calling skill with VALIDATED status

Philosophy

"A bug without a test is just an anecdote. A bug with a test is knowledge."

Finding bugs during optimization work is valuable, not a failure. But a bug fix without a reproducing test:

•Can't prove the bug existed
•Can't prove the fix works
•Can regress silently later

Every bug fix MUST include a test that fails before the fix and passes after.

Phase 1: Understand the Bug

1.1 Document the Bug

yaml

bug:
  file: pkg/foo/bar.go
  function: ProcessItems
  discovered_by: hunt-optimization
  description: "make([]T, n) + append creates n zero elements before real data"
  impact: "PIDs have leading zeros, causes lookup failures"
  root_cause: "Confused make([]T, n) with make([]T, 0, n)"

1.2 Identify Bug Category

Category	Example	Test Strategy
Off-by-one	Wrong slice bounds	Unit test with edge cases
Nil handling	Missing nil check	Unit test with nil input
Initialization	make([]T, n) + append	Unit test checking output
Concurrency	Race condition	Test with `-race`, fuzz test
Overflow	Integer overflow	Fuzz test with large values
Logic error	Wrong condition	Unit test with failing case

1.3 Find the Minimal Reproducer

Identify the smallest input that triggers the bug:

// What input demonstrates the bug?
input := []int{1, 2, 3}
expected := []int{1, 2, 3}
actual := BuggyFunction(input)
// actual = []int{0, 0, 0, 1, 2, 3}  // BUG: leading zeros
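
The reproducer above can be made runnable with a small sketch. The functions collectBuggy and collectFixed are hypothetical stand-ins for the code under investigation; only the make/append behavior is taken from the bug description.

```go
package main

import "fmt"

// collectBuggy is a hypothetical stand-in for the buggy function.
// make([]int, n) creates a slice that already holds n zero elements,
// so every append lands after those zeros.
func collectBuggy(in []int) []int {
	out := make([]int, len(in))
	for _, v := range in {
		out = append(out, v)
	}
	return out
}

// collectFixed is the hypothetical fix: length 0, capacity n.
// append fills the preallocated space from the start.
func collectFixed(in []int) []int {
	out := make([]int, 0, len(in))
	for _, v := range in {
		out = append(out, v)
	}
	return out
}

func main() {
	input := []int{1, 2, 3}
	fmt.Println("buggy:", collectBuggy(input)) // buggy: [0 0 0 1 2 3]
	fmt.Println("fixed:", collectFixed(input)) // fixed: [1 2 3]
}
```

Three elements is already more than needed here; a single-element input would show the same leading zero, which is the kind of shrinking this step asks for.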

Phase 2: Create Reproducing Test

2.1 Write Test That Fails on Buggy Code

func TestFunctionName_BugDescription(t *testing.T) {
    // Arrange: Setup that triggers the bug
    input := createInputThatTriggersBug()

    // Act: Call the buggy function
    result := FunctionName(input)

    // Assert: What SHOULD happen (will fail on buggy code)
    expected := expectedCorrectOutput()
    if !reflect.DeepEqual(result, expected) {
        t.Errorf("BugDescription: got %v, want %v", result, expected)
    }
}
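
A concrete instance of the template might look like the sketch below. It assumes two hypothetical functions, processPIDsBuggy and processPIDsFixed, standing in for the code before and after the fix, and it keeps the assertion in a helper (checkNoLeadingZeros, also hypothetical) so the same check can be run against both. Everything sits in one file here for brevity; in a repository the test function would live in a _test.go file and call the real function.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

// processPIDsBuggy is a hypothetical stand-in for the code before the fix.
func processPIDsBuggy(raw []uint32) []int {
	pids := make([]int, len(raw)) // BUG: length n, so append lands after n zeros
	for _, p := range raw {
		pids = append(pids, int(p))
	}
	return pids
}

// processPIDsFixed is a hypothetical stand-in for the code after the fix.
func processPIDsFixed(raw []uint32) []int {
	pids := make([]int, 0, len(raw)) // length 0, capacity n
	for _, p := range raw {
		pids = append(pids, int(p))
	}
	return pids
}

// checkNoLeadingZeros holds the assertion so it can run against either version.
func checkNoLeadingZeros(process func([]uint32) []int) error {
	got := process([]uint32{101, 202, 303})
	want := []int{101, 202, 303}
	if !reflect.DeepEqual(got, want) {
		return fmt.Errorf("leading zeros: got %v, want %v", got, want)
	}
	return nil
}

// TestProcessPIDs_NoLeadingZeros is what `go test` would run against the current code.
func TestProcessPIDs_NoLeadingZeros(t *testing.T) {
	if err := checkNoLeadingZeros(processPIDsFixed); err != nil {
		t.Error(err)
	}
}

func main() {
	// The same assertion reports an error before the fix and nil after it.
	fmt.Println("before fix:", checkNoLeadingZeros(processPIDsBuggy))
	fmt.Println("after fix: ", checkNoLeadingZeros(processPIDsFixed))
}
```

The assertion states what SHOULD happen and never mentions the zeros directly, so it stays meaningful after the fix lands.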

2.2 Test Naming Convention

code

TestFunctionName_BugDescription
TestProcessPIDs_NoLeadingZeros
TestFlush_EmptyInputReturnsEmptySlice
TestAppend_PreallocDoesNotPrependZeros

2.3 Verify Test Fails Before Fix

bash

# Checkout code BEFORE fix (git stash leaves untracked files, such as a brand-new test file, in place)
git stash
git checkout origin/main

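# Run the new test - MUST FAIL
go test -run TestFunctionName_BugDescription ./pkg/path/...
# Expected: FAIL
# "[no tests to run]" means the test file is missing on this checkout; restore it before trusting the result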

# Return to fix branch
git checkout -
git stash pop

If the test passes on the buggy code, it does not reproduce the bug. Rewrite it.

Phase 3: Validate the Fix

3.1 Verify Test Passes After Fix

bash

# On the fix branch
go test -run TestFunctionName_BugDescription ./pkg/path/...
# Expected: PASS

3.2 Run Full Test Suite

bash

# Ensure fix doesn't break anything else
go test ./pkg/path/...
go test -race ./pkg/path/...

3.3 Test Edge Cases

Add tests for related edge cases:

func TestFunctionName_EdgeCases(t *testing.T) {
    tests := []struct {
        name     string
        input    []int
        expected []int
    }{
        // reflect.DeepEqual treats nil and []int{} as different; use whichever FunctionName should return
        {"empty input", nil, nil},
        {"single element", []int{1}, []int{1}},
        {"typical case", []int{1, 2, 3}, []int{1, 2, 3}},
        {"large input", makeLargeInput(), expectedLargeOutput()},
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := FunctionName(tt.input)
            if !reflect.DeepEqual(got, tt.expected) {
                t.Errorf("got %v, want %v", got, tt.expected)
            }
        })
    }
}

Phase 4: Add Fuzz Test (When Appropriate)

4.1 When to Add Fuzz Tests

Bug Type	Fuzz Test?	Reason
Input parsing	YES	Many edge cases
Serialization	YES	Format variations
Numeric operations	YES	Overflow, underflow
String manipulation	YES	Unicode, empty, long
Simple logic error	No	Unit test sufficient
Nil handling	No	Explicit cases enough

4.2 Fuzz Test Template

func FuzzFunctionName(f *testing.F) {
    // Seed corpus with known interesting inputs
    f.Add([]byte{})
    f.Add([]byte{1, 2, 3})
    f.Add([]byte{0, 0, 0})

    f.Fuzz(func(t *testing.T, data []byte) {
        // Should not panic
        result := FunctionName(data)

        // Invariants that must always hold
        if result == nil && len(data) > 0 {
            t.Error("non-empty input should not produce nil")
        }

        // Output validity check (replace with a round-trip check if applicable)
        if !isValidOutput(result) {
            t.Errorf("invalid output for input %v", data)
        }
    })
}
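
To make the template concrete, here is a sketch of a fuzz target whose invariants would catch the make/append bug. CopyBytes is a hypothetical function under test, and checkInvariants is a hypothetical helper that keeps the invariants in one place so they can also be exercised outside the fuzzer. In a repository the fuzz target would live in a _test.go file.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// CopyBytes is a hypothetical function under test (already fixed).
func CopyBytes(data []byte) []byte {
	out := make([]byte, 0, len(data))
	out = append(out, data...)
	return out
}

// checkInvariants returns an error describing the first broken invariant.
func checkInvariants(data []byte) error {
	got := CopyBytes(data)
	// A copy must not grow or shrink; prepended zeros would break this.
	if len(got) != len(data) {
		return fmt.Errorf("length changed: got %d, want %d", len(got), len(data))
	}
	// A copy must preserve content byte for byte.
	if !bytes.Equal(got, data) {
		return fmt.Errorf("content changed: got %v, want %v", got, data)
	}
	return nil
}

// FuzzCopyBytes seeds the corpus and checks the invariants on every generated input.
func FuzzCopyBytes(f *testing.F) {
	f.Add([]byte{})
	f.Add([]byte{1, 2, 3})
	f.Add([]byte{0, 0, 0})

	f.Fuzz(func(t *testing.T, data []byte) {
		if err := checkInvariants(data); err != nil {
			t.Error(err)
		}
	})
}

func main() {
	// Run the invariants over the seed inputs without the fuzzing engine.
	for _, seed := range [][]byte{{}, {1, 2, 3}, {0, 0, 0}} {
		fmt.Println(checkInvariants(seed))
	}
}
```

The length invariant is the important one for this bug class: it holds for any input, so the fuzzer does not need to know the expected output in advance.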

4.3 Run Fuzz Test

bash

# go test -fuzz accepts exactly one package, so name the package directly instead of using /...
# Quick fuzz (find obvious issues)
go test -fuzz=FuzzFunctionName -fuzztime=30s ./pkg/path/

# Longer fuzz (thorough exploration)
go test -fuzz=FuzzFunctionName -fuzztime=5m ./pkg/path/

Phase 5: Document and Record

5.1 Commit Message Format

bash

git commit -m "$(cat <<'EOF'
fix(pkg): description of bug fix

Bug: make([]T, n) with append prepends n zero elements
Fix: Use make([]T, 0, n) for correct preallocation

Test: TestProcessPIDs_NoLeadingZeros fails before, passes after
Fuzz: FuzzProcessPIDs added for input variations

Discovered-by: /hunt-optimization
Validated-by: /validate-correctness

🤖 Generated with Claude Code
EOF
)"

5.2 MANDATORY: Record in validations.yaml

yaml

  - bug_id: containerd-pids-leading-zeros
    date: 2026-01-06
    file: pkg/util/containers/containerd/containerd_util.go
    function: ListRunningProcesses
    category: initialization
    description: "make([]T, n) + append prepends n zeros"
    discovered_by: hunt-optimization
    original_branch: mem-opt/containerd-pids-fix-rescued
    tests_added:
      - TestListRunningProcesses_NoLeadingZeros
      - TestListRunningProcesses_EdgeCases
    fuzz_added: false
    verified_fails_before: true
    verified_passes_after: true
    lesson: "Always use make([]T, 0, n) when building slice via append"

Phase 6: Return to Calling Skill

After validation is complete, return the status to the calling skill:

yaml

validation_result:
  status: VALIDATED  # or INVALID, NEEDS_WORK
  bug_confirmed: true
  fix_confirmed: true
  tests_added: 2
  fuzz_added: false
  ready_for_merge: true

The calling skill should:

•Record the bug discovery as a SUCCESS (not failure)
•Include validation status in the review
•Proceed with merge if VALIDATED

Usage

code

/validate-correctness

Checklist

Before returning VALIDATED:

• Bug documented with root cause
• Reproducing test written
• Test FAILS on buggy code (verified)
• Test PASSES on fixed code (verified)
• Edge case tests added
• Fuzz test added (if appropriate)
• Full test suite passes
• Race detector passes (go test -race)
• Recorded in validations.yaml
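
The checklist can be read as a gate on the returned status. The sketch below is one possible encoding, not part of the skill itself: the Checklist struct, its field names, and the Status method are all hypothetical. It maps any unmet item to NEEDS_WORK, which is an assumption; the source does not define when INVALID applies, so that status is not modeled here.

```go
package main

import "fmt"

// Checklist is a hypothetical struct with one field per checklist item.
type Checklist struct {
	RootCauseDocumented bool
	ReproducingTest     bool
	FailsOnBuggyCode    bool
	PassesOnFixedCode   bool
	EdgeCaseTests       bool
	FuzzAppropriate     bool // whether this bug type calls for a fuzz test
	FuzzTestAdded       bool
	FullSuitePasses     bool
	RaceDetectorPasses  bool
	RecordedInYAML      bool
}

// Status returns VALIDATED only when every applicable item is met.
// Otherwise it returns NEEDS_WORK (an assumed mapping) and the unmet items.
func (c Checklist) Status() (string, []string) {
	var missing []string
	need := func(ok bool, item string) {
		if !ok {
			missing = append(missing, item)
		}
	}
	need(c.RootCauseDocumented, "bug documented with root cause")
	need(c.ReproducingTest, "reproducing test written")
	need(c.FailsOnBuggyCode, "test fails on buggy code")
	need(c.PassesOnFixedCode, "test passes on fixed code")
	need(c.EdgeCaseTests, "edge case tests added")
	need(!c.FuzzAppropriate || c.FuzzTestAdded, "fuzz test added")
	need(c.FullSuitePasses, "full test suite passes")
	need(c.RaceDetectorPasses, "race detector passes")
	need(c.RecordedInYAML, "recorded in validations.yaml")

	if len(missing) > 0 {
		return "NEEDS_WORK", missing
	}
	return "VALIDATED", nil
}

func main() {
	c := Checklist{
		RootCauseDocumented: true,
		ReproducingTest:     true,
		FailsOnBuggyCode:    true,
		PassesOnFixedCode:   true,
		EdgeCaseTests:       true,
		FullSuitePasses:     true,
		RaceDetectorPasses:  true,
	}
	status, missing := c.Status()
	fmt.Println(status, missing)
}
```

In this example the only unmet item is the validations.yaml record, so the status stays NEEDS_WORK until the record is written.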