Correctness Validation
When optimization hunting discovers a bug instead of an optimization opportunity, this skill validates the finding through tests. No bug fix should be merged without a test that would have caught it.
When This Skill Is Called
Other skills invoke /validate-correctness when:
- •/hunt-optimization discovers a bug instead of an optimization
- •/rescue-optimization finds broken code during salvage
- •/review-optimization identifies correctness issues during review
Optimization hunt discovers bug
│
▼
/validate-correctness
│
├── Create reproducing test (proves bug exists)
├── Validate fix (proves fix works)
├── Add fuzz test (catches variants)
└── Record in validations.yaml
│
▼
Return to calling skill with VALIDATED status
Philosophy
"A bug without a test is just an anecdote. A bug with a test is knowledge."
Finding bugs during optimization work is valuable, not a failure. But a bug fix without a reproducing test:
- •Can't prove the bug existed
- •Can't prove the fix works
- •Can regress silently later
Every bug fix MUST include a test that fails before the fix and passes after.
Phase 1: Understand the Bug
1.1 Document the Bug
bug:
  file: pkg/foo/bar.go
  function: ProcessItems
  discovered_by: hunt-optimization
  description: "make([]T, n) + append creates n zero elements before real data"
  impact: "PIDs have leading zeros, causes lookup failures"
  root_cause: "Confused make([]T, n) with make([]T, 0, n)"
1.2 Identify Bug Category
| Category | Example | Test Strategy |
|---|---|---|
| Off-by-one | Wrong slice bounds | Unit test with edge cases |
| Nil handling | Missing nil check | Unit test with nil input |
| Initialization | make([]T, n) + append | Unit test checking output |
| Concurrency | Race condition | Test with -race, fuzz test |
| Overflow | Integer overflow | Fuzz test with large values |
| Logic error | Wrong condition | Unit test with failing case |
1.3 Find the Minimal Reproducer
Identify the smallest input that triggers the bug:
// What input demonstrates the bug?
input := []int{1, 2, 3}
expected := []int{1, 2, 3}
actual := BuggyFunction(input)
// actual = []int{0, 0, 0, 1, 2, 3} // BUG: leading zeros
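The reproducer above can be made concrete with a small, self-contained sketch. The function names collectBuggy and collectFixed are hypothetical stand-ins, not code from any real package; they only illustrate the make([]T, n) versus make([]T, 0, n) difference described in 1.1.

```go
package main

import "fmt"

// collectBuggy mirrors the bug: make([]int, n) creates n zero-valued
// elements, and append then adds the real data after them.
func collectBuggy(input []int) []int {
	out := make([]int, len(input)) // length n: already holds n zeros
	for _, v := range input {
		out = append(out, v)
	}
	return out
}

// collectFixed preallocates capacity only, so append starts at index 0.
func collectFixed(input []int) []int {
	out := make([]int, 0, len(input)) // length 0, capacity n
	for _, v := range input {
		out = append(out, v)
	}
	return out
}

func main() {
	input := []int{1, 2, 3}
	fmt.Println(collectBuggy(input)) // [0 0 0 1 2 3]
	fmt.Println(collectFixed(input)) // [1 2 3]
}
```

A three-element input is enough here: any non-empty input shows the leading zeros, so a larger input would add nothing to the reproducer.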
Phase 2: Create Reproducing Test
2.1 Write Test That Fails on Buggy Code
import (
	"reflect"
	"testing"
)

func TestFunctionName_BugDescription(t *testing.T) {
	// Arrange: set up input that triggers the bug
	input := createInputThatTriggersBug()

	// Act: call the buggy function
	result := FunctionName(input)

	// Assert: what SHOULD happen (fails on buggy code)
	expected := expectedCorrectOutput()
	if !reflect.DeepEqual(result, expected) {
		t.Errorf("BugDescription: got %v, want %v", result, expected)
	}
}
2.2 Test Naming Convention
TestFunctionName_BugDescription
TestProcessPIDs_NoLeadingZeros
TestFlush_EmptyInputReturnsEmptySlice
TestAppend_PreallocDoesNotPrependZeros
2.3 Verify Test Fails Before Fix
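# Check out the code BEFORE the fix
git stash
git checkout origin/main

# Run the new test - MUST FAIL
go test -run TestFunctionName_BugDescription ./pkg/path/...
# Expected: FAIL
# "ok ... [no tests to run]" is not a failure: it means the new test was
# stashed or left on the fix branch. Bring the test file along and rerun.

# Return to the fix branch
git checkout -
git stash pop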
If the test passes on the buggy code, it doesn't reproduce the bug. Rewrite it.
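A test can pass on buggy code simply because its assertion is too loose. The sketch below, which uses hypothetical stand-ins (prependsZeros for the buggy code, copies for the fix), compares a weak check that only looks for the expected values somewhere in the output with a strong check that demands the exact output. Only the strong check tells the two implementations apart.

```go
package main

import (
	"fmt"
	"reflect"
)

// prependsZeros is a hypothetical stand-in for the buggy code.
func prependsZeros(in []int) []int {
	out := make([]int, len(in))
	return append(out, in...)
}

// copies is a hypothetical stand-in for the fixed code.
func copies(in []int) []int {
	out := make([]int, 0, len(in))
	return append(out, in...)
}

// weakCheck only asserts that every input value appears in the output.
func weakCheck(fn func([]int) []int) bool {
	in := []int{1, 2, 3}
	seen := map[int]bool{}
	for _, v := range fn(in) {
		seen[v] = true
	}
	for _, v := range in {
		if !seen[v] {
			return false
		}
	}
	return true
}

// strongCheck asserts the exact expected output.
func strongCheck(fn func([]int) []int) bool {
	return reflect.DeepEqual(fn([]int{1, 2, 3}), []int{1, 2, 3})
}

func main() {
	// The weak check passes on both versions, so it does not reproduce the bug.
	fmt.Println("weak:  ", weakCheck(prependsZeros), weakCheck(copies)) // true true
	// The strong check fails before the fix and passes after it.
	fmt.Println("strong:", strongCheck(prependsZeros), strongCheck(copies)) // false true
}
```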
Phase 3: Validate the Fix
3.1 Verify Test Passes After Fix
# On the fix branch
go test -run TestFunctionName_BugDescription ./pkg/path/...
# Expected: PASS
3.2 Run Full Test Suite
# Ensure the fix doesn't break anything else
go test ./pkg/path/...
go test -race ./pkg/path/...
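The race detector only reports races on code paths that actually run concurrently, so a concurrency fix needs a test that drives the code from several goroutines. The following is a minimal sketch with a hypothetical counter type and hammer helper. In a real test, hammer's body would sit inside a Test function and be run with go test -race.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type; the mutex is what keeps Inc race-free.
type counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter under the lock.
func (c *counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value reads the counter under the lock.
func (c *counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer calls Inc from several goroutines at once. If the lock were
// removed from Inc, this is the access pattern -race would flag.
func hammer(c *counter, goroutines, perGoroutine int) {
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				c.Inc()
			}
		}()
	}
	wg.Wait()
}

func main() {
	c := &counter{}
	hammer(c, 8, 1000)
	fmt.Println(c.Value()) // 8000
}
```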
3.3 Test Edge Cases
Add tests for related edge cases:
func TestFunctionName_EdgeCases(t *testing.T) {
	// []int matches the literals below; substitute the real input and output types.
	tests := []struct {
		name     string
		input    []int
		expected []int
	}{
		{"empty input", nil, nil},
		{"single element", []int{1}, []int{1}},
		{"typical case", []int{1, 2, 3}, []int{1, 2, 3}},
		{"large input", makeLargeInput(), expectedLargeOutput()},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := FunctionName(tt.input)
			if !reflect.DeepEqual(got, tt.expected) {
				t.Errorf("got %v, want %v", got, tt.expected)
			}
		})
	}
}
Phase 4: Add Fuzz Test (When Appropriate)
4.1 When to Add Fuzz Tests
| Bug Type | Fuzz Test? | Reason |
|---|---|---|
| Input parsing | YES | Many edge cases |
| Serialization | YES | Format variations |
| Numeric operations | YES | Overflow, underflow |
| String manipulation | YES | Unicode, empty, long |
| Simple logic error | No | Unit test sufficient |
| Nil handling | No | Explicit cases enough |
4.2 Fuzz Test Template
import "testing"

func FuzzFunctionName(f *testing.F) {
	// Seed corpus with known interesting inputs
	f.Add([]byte{})
	f.Add([]byte{1, 2, 3})
	f.Add([]byte{0, 0, 0})

	f.Fuzz(func(t *testing.T, data []byte) {
		// Should not panic
		result := FunctionName(data)

		// Invariants that must always hold
		if result == nil && len(data) > 0 {
			t.Error("non-empty input should not produce nil")
		}

		// Output validity check (isValidOutput is a placeholder; use a
		// round-trip check here if the function has an inverse)
		if !isValidOutput(result) {
			t.Errorf("invalid output for input %v", data)
		}
	})
}
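The invariants are the part of a fuzz test that needs the most thought, and they can be developed as an ordinary function first. The sketch below assumes a hypothetical function under test, copyBytes, and a hypothetical checkInvariants helper. In a real fuzz test, the body passed to f.Fuzz would call checkInvariants and report any error with t.Error.

```go
package main

import "fmt"

// copyBytes is a hypothetical function under test.
func copyBytes(data []byte) []byte {
	out := make([]byte, 0, len(data))
	return append(out, data...)
}

// checkInvariants returns an error describing the first violated invariant.
func checkInvariants(data []byte) error {
	got := copyBytes(data)
	// Invariant 1: the output has exactly as many bytes as the input.
	if len(got) != len(data) {
		return fmt.Errorf("length changed: got %d, want %d", len(got), len(data))
	}
	// Invariant 2: every byte is preserved in place.
	for i := range data {
		if got[i] != data[i] {
			return fmt.Errorf("byte %d changed: got %d, want %d", i, got[i], data[i])
		}
	}
	return nil
}

func main() {
	// The same seeds as the template's corpus.
	seeds := [][]byte{{}, {1, 2, 3}, {0, 0, 0}}
	for _, s := range seeds {
		if err := checkInvariants(s); err != nil {
			fmt.Println("FAIL:", err)
			return
		}
	}
	fmt.Println("all seeds hold")
}
```

The length invariant is the one that would catch the make([]T, n) + append bug for any non-empty input, which is why it is worth stating even when it looks trivial.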
4.3 Run Fuzz Test
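# -fuzz works on one package at a time, so name a single package directory
# rather than a ./... pattern that matches several packages.

# Quick fuzz (find obvious issues)
go test -fuzz=FuzzFunctionName -fuzztime=30s ./pkg/path

# Longer fuzz (thorough exploration)
go test -fuzz=FuzzFunctionName -fuzztime=5m ./pkg/path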
Phase 5: Document and Record
5.1 Commit Message Format
git commit -m "$(cat <<'EOF'
fix(pkg): description of bug fix

Bug: make([]T, n) with append prepends n zero elements
Fix: Use make([]T, 0, n) for correct preallocation
Test: TestProcessPIDs_NoLeadingZeros fails before, passes after
Fuzz: FuzzProcessPIDs added for input variations

Discovered-by: /hunt-optimization
Validated-by: /validate-correctness

🤖 Generated with Claude Code
EOF
)"
5.2 MANDATORY: Record in validations.yaml
- bug_id: containerd-pids-leading-zeros
  date: 2026-01-06
  file: pkg/util/containers/containerd/containerd_util.go
  function: ListRunningProcesses
  category: initialization
  description: "make([]T, n) + append prepends n zeros"
  discovered_by: hunt-optimization
  original_branch: mem-opt/containerd-pids-fix-rescued
  tests_added:
    - TestListRunningProcesses_NoLeadingZeros
    - TestListRunningProcesses_EdgeCases
  fuzz_added: false
  verified_fails_before: true
  verified_passes_after: true
  lesson: "Always use make([]T, 0, n) when building slice via append"
Phase 6: Return to Calling Skill
After validation is complete, return the status to the calling skill:
validation_result:
  status: VALIDATED  # or INVALID, NEEDS_WORK
  bug_confirmed: true
  fix_confirmed: true
  tests_added: 2
  fuzz_added: false
  ready_for_merge: true
The calling skill should:
- •Record the bug discovery as a SUCCESS (not failure)
- •Include validation status in the review
- •Proceed with merge if VALIDATED
Usage
/validate-correctness
Checklist
Before returning VALIDATED:
- • Bug documented with root cause
- • Reproducing test written
- • Test FAILS on buggy code (verified)
- • Test PASSES on fixed code (verified)
- • Edge case tests added
- • Fuzz test added (if appropriate)
- • Full test suite passes
- • Race detector passes (go test -race)
- • Recorded in validations.yaml
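The checklist can also be treated as a gate in code. The following is only a sketch: the checklist struct, its field names, and the missing method are hypothetical and not part of any existing tooling. It encodes the one conditional item, the fuzz test, as required only when the bug type calls for one.

```go
package main

import "fmt"

// checklist mirrors the items above; all field names are hypothetical.
type checklist struct {
	RootCauseDocumented bool
	ReproducingTest     bool
	FailsBeforeFix      bool
	PassesAfterFix      bool
	EdgeCaseTests       bool
	FuzzAppropriate     bool // whether this bug type calls for a fuzz test
	FuzzAdded           bool
	FullSuitePasses     bool
	RaceDetectorPasses  bool
	Recorded            bool
}

// missing returns the checklist items that still block VALIDATED.
func (c checklist) missing() []string {
	var out []string
	require := func(ok bool, item string) {
		if !ok {
			out = append(out, item)
		}
	}
	require(c.RootCauseDocumented, "bug documented with root cause")
	require(c.ReproducingTest, "reproducing test written")
	require(c.FailsBeforeFix, "test fails on buggy code")
	require(c.PassesAfterFix, "test passes on fixed code")
	require(c.EdgeCaseTests, "edge case tests added")
	require(!c.FuzzAppropriate || c.FuzzAdded, "fuzz test added")
	require(c.FullSuitePasses, "full test suite passes")
	require(c.RaceDetectorPasses, "race detector passes")
	require(c.Recorded, "recorded in validations.yaml")
	return out
}

func main() {
	c := checklist{
		RootCauseDocumented: true,
		ReproducingTest:     true,
		FailsBeforeFix:      true,
		PassesAfterFix:      true,
		EdgeCaseTests:       true,
		FullSuitePasses:     true,
		RaceDetectorPasses:  true,
		Recorded:            false, // not yet written to validations.yaml
	}
	if m := c.missing(); len(m) > 0 {
		fmt.Println("not VALIDATED yet, missing:", m)
		return
	}
	fmt.Println("VALIDATED")
}
```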