Systematic Debugging
ALWAYS find the root cause before attempting fixes. A fix that only addresses a symptom is a failure.
When to Use
- Encountering bugs or test failures
- Unexpected behavior in code
- Before proposing any fix
The Four Phases
Phase 1: Root Cause Investigation
Before proposing any solution:
- Read error messages thoroughly - Don't skip warnings or stack traces; they often contain exact solutions
- Reproduce consistently - Verify you can trigger the issue reliably with documented steps
- Check recent changes - Examine git diff, dependencies, and configuration modifications
- Gather diagnostic evidence - In multi-component systems, add instrumentation at component boundaries
- Trace data flow - Backward trace from the error to find where bad values originate
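As one way to gather evidence at component boundaries, the sketch below wraps each component in a logging decorator so that the first boundary where a bad value appears shows up in the log. The decorator `trace_boundary` and the two pipeline functions are hypothetical names chosen for illustration, not part of any particular codebase.

```python
import functools
import logging

logger = logging.getLogger("boundary")


def trace_boundary(component):
    """Log what enters and leaves a component so bad values can be traced back."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.debug("%s <- args=%r kwargs=%r", component, args, kwargs)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Record which boundary failed, then let the error propagate.
                logger.exception("%s raised", component)
                raise
            logger.debug("%s -> %r", component, result)
            return result
        return wrapper
    return decorator


# Hypothetical two-stage pipeline used only to illustrate the idea.
@trace_boundary("parser")
def parse_quantity(raw):
    return int(raw.strip())


@trace_boundary("pricing")
def total_price(quantity, unit_price):
    return quantity * unit_price
```

With `logging.basicConfig(level=logging.DEBUG)` enabled, calling `total_price(parse_quantity(" 3 "), 2.5)` logs the input and output of each stage, which makes it easier to trace backward from the failing stage to the one that produced the bad value.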
Phase 2: Pattern Analysis
Establish the pattern before fixing:
- Locate similar working code in the codebase
- Read reference implementations completely (not skimmed)
- List every difference between working and broken code
- Understand all dependencies and assumptions
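Listing every difference is easier to do mechanically than by eye. The sketch below assumes the working and broken setups can each be summarised as a flat dictionary of settings; the function name `list_differences` and the example values are illustrative only.

```python
def list_differences(working, broken):
    """Return (key, working_value, broken_value) for every setting that differs."""
    missing = object()  # sentinel meaning "key is not present"
    diffs = []
    for key in sorted(working.keys() | broken.keys()):
        w = working.get(key, missing)
        b = broken.get(key, missing)
        if w != b:
            diffs.append((
                key,
                "<absent>" if w is missing else w,
                "<absent>" if b is missing else b,
            ))
    return diffs


# Hypothetical settings for a working and a broken service.
working = {"timeout": 30, "retries": 3, "tls": True}
broken = {"timeout": 30, "retries": 0, "pool_size": 10}

for key, w, b in list_differences(working, broken):
    print(f"{key}: working={w!r} broken={b!r}")
```

Every entry in the result is a candidate explanation, including the ones that look irrelevant; the point is to have the complete list before forming a hypothesis.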
Phase 3: Hypothesis Testing
Apply scientific method:
- State your hypothesis clearly: "I believe X is failing because Y, evidenced by Z"
- Test with the smallest possible change
- Change only ONE variable at a time
- Verify results before proceeding
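A minimal sketch of changing one variable at a time, assuming there is a known-good configuration, a suspect configuration, and a check that reports whether a given configuration works. The names `isolate_failing_change` and `passes` are hypothetical.

```python
def isolate_failing_change(known_good, suspect, passes):
    """Apply each differing setting to the known-good config on its own.

    Returns the keys whose single change makes `passes` return False.
    """
    culprits = []
    for key in sorted(suspect):
        if key in known_good and known_good[key] == suspect[key]:
            continue  # not a difference, so there is nothing to test
        trial = dict(known_good)  # fresh copy: only ONE variable changes
        trial[key] = suspect[key]
        if not passes(trial):
            culprits.append(key)
    return culprits


# Hypothetical example: the system only works when retries are enabled.
known_good = {"timeout": 30, "retries": 3}
suspect = {"timeout": 60, "retries": 0}


def passes(config):
    return config["retries"] > 0


print(isolate_failing_change(known_good, suspect, passes))
```

This only finds failures caused by a single change. If every individual change passes but the full suspect configuration still fails, the cause is likely an interaction between changes, which is itself a useful result to carry back into the hypothesis.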
Phase 4: Implementation
Fix the root cause systematically:
- Create a failing test case first (TDD)
- Implement a single fix addressing only the root cause
- Verify the fix resolves the issue without breaking other tests
- If the fix doesn't work, return to Phase 1
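The test-first step might look like the sketch below, which uses Python's standard `unittest` module. The function `parse_quantity` and its blank-input bug are a hypothetical example: the first test is written before the fix and fails with `ValueError`, then the explicit blank-input check is added to make it pass.

```python
import unittest


def parse_quantity(raw):
    """Hypothetical function under repair: blank input should mean zero."""
    text = raw.strip()
    if not text:  # root-cause fix: handle blank input explicitly
        return 0
    return int(text)


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_blank_input_is_zero(self):
        # Written first; it failed with ValueError before the fix above.
        self.assertEqual(parse_quantity("   "), 0)

    def test_existing_behaviour_unchanged(self):
        # Guards against the fix breaking input that already worked.
        self.assertEqual(parse_quantity(" 42 "), 42)
```

Running the whole suite, not just the new test, covers the "without breaking other tests" step; the regression test then stays in the suite so the same bug cannot return unnoticed.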
Red Flags - STOP Immediately
- Proposing fixes without understanding the issue
- Attempting multiple simultaneous changes
- Assuming problems without verification
- Skipping evidence gathering
- Making "quick fixes" before investigation
When 3+ Fixes Fail
STOP. This signals an architectural problem, not a fixable bug:
- Do not attempt another fix
- Return to Phase 1
- Question whether the underlying pattern/design is sound
- Ask: "Should we refactor the architecture rather than continue fixing symptoms?"
Random fixes waste time and create new bugs. Quick patches mask underlying issues.
Results
- Systematic approach: 15-30 minutes to resolution, with 95% first-time success
- Trial-and-error: 2-3 hours of thrashing, with 40% success and new bugs introduced