Systematic Debugging
Overview
A four-phase process: reproduce, isolate, identify the root cause, and fix with a test.
Core principle: Random fixes waste hours. Systematic tracing takes minutes.
Announce at start: "I'm using the nobody-debugs skill to trace this."
Phase 1: Reproduce
Before anything else, get a reliable reproduction:
- Find the exact command/action that triggers the bug
- Note the exact error message and behavior
- Confirm it reproduces consistently
- If intermittent, note frequency and conditions
Cannot reproduce? Gather more data before proceeding. Don't guess.
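For an intermittent bug, "note frequency and conditions" can be made concrete by running the trigger many times and counting failures. The sketch below is one way to do that, assuming the bug shows up as a non-zero exit status; `measure_failure_rate` and the flaky stand-in command are hypothetical names for illustration, not part of any existing tool.

```python
import subprocess
import sys


def measure_failure_rate(command, runs=20, timeout=60):
    """Run `command` repeatedly and report how often it fails.

    A run counts as a failure when the process exits with a non-zero status.
    Returns (failures, runs, sorted distinct final stderr lines).
    """
    failures = 0
    messages = set()
    for _ in range(runs):
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
        if result.returncode != 0:
            failures += 1
            lines = result.stderr.strip().splitlines()
            if lines:
                # Keep only the last stderr line, usually the error itself.
                messages.add(lines[-1])
    return failures, runs, sorted(messages)


if __name__ == "__main__":
    # Hypothetical stand-in for the real command: fails on roughly one run in three.
    flaky = [
        sys.executable,
        "-c",
        "import random, sys; sys.exit('boom' if random.random() < 0.33 else 0)",
    ]
    failures, runs, messages = measure_failure_rate(flaky)
    print(f"{failures}/{runs} runs failed; distinct errors: {messages}")
```

Replace the stand-in with the exact command found in the first step. If more than one distinct error appears, there may be more than one bug, which is worth knowing before moving on to isolation.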
Phase 2: Isolate
Narrow down where the problem lives:
- Binary search: comment out half the code. Does it still fail?
- Simplify inputs: find the minimal input that triggers the bug
- Check boundaries: which module/function/line is the boundary between working and broken?
- Read error traces: follow the stack trace to the actual failure point
Goal: "The bug is in function X, triggered when Y happens."
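The binary-search and simplify-inputs steps can be automated when the failure can be checked by a function. The sketch below is a simplified, delta-debugging-style reducer, assuming the input is a list and the check is deterministic; `minimize_failing_input` and the `fails` predicate are hypothetical names used only for illustration.

```python
def minimize_failing_input(items, fails):
    """Shrink `items` to a smaller list for which `fails` still returns True.

    Tries to drop chunks of the input, starting with half of it and halving
    the chunk size after each pass. At chunk size 1 it repeats until no
    single element can be removed without the failure disappearing.
    """
    items = list(items)
    if not fails(items):
        raise ValueError("the starting input does not trigger the bug")

    chunk = max(len(items) // 2, 1)
    while True:
        i = 0
        removed = False
        while i < len(items):
            candidate = items[:i] + items[i + chunk:]
            if fails(candidate):
                # The dropped chunk was irrelevant; keep the smaller input.
                items = candidate
                removed = True
            else:
                # The chunk is needed to trigger the bug; move past it.
                i += chunk
        if chunk > 1:
            chunk //= 2
        elif not removed:
            return items


if __name__ == "__main__":
    # Hypothetical bug: the failure needs both 3 and 7 to be present.
    def fails(values):
        return 3 in values and 7 in values

    print(minimize_failing_input(range(10), fails))  # prints [3, 7]
```

The result is minimal only in the sense that removing any one remaining element makes the failure go away. A flaky predicate will give misleading results, so this belongs after Phase 1, once the reproduction is reliable.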
Phase 3: Root Cause
Trace backward from the symptom to the origin:
- Start at the crash/error point
- Ask: "What state caused this to fail?"
- Trace that state backward: "Where was this value set?"
- Repeat until you find the original cause
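When "Where was this value set?" is hard to answer by reading the code, the writes themselves can be recorded. The sketch below is one possible approach in Python, using a descriptor that logs the caller of every assignment; `TracedAttribute`, `Order`, and `cancel_items` are hypothetical names invented for this example.

```python
import traceback


class TracedAttribute:
    """Data descriptor that records the call site of every assignment."""

    def __set_name__(self, owner, name):
        self.name = name
        self.storage = "_traced_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        # The last stack entry is this method; the one before it is the
        # frame that performed the assignment.
        caller = traceback.extract_stack()[-2]
        history = obj.__dict__.setdefault("_history_" + self.name, [])
        history.append((value, caller.name, caller.lineno))
        setattr(obj, self.storage, value)


def assignment_history(obj, name):
    """Return the recorded (value, function, line) tuples for an attribute."""
    return obj.__dict__.get("_history_" + name, [])


class Order:  # hypothetical class under investigation
    quantity = TracedAttribute()

    def __init__(self, quantity):
        self.quantity = quantity


def cancel_items(order, count):
    # Hypothetical buggy write: nothing stops the quantity going negative.
    order.quantity -= count


if __name__ == "__main__":
    order = Order(3)
    cancel_items(order, 5)
    for value, function, line in assignment_history(order, "quantity"):
        print(f"quantity = {value!r} set in {function}() at line {line}")
```

Reading the history from the last entry backward mirrors the tracing steps above: the final bad value points at the function that wrote it, and that function's inputs are the next state to question.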
Root cause checklist:
- Can you explain WHY the bug happens, not just WHERE?
- If you fix this, will the symptom definitely disappear?
- Is this the deepest cause, or just another symptom?
Common traps:
- Fixing the symptom, not the cause
- Stopping at the first suspicious thing
- Assuming without evidence
Phase 4: Fix with Test
- Write a failing test that reproduces the bug (TDD RED)
- Watch it fail with the exact symptom
- Apply the minimal fix that addresses the root cause
- Watch it pass (TDD GREEN)
- Check for regressions: run the full test suite
- Add defense: validation, assertions, logging at the failure point
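As an illustration of the steps above, here is a minimal sketch using Python's `unittest`. The `average` function and its bug are hypothetical: assume the original version divided by `len(values)` unconditionally, so an empty list crashed with `ZeroDivisionError`.

```python
import unittest


def average(values):
    """Return the arithmetic mean of `values`.

    Hypothetical root cause: the earlier version had no check for an empty
    list and crashed with ZeroDivisionError on the division below.
    """
    if not values:
        # Minimal fix plus defense at the failure point: reject the bad
        # input with a clear message instead of crashing later.
        raise ValueError("average() requires at least one value")
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_raises_clear_error(self):
        # RED before the fix: this errored with ZeroDivisionError.
        # GREEN after the fix: a ValueError is raised instead.
        with self.assertRaises(ValueError):
            average([])

    def test_existing_behavior_unchanged(self):
        # Guards against the fix breaking the normal path.
        self.assertEqual(average([2, 4, 6]), 4)
```

Running the file with `python -m unittest` before applying the guard should show the first test failing with the original symptom, which confirms the test really reproduces the bug rather than passing by accident.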
When Stuck
| Problem | Action |
|---|---|
| Can't reproduce | Add logging, try different environments |
| Too many variables | Isolate one variable at a time |
| Fix broke something else | Revert, understand dependencies first |
| "No root cause found" | About 95% of the time this means the investigation is incomplete; keep tracing |
Red Flags
- Proposing fixes before understanding the root cause
- "Let me try this and see if it works"
- Changing multiple things at once
- Not writing a regression test
- Fixing symptoms instead of causes
Real-World Impact
- Systematic approach: 15–30 minutes to fix
- Random-fix approach: 2–3 hours of thrashing
- First-time fix rate: ~95% (systematic) vs ~40% (random fixes)
- New bugs introduced: near zero (systematic) vs common (random fixes)