AgentSkillsCN

debugging-methodology

基于证据的调试,遵循铁律纪律。先仪器再猜测,先追踪再理论。适用场景:遇到任何bug、测试失败或意外行为——在提出修复方案前。

SKILL.md
--- frontmatter
name: debugging-methodology
description: "Evidence-based debugging with Iron Law discipline. Instrument before guessing, trace before theorizing. Use when encountering any bug, test failure, or unexpected behavior - before proposing fixes."
version: 1.1.0

Debugging Methodology

Stop guessing. Add instrumentation. Follow the evidence.

The Iron Law

code
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST

If you haven't gathered evidence, you cannot propose fixes.

Rationale: Random fixes waste time and create new bugs. Quick patches mask underlying issues. Systematic debugging is FASTER than guess-and-check thrashing.

Core Principle

Measure, don't speculate. When something fails, the answer is never "maybe it's X" followed by random changes. The answer is: add instrumentation that produces the specific data needed to explain the failure.

The Anti-Pattern

code
Problem occurs
  → "Maybe it's a race condition"
  → Change something random
  → Still broken
  → "Could be caching"
  → Change something else
  → User frustrated, hours wasted

Why this happens: Insufficient data. You're debugging blind.

The Protocol

1. Document the Symptom Exactly

typescript
// WRONG: "The API returns an error"
// RIGHT: Exact symptom with context

/*
Symptom: getUser returns err('DB_ERROR')
Expected: ok(user) with { id: '123', name: 'Alice' }
Actual: err('DB_ERROR')
Context: userId='123', environment=test, after migration
Error message: "Connection refused to localhost:5432"
*/

Capture:

  • Exact error message (copy-paste, don't paraphrase)
  • Expected vs actual behavior
  • Input values that triggered the failure
  • Environment context

2. Add Instrumentation FIRST

Before forming ANY hypothesis, make the invisible visible:

typescript
// BEFORE: Silent failure
async function getUser(
  args: { userId: string },
  deps: GetUserDeps
): Promise<Result<User, 'NOT_FOUND' | 'DB_ERROR'>> {
  const user = await deps.db.findUser(args.userId);
  if (!user) return err('NOT_FOUND');
  return ok(user);
}

// AFTER: Observable failure
async function getUser(
  args: { userId: string },
  deps: GetUserDeps
): Promise<Result<User, 'NOT_FOUND' | 'DB_ERROR'>> {
  deps.logger.debug('getUser called', { userId: args.userId });

  try {
    deps.logger.debug('Calling db.findUser', { userId: args.userId });
    const user = await deps.db.findUser(args.userId);
    deps.logger.debug('db.findUser returned', {
      userId: args.userId,
      found: !!user,
      userKeys: user ? Object.keys(user) : null
    });

    if (!user) {
      deps.logger.info('User not found', { userId: args.userId });
      return err('NOT_FOUND');
    }

    deps.logger.debug('Returning user', { userId: user.id });
    return ok(user);
  } catch (error) {
    deps.logger.error('Database error in getUser', {
      userId: args.userId,
      error: error instanceof Error ? error.message : String(error),
      stack: error instanceof Error ? error.stack : undefined
    });
    return err('DB_ERROR');
  }
}

3. Instrument at Decision Points

Add logging at every branch, every external call, every state change:

PointWhat to Log
Function entryAll args, relevant deps state
External callsRequest params, before and after
ConditionalsWhich branch taken, why
TransformationsInput shape, output shape
Error pathsFull error object, context
Function exitReturn value, duration

4. Use OpenTelemetry Spans

For complex flows, use tracing:

typescript
import { trace } from 'autotel';

async function processOrder(
  args: { orderId: string },
  deps: ProcessOrderDeps
): Promise<Result<Order, OrderError>> {
  return trace('processOrder', { orderId: args.orderId }, async (span) => {
    span.setAttribute('order.id', args.orderId);

    const validationResult = await trace('validateOrder', async () => {
      return validateOrder(args, deps);
    });

    if (!validationResult.ok) {
      span.setAttribute('error.type', validationResult.error);
      return validationResult;
    }

    const paymentResult = await trace('processPayment', async () => {
      return processPayment({ order: validationResult.value }, deps);
    });

    span.setAttribute('payment.success', paymentResult.ok);
    return paymentResult;
  });
}

5. Form Evidence-Based Hypothesis

Only AFTER you have data:

code
Evidence from logs:
- getUser called with userId='123' ✓
- db.findUser called ✓
- db.findUser threw: "Connection refused to localhost:5432"

Hypothesis: Database connection failed
Evidence: Error message explicitly says "Connection refused"
Test: Check if database is running, check connection string

Hypothesis must:

  • Be based on observed data (not speculation)
  • Explain ALL symptoms
  • Be testable with a single change

6. One Change at a Time

code
WRONG:
  → Change connection string AND add retry AND increase timeout
  → "It works now!" (but which fix was it?)

RIGHT:
  → Check database is running (it wasn't)
  → Start database
  → Retest → Works
  → Root cause: Database container wasn't started

Result Type Debugging

When debugging Result-returning functions:

Check the Error Path

typescript
// Add explicit logging for err() returns
if (!user) {
  const errorContext = {
    userId: args.userId,
    dbQueryExecuted: true,
    queryDuration: Date.now() - startTime,
  };
  deps.logger.info('Returning NOT_FOUND', errorContext);
  return err('NOT_FOUND');
}

Trace Through Workflows

typescript
// When createWorkflow fails, add step-level logging
const workflow = createWorkflow()
  .pipe(step('validate', async (input) => {
    deps.logger.debug('Step: validate', { input });
    const result = await validate(input, deps);
    deps.logger.debug('Step: validate result', { ok: result.ok });
    return result;
  }))
  .pipe(step('process', async (input) => {
    deps.logger.debug('Step: process', { input });
    const result = await process(input, deps);
    deps.logger.debug('Step: process result', { ok: result.ok });
    return result;
  }));

Verify Deps Are Correct

typescript
// When mocking in tests, verify deps behavior
it('returns NOT_FOUND when user missing', async () => {
  const deps = mock<GetUserDeps>();
  deps.db.findUser.mockResolvedValue(null);

  // Debug: verify mock is configured correctly
  console.log('Mock configured:', {
    findUser: deps.db.findUser.getMockName(),
    mockReturnValue: await deps.db.findUser('test'),
  });

  const result = await getUser({ userId: '123' }, deps);

  // Debug: verify what was called
  console.log('findUser called with:', deps.db.findUser.mock.calls);
  console.log('Result:', result);

  expect(result.ok).toBe(false);
});

Debugging Checklist

Before guessing, answer these:

QuestionHow to Answer
What are the input values?Log function entry
Which code path executed?Log at conditionals
What did external calls return?Log before/after deps calls
What error was thrown?Log in catch block with stack
What was the return value?Log function exit

Anti-Patterns

Never Guess

code
WRONG: "It's probably a race condition"
RIGHT: "Let me add timestamps to see the execution order"

WRONG: "Maybe the mock isn't working"
RIGHT: "Let me log what the mock returns when called"

WRONG: "Could be a caching issue"
RIGHT: "Let me log cache hits/misses with keys"

Never Change Multiple Things

code
WRONG: Fix A, B, and C → "It works!" (which one fixed it?)
RIGHT: Fix A → Test → Still broken → Fix B → Test → Works (B was the issue)

Never Remove Instrumentation Prematurely

code
WRONG: Problem seems fixed → Delete all logging
RIGHT: Problem confirmed fixed → Keep structured logging, remove debug-level only

Never Assume Code Does What It Says

typescript
// Function is CALLED getUser but does it RETURN user?
async function getUser(args, deps) {
  // Don't assume. Log the actual return value.
  const result = /* ... */;
  deps.logger.debug('getUser returning', { result });
  return result;
}

Integration with Testing

Debug Failing Tests Systematically

typescript
it('processes order correctly', async () => {
  // 1. Log test setup
  console.log('=== TEST: processes order correctly ===');

  const deps = mock<ProcessOrderDeps>();
  deps.db.findOrder.mockResolvedValue(mockOrder);
  console.log('Deps configured:', {
    findOrderReturns: mockOrder
  });

  // 2. Log function call
  console.log('Calling processOrder with:', { orderId: '123' });
  const result = await processOrder({ orderId: '123' }, deps);

  // 3. Log result before assertion
  console.log('Result:', JSON.stringify(result, null, 2));

  // 4. Now assert
  expect(result.ok).toBe(true);
});

Verify Mock Interactions

typescript
// After test fails, check what was actually called
console.log('Mock calls:', {
  findOrder: deps.db.findOrder.mock.calls,
  saveOrder: deps.db.saveOrder.mock.calls,
  sendEmail: deps.mailer.send.mock.calls,
});

Decision Tree

code
Problem occurs
    │
    ▼
Can you see exact failure point?
    NO → Add function entry/exit logging
    YES ↓
Do you know input values at failure?
    NO → Log all args and deps state
    YES ↓
Do you know which code path executed?
    NO → Log at every conditional
    YES ↓
Do you know what external calls returned?
    NO → Log before/after deps calls
    YES ↓
Do you know the error details?
    NO → Log full error with stack
    YES ↓
Form hypothesis from evidence → Test with ONE change

Quick Reference

SymptomInstrumentation to Add
Wrong return valueLog at every return statement
Function never calledLog at caller and callee entry
External call failsLog request params and response
Intermittent failureAdd timestamps, log state
Test passes locally, fails in CILog environment, deps versions
Result is err() unexpectedlyLog at every err() return

Debugging Workflows

When a createWorkflow() pipeline fails, debug systematically:

1. Identify Which Step Failed

typescript
const workflow = createWorkflow({ validateOrder, processPayment, sendConfirmation });

const result = await workflow(async (step) => {
  deps.logger.debug('Starting workflow');

  const validated = await step(() => validateOrder(args, deps));
  deps.logger.debug('validateOrder completed', { ok: validated !== undefined });

  const payment = await step(() => processPayment({ order: validated }, deps));
  deps.logger.debug('processPayment completed', { ok: payment !== undefined });

  const confirmation = await step(() => sendConfirmation({ order: validated, payment }, deps));
  deps.logger.debug('sendConfirmation completed', { ok: confirmation !== undefined });

  return { order: validated, confirmation };
});

if (!result.ok) {
  deps.logger.error('Workflow failed', { error: result.error });
  // Error type tells you which step failed
}

2. Use Existing Traces

If you have OpenTelemetry set up (see observability skill), use existing traces:

code
In Jaeger/Honeycomb:
1. Find the trace by traceId or time range
2. Look for the RED span (error status)
3. Expand span attributes for error details
4. Check parent spans for context

Query example:
  service.name = "my-service"
  status = ERROR
  span.kind = INTERNAL

3. Debug step.try() Failures

When third-party code throws:

typescript
const config = await step.try(
  () => {
    deps.logger.debug('Parsing config JSON', { raw: user.configJson });
    return JSON.parse(user.configJson);
  },
  {
    error: 'INVALID_CONFIG' as const,
    onError: (e) => {
      // Log the actual exception before it's converted to Result
      deps.logger.error('JSON.parse failed', {
        error: e instanceof Error ? e.message : String(e),
        raw: user.configJson?.substring(0, 100),  // First 100 chars
      });
    },
  }
);

Integration with Observability

When debugging in production, leverage your existing traces:

  1. Find the trace - Use traceId from logs or error reports
  2. Walk the spans - Each step() creates a child span
  3. Check attributes - Look for error.type, error.message
  4. Correlate with logs - Use traceId to find related log entries
typescript
// In your error handler, include trace context
app.use((err, req, res, next) => {
  const span = trace.getSpan(context.active());
  logger.error('Request failed', {
    error: err.message,
    traceId: span?.spanContext().traceId,
    spanId: span?.spanContext().spanId,
    path: req.path,
  });
  res.status(500).json({ error: 'Internal error', traceId: span?.spanContext().traceId });
});

Red Flags - STOP and Follow Process

If you catch yourself thinking:

  • "Quick fix for now, investigate later"
  • "Just try changing X and see if it works"
  • "Add multiple changes, run tests"
  • "It's probably X, let me fix that"
  • "I don't fully understand but this might work"
  • Proposing solutions before tracing data flow
  • "One more fix attempt" (when already tried 2+)

ALL of these mean: STOP. Return to gathering evidence.

The 3-Fix Rule

If 3+ fixes failed:

  1. STOP attempting more fixes
  2. Question the architecture
  3. Consider: Is the pattern fundamentally sound?
  4. Discuss with team before attempting more fixes

Pattern indicating architectural problem:

  • Each fix reveals new problems in different places
  • Fixes require massive refactoring
  • Each fix creates new symptoms elsewhere

Rationalization Prevention

ExcuseReality
"Issue is simple, don't need process"Simple issues have root causes too
"Emergency, no time for process"Systematic debugging is FASTER than thrashing
"Just try this first, then investigate"First fix sets the pattern. Do it right.
"Multiple fixes at once saves time"Can't isolate what worked. Causes new bugs.
"I see the problem, let me fix it"Seeing symptoms ≠ understanding root cause
"One more fix attempt" (after 2+ failures)3+ failures = architectural problem

Remember

  • Evidence before hypothesis - Don't guess, measure
  • One change at a time - Know what fixed it
  • Structured logging pays off - Pino + OpenTelemetry
  • Keep observability - Debug logs become production traces
  • Use existing traces - Don't re-instrument if traces already exist