Hypothesis-Elimination Reasoning (HE)
Purpose: Systematic identification of root causes through evidence-based elimination. Unlike exploration methodologies (BoT, ToT), HE is designed to NARROW possibilities, not expand them.
When to Use Hypothesis-Elimination
✅ Use HE when:
- Problem has ONE correct answer among many possibilities
- Evidence can discriminate between hypotheses
- Time-critical diagnosis (production incidents, debugging)
- "What caused X?" questions
- Differential diagnosis scenarios
❌ Don't use HE when:
- Multiple solutions are equally valid (use BoT)
- Need to optimize among options (use ToT)
- No discriminating evidence available
- Creative/generative problems
Examples:
- "Why is the API returning 500 errors?" ✅
- "What's causing memory leaks in production?" ✅
- "Which database should we use?" ❌ (use ToT)
- "What features should we build?" ❌ (use BoT)
Core Methodology: 5-Phase HEDAM Process
Phase 1: Hypothesis Generation (Diverge)
Goal: Generate ALL plausible hypotheses without filtering
Process:
1. State the observable symptom precisely
2. Generate hypotheses across ALL relevant categories:
   - Recent changes (code, config, infrastructure)
   - External dependencies (APIs, services, network)
   - Resource exhaustion (memory, CPU, disk, connections)
   - Data issues (corruption, volume, format)
   - Timing/race conditions
   - Security incidents
   - Human error
   - Unknown/novel causes
3. For each hypothesis, note:
   - Mechanism: How would this cause the symptom?
   - Prior probability: Based on frequency in similar situations
   - Discriminating evidence: What would prove/disprove this?
Template:
## Hypothesis [N]: [Name]
- **Mechanism**: [How this causes the symptom]
- **Prior Probability**: [Low/Medium/High] - [Justification]
- **Supporting Evidence Needed**: [What would increase probability]
- **Eliminating Evidence Needed**: [What would rule this out]
Quantity: Generate 8-15 hypotheses. If fewer than 8, challenge assumptions.
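The template fields can also be captured as a small record, which makes the quantity rule easy to check mechanically. This is a minimal sketch: the names `Hypothesis`, `Prior`, and `needs_more_hypotheses` are hypothetical, and the example values are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum


class Prior(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass
class Hypothesis:
    name: str
    mechanism: str             # how this would cause the symptom
    prior: Prior               # rough prior probability
    justification: str         # why that prior was chosen
    supporting_evidence: str   # what would increase probability
    eliminating_evidence: str  # what would rule this out


def needs_more_hypotheses(hypotheses: list[Hypothesis], minimum: int = 8) -> bool:
    """True when the list is below the 8-hypothesis floor from the text."""
    return len(hypotheses) < minimum


# Illustrative entry only; the wording is an assumption, not incident data.
h = Hypothesis(
    name="Slow external API",
    mechanism="Upstream latency makes requests time out",
    prior=Prior.MEDIUM,
    justification="Seen in similar incidents",
    supporting_evidence="Latency spike that lines up with the errors",
    eliminating_evidence="Upstream latency flat during the incident",
)
print(needs_more_hypotheses([h]))  # a single hypothesis is below the floor
```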
Phase 2: Evidence Hierarchy Design
Goal: Design the most efficient evidence-gathering sequence
Principle: Gather DISCRIMINATING evidence first (evidence that eliminates multiple hypotheses)
Process:
1. List all potential evidence sources:
   - Logs (application, system, network)
   - Metrics (CPU, memory, latency, error rates)
   - Recent changes (git log, deployment history)
   - Reproduction attempts
   - User reports / patterns
   - External status pages
2. Score each evidence source:
   - Discrimination Power: How many hypotheses does this affect? (1-10)
   - Acquisition Cost: How long/difficult to obtain? (1-10, lower = easier)
   - Priority Score: Discrimination / Cost
3. Rank evidence sources by priority score
4. Design evidence-gathering sequence (highest priority first)
Example:
| Evidence Source | Discriminates | Cost | Priority |
|-----------------|---------------|------|----------|
| Error logs (last hour) | 8 hypotheses | 2 | 4.0 ⬅️ Second |
| Recent deployments | 5 hypotheses | 1 | 5.0 ⬅️ First |
| Memory metrics | 3 hypotheses | 2 | 1.5 |
| Network trace | 4 hypotheses | 6 | 0.67 |
| Full reproduction | 10 hypotheses | 8 | 1.25 |
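The scoring and ranking steps can be sketched in a few lines. The class name `EvidenceSource` and its field names are assumptions made for illustration; the numbers are the ones from the example table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSource:
    name: str
    discrimination: int  # how many hypotheses this evidence affects
    cost: int            # 1-10, lower = easier to obtain

    @property
    def priority(self) -> float:
        # Priority Score = Discrimination / Cost
        return self.discrimination / self.cost


sources = [
    EvidenceSource("Error logs (last hour)", 8, 2),
    EvidenceSource("Recent deployments", 5, 1),
    EvidenceSource("Memory metrics", 3, 2),
    EvidenceSource("Network trace", 4, 6),
    EvidenceSource("Full reproduction", 10, 8),
]

# Highest priority first: this is the evidence-gathering sequence.
ranked = sorted(sources, key=lambda s: s.priority, reverse=True)
for s in ranked:
    print(f"{s.name}: {s.priority:.2f}")
```

Sorted this way, the sequence is Recent deployments (5.00), Error logs (4.00), Memory metrics (1.50), Full reproduction (1.25), Network trace (0.67). Full reproduction affects the most hypotheses but ranks low because of its cost.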
Phase 3: Systematic Elimination
Goal: Eliminate hypotheses through evidence, not intuition
Process: For each evidence source (in priority order):
1. Gather evidence (read logs, check metrics, etc.)
2. Update ALL hypotheses:

   ### Evidence: [What was found]

   | Hypothesis | Impact | New Status |
   |------------|--------|------------|
   | H1: Memory leak | No memory growth seen | ELIMINATED |
   | H2: DB connection pool | Connection count normal | ELIMINATED |
   | H3: Slow external API | Latency spike at 14:32 | STRENGTHENED |
   | H4: Recent deployment | Deploy at 14:30 | STRENGTHENED |

3. Track elimination count: Stop when 1-2 hypotheses remain
4. Avoid confirmation bias: Actively seek evidence AGAINST remaining hypotheses
Elimination Criteria:
- ELIMINATED: Evidence directly contradicts mechanism
- WEAKENED: Evidence reduces probability but doesn't eliminate
- UNCHANGED: Evidence doesn't affect this hypothesis
- STRENGTHENED: Evidence increases probability
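The bookkeeping for this phase can be sketched as a status map that is updated after each piece of evidence. The function names are hypothetical. As a simplifying assumption, an eliminated hypothesis stays eliminated here; reopening after a refuted confirmation test is a separate decision and is not modeled.

```python
from enum import Enum


class Status(Enum):
    ELIMINATED = "ELIMINATED"
    WEAKENED = "WEAKENED"
    UNCHANGED = "UNCHANGED"
    STRENGTHENED = "STRENGTHENED"


def apply_evidence(statuses: dict[str, Status],
                   findings: dict[str, Status]) -> dict[str, Status]:
    """Return a new status map after one piece of evidence."""
    updated = dict(statuses)
    for name, new_status in findings.items():
        if updated.get(name) is Status.ELIMINATED:
            continue  # assumption: no reopening in this sketch
        if new_status is not Status.UNCHANGED:
            updated[name] = new_status
    return updated


def remaining(statuses: dict[str, Status]) -> list[str]:
    return [n for n, s in statuses.items() if s is not Status.ELIMINATED]


def ready_for_confirmation(statuses: dict[str, Status]) -> bool:
    # Stop eliminating when 1-2 hypotheses remain.
    return 1 <= len(remaining(statuses)) <= 2


names = ["H1: Memory leak", "H2: DB connection pool",
         "H3: Slow external API", "H4: Recent deployment"]
start = {n: Status.UNCHANGED for n in names}
after = apply_evidence(start, {
    "H1: Memory leak": Status.ELIMINATED,
    "H2: DB connection pool": Status.ELIMINATED,
    "H3: Slow external API": Status.STRENGTHENED,
    "H4: Recent deployment": Status.STRENGTHENED,
})
print(remaining(after), ready_for_confirmation(after))
```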
Phase 4: Confirmation Testing
Goal: Confirm the remaining hypothesis through targeted testing
Process:
1. For the leading hypothesis, identify:
   - Prediction: If this is the cause, what else should we observe?
   - Test: How can we verify this prediction?
   - Expected result: What confirms the hypothesis?
2. Execute confirmation test
3. Evaluate:
   - CONFIRMED: Prediction matched, mechanism verified
   - PARTIAL: Some predictions matched, uncertainty remains
   - REFUTED: Prediction failed, reopen eliminated hypotheses
Confirmation Checklist:
- [ ] Can we reproduce the issue with the identified cause?
- [ ] Does fixing the cause resolve the symptom?
- [ ] Does the timeline match (cause preceded symptom)?
- [ ] Is the mechanism physically/logically possible?
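One way to make the Evaluate step explicit is to record each tested prediction as a pass/fail result and derive the outcome from the set. This is a sketch under an assumption the text does not state: all checks passing counts as CONFIRMED, a mix counts as PARTIAL, and none passing counts as REFUTED. The names `Outcome` and `evaluate` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    CONFIRMED = "CONFIRMED"
    PARTIAL = "PARTIAL"
    REFUTED = "REFUTED"


def evaluate(checks: dict[str, bool]) -> Outcome:
    """Map pass/fail results of confirmation checks to an outcome."""
    if not checks:
        raise ValueError("at least one prediction must be tested")
    matched = sum(checks.values())
    if matched == len(checks):
        return Outcome.CONFIRMED
    if matched > 0:
        return Outcome.PARTIAL
    return Outcome.REFUTED  # caller should reopen eliminated hypotheses


# Illustrative results keyed by the checklist questions.
result = evaluate({
    "reproduces with identified cause": True,
    "fix resolves symptom": True,
    "timeline matches": True,
    "mechanism is possible": False,
})
print(result.value)
```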
Phase 5: Root Cause Documentation
Goal: Document findings for future reference and prevention
Template:
## Root Cause Analysis: [Issue Title]

### Summary
- **Symptom**: [What was observed]
- **Root Cause**: [Confirmed cause]
- **Mechanism**: [How the cause produced the symptom]
- **Timeline**: [When cause occurred, when symptom appeared]

### Elimination Path
1. Started with [N] hypotheses
2. [Evidence 1] eliminated [X] hypotheses
3. [Evidence 2] eliminated [Y] hypotheses
4. Confirmed via [Test]

### Hypotheses Considered and Eliminated
| Hypothesis | Eliminated By | Key Evidence |
|------------|---------------|--------------|
| H1 | Evidence 1 | [Specific finding] |
| H2 | Evidence 2 | [Specific finding] |

### Prevention
- [ ] [Action to prevent recurrence]
- [ ] [Monitoring to detect earlier]

### Confidence: [X]%
- [Justification for confidence level]
Time-Critical Mode (Incident Response)
When time is critical, use accelerated HE:
5-Minute Triage:
- Check last 3 deployments (30 sec)
- Check external dependency status pages (30 sec)
- Check error rate spike timing (1 min)
- Check resource exhaustion (CPU, mem, disk) (1 min)
- Check for similar recent incidents (1 min)
- Form top-2 hypotheses (1 min)
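The step budgets above add up to exactly five minutes. A short sketch makes the arithmetic explicit and turns the list into a running schedule; the names `TRIAGE_STEPS` and `schedule` are hypothetical.

```python
# Triage steps and their time budgets in seconds, as listed above.
TRIAGE_STEPS = [
    ("Check last 3 deployments", 30),
    ("Check external dependency status pages", 30),
    ("Check error rate spike timing", 60),
    ("Check resource exhaustion (CPU, mem, disk)", 60),
    ("Check for similar recent incidents", 60),
    ("Form top-2 hypotheses", 60),
]

total_seconds = sum(seconds for _, seconds in TRIAGE_STEPS)


def schedule(steps):
    """Return (start_offset_seconds, step) pairs for the checklist."""
    elapsed = 0
    plan = []
    for step, seconds in steps:
        plan.append((elapsed, step))
        elapsed += seconds
    return plan


for offset, step in schedule(TRIAGE_STEPS):
    print(f"{offset // 60}:{offset % 60:02d}  {step}")
```

With these budgets, hypothesis formation starts at the 4:00 mark and the triage ends at 5:00.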
Parallel Elimination:
- Assign different team members to different evidence sources
- Use chat/war room for real-time hypothesis updates
- Timebox each investigation track (10 min max)
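The text describes people working in parallel, but where evidence sources can be queried by scripts, the same timeboxing idea can be sketched with `asyncio`. This is an illustration only: `run_track`, `run_tracks`, and the two check functions are hypothetical stand-ins, and the demo uses a 0.1-second timebox in place of the 10-minute limit.

```python
import asyncio


async def run_track(name, check, timebox_s):
    """Run one investigation track; report TIMED_OUT if it overruns."""
    try:
        finding = await asyncio.wait_for(check(), timeout=timebox_s)
        return name, finding
    except asyncio.TimeoutError:
        return name, "TIMED_OUT"


async def run_tracks(tracks, timebox_s=600.0):
    """Run all tracks concurrently, each under the same timebox."""
    pairs = await asyncio.gather(
        *(run_track(name, check, timebox_s) for name, check in tracks.items())
    )
    return dict(pairs)


# Hypothetical stand-ins for real evidence checks.
async def check_deployments():
    await asyncio.sleep(0.01)
    return "Deploy at 14:30"


async def check_network_trace():
    await asyncio.sleep(1.0)  # deliberately slower than the demo timebox
    return "trace complete"


findings = asyncio.run(run_tracks(
    {"Recent deployments": check_deployments,
     "Network trace": check_network_trace},
    timebox_s=0.1,
))
print(findings)
```

A track that times out is reported rather than silently dropped, so the team can decide whether to extend it or move on.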
Common Mistakes
1. Premature Convergence: Latching onto first plausible hypothesis
   - Fix: Force generation of 8+ hypotheses before investigating
2. Confirmation Bias: Seeking evidence FOR favorite hypothesis
   - Fix: Actively try to DISPROVE remaining hypotheses
3. Ignoring Low-Probability Causes: Novel causes get eliminated by assumption
   - Fix: Keep "Unknown/Novel" as permanent hypothesis until confirmed
4. Evidence Tunnel Vision: Only looking at familiar evidence sources
   - Fix: Use the Evidence Hierarchy Design phase systematically
5. Incomplete Elimination: Declaring victory with 3+ hypotheses remaining
   - Fix: Require 1-2 remaining before confirmation phase
Integration with Other Patterns
HE → SRC: After identifying root cause, use Self-Reflecting Chain to trace the exact failure path
BoT → HE: If problem is "what could go wrong?", use BoT first to generate failure modes, then HE when a failure occurs
HE → ToT: After finding root cause, use ToT to evaluate fix options
Confidence Calibration
| Remaining Hypotheses | Max Confidence |
|---|---|
| 1 (confirmed) | 90-95% |
| 2 (one leading) | 70-80% |
| 3+ | <60% - need more evidence |
| All eliminated | 0% - missing hypothesis, restart |
Quick Reference
HEDAM Process:
- H - Hypothesis Generation (8-15 possibilities)
- E - Evidence Hierarchy (prioritize discriminating evidence)
- D - Discrimination/Elimination (update all hypotheses per evidence)
- A - Assertion/Confirmation (test leading hypothesis)
- M - Memorialize (document for future)