Tree of Thoughts Reasoning Methodology
Purpose: Systematic parallel exploration of solution spaces through recursive branching, self-reflection, and rigorous evaluation. Use this methodology when facing complex problems with multiple viable solution paths.
When to Use Tree of Thoughts
✅ Use ToT when:
- Problem has multiple viable solution approaches (3+ fundamentally different paths)
- Need to find optimal solution, not just any solution
- Can define clear evaluation criteria
- Complexity justifies systematic exploration
- Trade-offs exist between competing approaches
- Strategic or architectural decisions with long-term impact
❌ Don't use ToT when:
- Problem has obvious single solution path
- Time-critical decisions with simple trade-offs
- Problem is well-defined with standard solution
- Exploratory work where breadth matters more than depth
Examples:
- "Should we use REST, GraphQL, or gRPC?" (3 paths, clear trade-offs) ✅
- "Design distributed caching system balancing latency, consistency, cost" (multi-dimensional) ✅
- "Fix this syntax error" (single path) ❌
- "Research all available databases" (breadth-of-thought better) ❌
Core Methodology: 5-Step Process
Step 1: Problem Decomposition (5+ Branches)
Objective: Identify 5+ fundamentally different approaches to explore
Actions:
- Analyze problem to identify key dimensions (technical, organizational, risk, cost, timeline)
- Brainstorm 5-10 distinct approaches (not variations of same approach)
- Define evaluation criteria from problem constraints
- Validate diversity: Each branch explores different solution philosophy
Example (Distributed Caching):
    Branch A: Write-through consistency (strong consistency, higher latency)
    Branch B: Eventual consistency (performance, weaker guarantees)
    Branch C: Hybrid tiered (hot data write-through, cold data eventual)
    Branch D: Edge-centric (CDN-style, geography-aware)
    Branch E: Cost-optimized minimal (single region, no replication)
Deliverable: 5+ distinct approach definitions
Step 2: Parallel Branch Exploration
Objective: Explore each branch systematically with self-reflection
For each branch:
- Analyze approach against problem requirements
- Consider strengths, weaknesses, trade-offs
- Identify assumptions and constraints
- End with self-reflection (see template below)
Self-Reflection Template (REQUIRED for each branch):
    ## Branch [X]: [Approach Name]

    [Analysis of this approach: 2-4 paragraphs covering requirements, strengths, weaknesses, trade-offs]

    ### Self-Reflection
    - **Confidence**: [0-100]/100
    - **Strengths**: [What makes this approach compelling]
    - **Weaknesses**: [Gaps, assumptions, limitations]
    - **Trade-offs**: [What you gain vs what you lose]
    - **Recommendation**: [Continue deeper exploration? Prune? Why?]
Execution Options:
- With Task tool: Spawn 5+ parallel tasks for independent exploration
- Without Task tool: Explore branches sequentially using TodoWrite to track progress
- Hybrid: Use Task for complex branches, sequential for simple ones
Deliverable: 5+ explored branches with self-reflections
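The self-reflection template above can also be captured as a small data structure, which makes it easy to check that every branch actually carries one. The following is a minimal sketch, not part of the methodology itself; the class name `SelfReflection` and its field names are hypothetical and simply mirror the template headings.

```python
from dataclasses import dataclass


@dataclass
class SelfReflection:
    """One branch's self-reflection, mirroring the template fields (hypothetical class)."""

    branch_id: str
    approach: str
    confidence: int  # 0-100, as in the template
    strengths: str
    weaknesses: str
    trade_offs: str
    recommendation: str

    def __post_init__(self) -> None:
        # Reject confidence values outside the template's 0-100 range.
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")

    def render(self) -> str:
        """Return the heading and self-reflection section as Markdown text."""
        return "\n".join([
            f"## Branch {self.branch_id}: {self.approach}",
            "",
            "### Self-Reflection",
            f"- **Confidence**: {self.confidence}/100",
            f"- **Strengths**: {self.strengths}",
            f"- **Weaknesses**: {self.weaknesses}",
            f"- **Trade-offs**: {self.trade_offs}",
            f"- **Recommendation**: {self.recommendation}",
        ])


# Illustrative values only; they are not results from a real exploration.
reflection = SelfReflection(
    branch_id="B",
    approach="Eventual consistency",
    confidence=78,
    strengths="Low write latency",
    weaknesses="Readers may see stale data",
    trade_offs="Performance gained, strong guarantees lost",
    recommendation="Continue deeper exploration",
)
print(reflection.render())
```

The analysis paragraphs of the template are left out on purpose; only the structured part is modeled here.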
Step 3: Branch Evaluation (Scoring)
Objective: Systematically evaluate all branches against criteria
Evaluation Criteria (100 points total, 5 categories × 20 points):
- Novelty (0-20): Does it explore new solution space vs obvious approaches?
  - 18-20: Innovative approach, fresh perspective
  - 12-17: Good approach with some novel elements
  - 6-11: Standard approach, minor tweaks
  - 0-5: Obvious/conventional approach
- Feasibility (0-20): Practically implementable with reasonable resources?
  - 18-20: Proven technology, clear implementation path
  - 12-17: Feasible with moderate effort/risk
  - 6-11: Significant technical challenges
  - 0-5: Impractical or resource-intensive
- Completeness (0-20): Addresses all stated requirements?
  - 18-20: Covers all requirements comprehensively
  - 12-17: Covers most requirements, minor gaps
  - 6-11: Missing key requirements
  - 0-5: Incomplete solution
- Confidence (0-20): Branch's self-reflection confidence score?
  - 18-20: High confidence (80-100%) with justification
  - 12-17: Medium confidence (60-79%)
  - 6-11: Low confidence (40-59%)
  - 0-5: Very low confidence (<40%)
- Alignment (0-20): Matches problem constraints and context?
  - 18-20: Perfect fit for constraints
  - 12-17: Good fit, minor misalignment
  - 6-11: Notable misalignment
  - 0-5: Poor fit for context
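The Confidence criterion above maps a branch's 0-100 self-reflection score onto 0-20 points in bands. One possible way to make that mapping repeatable is sketched below. The band boundaries come from the rubric; the linear interpolation inside each band, the integer-percentage input, and the names `BANDS` and `confidence_points` are assumptions made for illustration.

```python
# (min_pct, max_pct, min_points, max_points) for each band of the Confidence rubric.
BANDS = [
    (80, 100, 18, 20),  # high confidence
    (60, 79, 12, 17),   # medium confidence
    (40, 59, 6, 11),    # low confidence
    (0, 39, 0, 5),      # very low confidence
]


def confidence_points(confidence_pct: int) -> int:
    """Map an integer self-reflection confidence (0-100) to rubric points (0-20).

    Assumption: points are interpolated linearly within a band; the rubric
    itself only fixes the band boundaries.
    """
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence_pct must be between 0 and 100")
    for lo_pct, hi_pct, lo_pts, hi_pts in BANDS:
        if lo_pct <= confidence_pct <= hi_pct:
            fraction = (confidence_pct - lo_pct) / (hi_pct - lo_pct)
            return lo_pts + round(fraction * (hi_pts - lo_pts))
    raise ValueError("no band matched")  # not reachable for integer input


print(confidence_points(100), confidence_points(79), confidence_points(40))
```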
Scoring Process:
- Review each branch's analysis and self-reflection
- Score each branch on all 5 criteria (0-20 per criterion)
- Calculate total score (0-100) for each branch
- Rank branches by total score
- Select highest-scoring branch for deeper exploration
Deliverable: Scored ranking of all branches, winner selected
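The scoring process above (score five criteria, total, rank, select) can be sketched in a few lines. This is an illustrative sketch only: the function names `total_score` and `rank_branches` are hypothetical, and the scores in the example are made-up numbers, not an evaluation of real branches.

```python
CRITERIA = ("novelty", "feasibility", "completeness", "confidence", "alignment")


def total_score(scores: dict[str, int]) -> int:
    """Sum the five criterion scores (0-20 each) into a 0-100 total."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for criterion in CRITERIA:
        if not 0 <= scores[criterion] <= 20:
            raise ValueError(f"{criterion} must be between 0 and 20")
    return sum(scores[c] for c in CRITERIA)


def rank_branches(branches: dict[str, dict[str, int]]) -> list[tuple[str, int]]:
    """Return (branch, total) pairs sorted from highest to lowest total."""
    totals = [(name, total_score(scores)) for name, scores in branches.items()]
    return sorted(totals, key=lambda item: item[1], reverse=True)


# Made-up scores for three branches, for illustration only.
branches = {
    "A": {"novelty": 10, "feasibility": 18, "completeness": 16, "confidence": 15, "alignment": 14},
    "B": {"novelty": 14, "feasibility": 17, "completeness": 17, "confidence": 16, "alignment": 18},
    "C": {"novelty": 16, "feasibility": 13, "completeness": 15, "confidence": 12, "alignment": 15},
}
ranking = rank_branches(branches)
winner, winner_total = ranking[0]
print(ranking)  # [('B', 82), ('A', 73), ('C', 71)]
```

Ties keep their original order because Python's sort is stable; how ties should really be broken is left open by the methodology.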
Step 4: Recursive Depth Exploration (Level 1+)
Objective: Recursively expand the best branch
Actions:
- Take highest-scoring branch from Step 3
- Decompose that branch into 5+ sub-approaches or refinements
- Repeat Steps 2-3 for the new level (explore → evaluate → select)
- Continue recursion until stopping criteria met
Minimum Depth: 4 levels (Level 0 → 1 → 2 → 3)
Level Transition Example:
    Level 0: "Distributed caching system" (5 approaches)
    → Winner: Branch B (Eventual consistency)

    Level 1: "Eventual consistency variants" (5 refinements)
    - B.1: Last-write-wins
    - B.2: Version vectors
    - B.3: CRDTs
    - B.4: Causal consistency
    - B.5: Session consistency
    → Winner: Branch B.3 (CRDTs)

    Level 2: "CRDT implementations" (5 options)
    - B.3.1: G-Counter
    - B.3.2: PN-Counter
    - B.3.3: LWW-Element-Set
    - B.3.4: OR-Set
    - B.3.5: RGA (Replicated Growable Array)
    → Winner: Branch B.3.4 (OR-Set)

    Level 3: "OR-Set optimizations" (5 variants)
    [Explore specific implementation strategies]
    → Winner: Branch B.3.4.2 (Tombstone compaction)
Deliverable: Recursive tree with minimum 4 levels explored
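The expand, evaluate, select loop of Step 4 can be sketched as a short recursive function that returns the winning path. This is a sketch under stated assumptions: `explore`, `toy_expand`, `toy_score`, and `PREFERRED` are hypothetical names, and the toy functions only stand in for the real exploration and scoring of Steps 2-3 so that the path from the level transition example can be reproduced.

```python
from typing import Callable


def explore(
    node: str,
    expand: Callable[[str], list[str]],
    score: Callable[[str], int],
    depth: int = 4,
    min_branches: int = 5,
) -> list[tuple[str, int]]:
    """Return the winning path below `node` as (branch_id, score) pairs."""
    if depth == 0:
        return []
    candidates = expand(node)
    if len(candidates) < min_branches:
        raise ValueError(f"expected {min_branches}+ branches under {node!r}")
    # Only the highest-scoring branch is expanded at the next level.
    winner = max(candidates, key=score)
    return [(winner, score(winner))] + explore(
        winner, expand, score, depth - 1, min_branches
    )


# Toy stand-ins (hypothetical): real expansion and scoring come from Steps 2-3.
def toy_expand(node: str) -> list[str]:
    if node == "":
        return ["A", "B", "C", "D", "E"]
    return [f"{node}.{i}" for i in range(1, 6)]


PREFERRED = {"B", "B.3", "B.3.4", "B.3.4.2"}


def toy_score(branch_id: str) -> int:
    return 90 if branch_id in PREFERRED else 60


path = explore("", toy_expand, toy_score)
print(" -> ".join(branch_id for branch_id, _ in path))
# prints: B -> B.3 -> B.3.4 -> B.3.4.2
```

A fixed `depth` is used here for brevity; in practice the recursion would end when the stopping criteria are met rather than at a preset level.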
Step 5: Final Synthesis
Objective: Synthesize insights into final recommendation
Actions:
- Trace winning path: Document Level 0 → Level 1 → Level 2 → Level 3+
- Extract key insights: What was learned at each level?
- Document pruned branches: Why were alternatives discarded?
- Calculate confidence: Final confidence score (see Bayesian formula below)
- State assumptions: What assumptions underpin the recommendation?
- Provide recommendation: Clear, actionable guidance
Synthesis Template:
    ## Tree of Thoughts Analysis Complete

    ### Winning Path
    - **Level 0**: [Chosen approach] (Score: X/100)
    - **Level 1**: [Refinement] (Score: X/100)
    - **Level 2**: [Sub-refinement] (Score: X/100)
    - **Level 3**: [Implementation] (Score: X/100)

    ### Key Insights
    1. [Insight from Level 0]
    2. [Insight from Level 1]
    3. [Insight from Level 2]
    4. [Insight from Level 3]

    ### Alternatives Considered
    - [Branch A]: Pruned because [reason]
    - [Branch C]: Pruned because [reason]
    - [Branch D]: Pruned because [reason]

    ### Final Confidence: [X]%
    **Justification**: [Why this confidence level based on exploration depth, evidence, and remaining uncertainties]

    ### Recommendation
    [Clear, actionable recommendation with next steps]

    ### Remaining Uncertainties
    - [Assumption 1]
    - [Assumption 2]
Deliverable: Comprehensive synthesis with traced path and confidence score
Stopping Criteria
Stop exploration when ANY of:
- ✅ Reached 4+ levels AND best branch confidence >80%
- ✅ Reached 6 levels (maximum recommended depth)
- ✅ All branches converge to same solution across multiple levels
- ✅ Diminishing returns (Level N scores similar to Level N-1)
Warning signs (don't stop yet):
- ❌ Only 2-3 levels explored
- ❌ Confidence <80% without clear reason
- ❌ Winner not clearly superior to alternatives
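The stopping criteria and warning signs above can be combined into a single check. The sketch below is one possible reading, not a definitive rule: it assumes the "only 2-3 levels explored" warning takes precedence, so convergence and diminishing returns only count once the minimum depth is reached. The name `should_stop` and the 2-point `plateau_margin` used to decide that scores are "similar" are hypothetical.

```python
def should_stop(
    level_scores: list[int],
    confidence: float,
    winners_converged: bool = False,
    min_levels: int = 4,
    max_levels: int = 6,
    confidence_threshold: float = 0.80,
    plateau_margin: int = 2,  # assumption: scores within 2 points count as "similar"
) -> tuple[bool, str]:
    """Decide whether to stop, given the winner's score at each explored level."""
    levels = len(level_scores)
    if levels >= max_levels:
        return True, "maximum depth reached"
    if levels < min_levels:
        return False, "fewer than the minimum number of levels explored"
    if confidence > confidence_threshold:
        return True, "minimum depth reached with high confidence"
    if winners_converged:
        return True, "branches converged on the same solution"
    if abs(level_scores[-1] - level_scores[-2]) <= plateau_margin:
        return True, "diminishing returns between levels"
    return False, "no stopping criterion met"


print(should_stop([85, 88, 91], confidence=0.88))
# (False, 'fewer than the minimum number of levels explored')
print(should_stop([82, 85, 88, 91], confidence=0.90))
# (True, 'minimum depth reached with high confidence')
```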
Bayesian Confidence Scoring
Purpose: Quantify confidence based on accumulated evidence
Formula:
    Prior Odds = P(correct) / (1 - P(correct))
    Likelihood Ratio = Evidence strength (from scores)
    Posterior Odds = Prior Odds × Likelihood Ratio
    Final Confidence = Posterior Odds / (1 + Posterior Odds)
Practical Calculation:
- Start with prior confidence: 50% (neutral), i.e. prior odds of 1.0
- For each evaluation criterion score (0-20):
  - Convert to likelihood ratio: LR = 0.25 + (score/20) * 3.75
  - Update odds: Odds = Odds × LR
- Convert back to probability: Conf = Odds / (1 + Odds)
- Cap at 95% (Bayesian humility for unknown unknowns)
Example:
- Branch scores: Novelty 18/20, Feasibility 19/20, Completeness 17/20, Confidence 18/20, Alignment 19/20
- Likelihood ratios: 3.62, 3.81, 3.44, 3.62, 3.81
- Final odds: 1.0 × 3.62 × 3.81 × 3.44 × 3.62 × 3.81 ≈ 657 (using unrounded ratios)
- Confidence: 657 / 658 ≈ 99.8% → Capped at 95%
Confidence Interpretation:
- 90-95%: Exceptional evidence, suitable for critical decisions
- 80-89%: High confidence, suitable for important decisions
- 70-79%: Medium confidence, consider additional validation
- 60-69%: Low confidence, recommend further investigation
- <60%: Very low confidence, gather more information
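The practical calculation and the interpretation bands above translate directly into code. The following is a minimal sketch using the constants given in this section (50% prior, `LR = 0.25 + (score/20) * 3.75`, 95% cap); the function names `likelihood_ratio`, `bayesian_confidence`, and `interpret` are hypothetical.

```python
def likelihood_ratio(score: int) -> float:
    """Convert a 0-20 criterion score to a likelihood ratio (0.25 to 4.0)."""
    if not 0 <= score <= 20:
        raise ValueError("score must be between 0 and 20")
    return 0.25 + (score / 20) * 3.75


def bayesian_confidence(scores: list[int], prior: float = 0.5, cap: float = 0.95) -> float:
    """Update prior odds with one likelihood ratio per score, then cap the result."""
    odds = prior / (1 - prior)
    for score in scores:
        odds *= likelihood_ratio(score)
    return min(odds / (1 + odds), cap)


def interpret(confidence: float) -> str:
    """Map a final confidence (0.0-1.0) to the interpretation bands."""
    if confidence >= 0.90:
        return "Exceptional evidence, suitable for critical decisions"
    if confidence >= 0.80:
        return "High confidence, suitable for important decisions"
    if confidence >= 0.70:
        return "Medium confidence, consider additional validation"
    if confidence >= 0.60:
        return "Low confidence, recommend further investigation"
    return "Very low confidence, gather more information"


# Scores from the example: Novelty, Feasibility, Completeness, Confidence, Alignment.
final = bayesian_confidence([18, 19, 17, 18, 19])
print(final, interpret(final))
# prints: 0.95 Exceptional evidence, suitable for critical decisions
```

Two properties of this formula are worth keeping in mind when reading the result. A score of 4/20 gives a likelihood ratio of 1, so only scores below 4 lower the confidence. With a 50% prior, five scores of 10/20 are already enough to reach the 95% cap, so the capped value alone does little to separate strong branches from middling ones.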
Self-Critique Checklist
After applying ToT methodology, verify:
- [ ] Branch Diversity: Are all 5+ branches fundamentally different (not variations)?
- [ ] Self-Reflection Quality: Does each branch have genuine self-reflection (not boilerplate)?
- [ ] Evaluation Rigor: Did I systematically score all 5 criteria for each branch?
- [ ] Depth Achievement: Did I reach minimum 4 levels of exploration?
- [ ] Confidence Validity: Is final confidence score justified by exploration depth?
- [ ] Pruning Rationale: Can I explain why each non-selected branch was discarded?
- [ ] Path Traceability: Can I clearly trace the winning path from root to leaf?
- [ ] Synthesis Clarity: Does final output provide actionable recommendation?
- [ ] Stopping Appropriateness: Did I stop for valid reasons per criteria?
Common Mistakes to Avoid
- Too Few Branches: Using <5 branches reduces exploration quality
- Variation vs Diversity: Creating 5 variations of same approach instead of 5 different approaches
- Shallow Depth: Stopping at 1-2 levels instead of minimum 4
- Biased Evaluation: Favoring familiar approaches without systematic scoring
- Missing Self-Reflection: Skipping confidence assessment in branches
- Premature Convergence: Selecting winner before thorough evaluation
- Over-Recursion: Going beyond 6 levels without clear benefit
- Poor Synthesis: Not clearly documenting winning path and rationale
Reference Documentation
Detailed Templates: ~/.claude/skills/tree-of-thoughts/references/tree-of-thoughts-patterns.md
Includes:
- Branch exploration template (detailed prompts)
- Self-reflection rubric (confidence scoring guide)
- Evaluation matrix (scoring examples per criterion)
- Level transition logic (when/how to deepen)
- Edge case handling (convergence, insufficient diversity)
Quick Start Examples
Example 1: Simple Decision (3 levels, abbreviated for illustration; a full run continues to the 4-level minimum)
Problem: Choose between Redis, Memcached, or Hazelcast for caching
Level 0 (3 branches):
- Branch A: Redis (rich data structures)
- Branch B: Memcached (pure speed)
- Branch C: Hazelcast (distributed computing)
→ Winner: Branch A (Redis) - 85/100
Level 1 (Redis deployment options):
- A.1: Single instance
- A.2: Sentinel (high availability)
- A.3: Cluster (horizontal scaling)
- A.4: Redis Enterprise
- A.5: Managed service (AWS ElastiCache)
→ Winner: Branch A.3 (Cluster) - 88/100
Level 2 (Cluster configuration):
- A.3.1: 3 masters, no replicas
- A.3.2: 3 masters, 3 replicas
- A.3.3: 6 masters, 6 replicas
- A.3.4: Auto-scaling cluster
- A.3.5: Hybrid (critical data replicated)
→ Winner: Branch A.3.2 (3+3) - 91/100
Confidence: 88% (3 levels, clear winner at each level)
Example 2: Complex Architecture (5 levels)
Problem: Design microservices communication strategy
    Level 0: REST, gRPC, Message Queue, Event Sourcing, GraphQL (5 approaches)
    Level 1: [Winner] expanded into 5 sub-approaches
    Level 2: [Winner] expanded into 5 implementation variants
    Level 3: [Winner] expanded into 5 technology choices
    Level 4: [Winner] expanded into 5 deployment patterns
Confidence: 93% (5 levels, 25+ branches explored total)
Summary
Tree of Thoughts is a systematic methodology for exploring complex problem spaces through:
- Parallel branching (5+ approaches per level)
- Self-reflection (confidence scoring for each branch)
- Rigorous evaluation (5 criteria, 0-100 scoring)
- Recursive depth (minimum 4 levels)
- Bayesian confidence (evidence-based scoring)
Use it for strategic decisions, architectural choices, and optimization problems where systematic exploration yields better outcomes than intuition alone.
Remember: Quality over speed. ToT trades time for rigor. The goal is high-confidence optimal solutions, not quick answers.