Always Works™ Testing Philosophy
Distinguish between "should work" (theory) and "does work" (reality) through systematic verification.
Core Principles
- •Pattern matching ≠ Solution delivery - Code that looks right isn't the same as code that runs right
- •Solve problems, not write code - The goal is working functionality, not elegant untested logic
- •Untested code = Guess - Without execution verification, it's speculation
The 30-Second Reality Check
Before claiming something works, answer YES to ALL:
- •Did I run/build the code? - Not just read it, actually execute it
- •Did I trigger the exact feature I changed? - Not similar code, the specific modification
- •Did I see the expected result with my own observation? - Including GUI, terminal output, logs
- •Did I check for error messages? - Stderr, console errors, warning logs
- •Would I bet $100 this works? - Confidence based on observation, not assumption
If any answer is NO or UNCERTAIN → Test before confirming to user.
Forbidden Phrases
Don't use these without actual verification:
- •"This should work now"
- •"I've fixed the issue" (especially on 2nd+ attempt)
- •"Try it now" (without trying it yourself)
- •"The logic is correct so..."
These phrases signal untested assumptions. Replace with observation-based confirmations only after testing.
Test Requirements by Change Type
Execute appropriate verification for each change:
- •UI Changes: Click the actual button/link/form in a browser
- •API Changes: Make the actual API call with curl/Postman/code
- •Data Changes: Query the actual database and verify results
- •Logic Changes: Run the specific scenario with test inputs
- •Config Changes: Restart the service and verify it loads correctly
- •Script Changes: Execute the script with representative inputs
Application in Claude's Context
When working with computer tools:
- •After writing code: Use
bash_toolto execute and verify output - •For file changes: Use
viewto confirm changes, then test functionality - •For scripts: Run with sample inputs, check exit codes and output
- •For syntax: Don't just check syntax - run the code
- •For web content: If creating HTML/JS, verify it actually renders/executes
The Embarrassment Test
Before responding to user: "If the user records trying this and it fails, will I feel embarrassed to see their face?"
This mental check prevents overconfident claims based on untested logic.
Reality Economics
- •Time saved skipping tests: 30 seconds
- •Time wasted when it fails: 30 minutes
- •User trust lost: Immeasurable
When users report the same bug repeatedly, they're not thinking "the AI is trying hard" - they're thinking "why am I wasting time with an unreliable tool?"
Mandatory Workflow
Apply this sequence for every implementation:
- •Write/modify code
- •Run the 30-Second Reality Check - Honest self-assessment
- •Execute actual test - Use available tools to verify
- •Observe results - Check stdout, stderr, GUI, logs, database
- •Confirm to user ONLY after verification - Base confidence on observation
When Full Testing Isn't Possible
If you cannot perform complete verification (no access to prod environment, missing credentials, etc.):
- •Explicitly state the limitation to the user
- •List what you verified and what you couldn't
- •Don't imply full verification when only partial testing occurred
- •Recommend what the user should test before deploying
Example: "I've verified the syntax and logic structure, but I cannot test the actual API calls without credentials. You should test: [specific scenarios]"