<skill_overview> Claiming work is complete without verification is dishonesty, not efficiency. Evidence before claims, always. </skill_overview>
<rigidity_level> LOW FREEDOM - NO exceptions. Run verification command, read output, THEN make claim.
No shortcuts. No "should work". No partial verification. Run it, prove it. </rigidity_level>
<quick_reference>
| Claim | Verification Required | Not Sufficient |
|---|---|---|
| Tests pass | Run full test command, see 0 failures | Previous run, "should pass" |
| Build succeeds | Run build, see exit 0 | Linter passing |
| Bug fixed | Test original symptom, passes | Code changed |
| Task complete | Check all success criteria, run verifications | "Implemented bd-3" |
| Epic complete | bd list --status open --parent bd-1 shows 0 | "All tasks done" |
Iron Law: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
Use test-runner agent for: Tests, pre-commit hooks, commits (keeps verbose output out of context)
</quick_reference>
<when_to_use> ALWAYS before:
- •Any success/completion claim
- •Any expression of satisfaction
- •Committing, PR creation, task completion
- •Moving to next task
- •ANY communication suggesting completion/correctness
Red flags you need this:
- •Using "should", "probably", "seems to"
- •Expressing satisfaction before verification ("Great!", "Perfect!")
- •About to commit/push without verification
- •Trusting agent success reports
- •Relying on partial verification </when_to_use>
<the_process>
The Gate Function
Before claiming ANY status:
1. Identify
What command proves this claim?
2. Run
Execute the full command (fresh, complete).
For tests/hooks/commits: Use hyperpowers:test-runner agent
- •Agent captures verbose output in its context
- •Returns only summary + failures
- •Prevents context pollution
For other commands: Run directly and capture output
3. Read
Full output, check exit code, count failures.
4. Verify
Does output confirm the claim?
- •If NO: State actual status with evidence
- •If YES: State claim with evidence
5. Only Then
Make the claim.
Skip any step = lying, not verifying
</the_process>
<examples> <example> <scenario>Developer claims tests pass without running them</scenario> <code> Developer modifies authentication logic.Developer thinks: "This fix is straightforward, tests should pass now"
Developer writes: "Great! All tests passing. The bug is fixed."
[No test command run, no output shown] </code>
<why_it_fails> No evidence:
- •"Should pass" ≠ evidence
- •Confidence ≠ verification
- •Might have broken other tests
- •Might not have fixed the bug
Why dangerous:
- •Broken code ships
- •Trust broken with partner
- •Wastes time on false completion </why_it_fails>
# Dispatch test-runner agent "Run: cargo test"
Agent returns:
Summary: 33 passed, 1 failed
Failures:
- test_login_with_expired_token: assertion failed
Expected: Err(AuthError::TokenExpired)
Got: Ok(User { id: 123 })
State actual status:
Tests: 33 passed, 1 failed. Failure: test_login_with_expired_token still fails. The fix didn't handle expired tokens. Investigating...
Result: Accurate status, bug caught before committing. </correction> </example>
<example> <scenario>Developer closes bd task without verifying success criteria</scenario> <code> bd-3 success criteria: - [ ] All functions fully implemented (no stubs, no TODOs) - [ ] Tests written and passing: test_generate_token, test_verify_token - [ ] Pre-commit hooks passDeveloper implements token functions.
Developer thinks: "I implemented everything, task complete"
Runs: bd close bd-3
[No verification commands run] </code>
<why_it_fails> Success criteria not verified:
- •Might have TODO comments left
- •Specific tests not run
- •Pre-commit hooks not checked
Result: Incomplete task closed, issues discovered later. </why_it_fails>
<correction> **Verify each criterion:**# 1. Check for TODOs rg "TODO|FIXME|todo!|unimplemented!" src/ # Output: [nothing found] ✅ # 2. Run specific tests cargo test test_generate_token test_verify_token # Output: test result: ok. 2 passed ✅ # 3. Run pre-commit hooks (via test-runner agent) "Run: cargo fmt --check && cargo clippy && cargo test"
Agent returns:
All checks passed: - cargo fmt: no formatting issues - cargo clippy: no warnings - cargo test: 34/34 passed
THEN close task:
bd close bd-3
Result: Task actually complete, all criteria met with evidence. </correction> </example>
<example> <scenario>Developer claims build works based on linter passing</scenario> <code> Developer fixes TypeScript errors flagged by linter.Runs: npm run lint Output: ✅ No issues found
Developer writes: "Build passes! Ready to commit."
[No build command run] </code>
<why_it_fails> Linter ≠ compiler:
- •Linter checks style and simple errors
- •Doesn't verify imports, types, compilation
- •Build might still fail
Why it happens: Partial verification, extrapolation. </why_it_fails>
<correction> **Run actual build:**npm run build
Output:
ERROR in src/auth/service.ts:23:15 TS2304: Cannot find name 'validateToken' Did you mean 'verifyToken'?
State actual status:
Linter passes, but build fails. Error: validateToken not defined (should be verifyToken). Fixing...
After fix, run build again:
npm run build # Output: ✓ built in 1.2s
Now can claim:
Build passes. [Evidence: npm run build exit 0] Ready to commit.
Result: Actual build status verified, error caught. </correction> </example>
</examples><critical_rules>
Rules That Have No Exceptions
- •
No claims without fresh verification → Run command, see output, THEN claim
- •"Should work" = forbidden
- •"Looks correct" = forbidden
- •Previous run ≠ fresh verification
- •
Use test-runner agent for verbose commands → Tests, hooks, commits
- •Prevents context pollution
- •Returns summary + failures only
- •Never run
git commitorcargo testdirectly if output is verbose
- •
Verify ALL success criteria → Not just "tests pass"
- •Read each criterion from bd task
- •Run verification for each
- •Check all pass before closing
- •
Evidence in every claim → Show the output
- •Not: "Tests pass"
- •Yes: "Tests pass [Ran: cargo test, Output: 34/34 passed]"
Common Excuses
All of these mean: Stop, run verification:
- •"Should work now"
- •"I'm confident this fixes it"
- •"Just this once"
- •"Linter passed" (when claiming build works)
- •"Agent said success" (without independent verification)
- •"I'm tired" (exhaustion ≠ excuse)
- •"Partial check is enough"
Pre-Commit Hook Assumption
If your project uses pre-commit hooks enforcing tests:
- •All test failures are from your current changes
- •Never check if errors were "pre-existing"
- •Don't run
git checkout <sha> && pytestto verify - •Pre-commit hooks guarantee previous commit passed
- •Just fix the error directly
</critical_rules>
<verification_checklist>
Before claiming tests pass:
- • Ran full test command (not partial)
- • Saw output showing 0 failures
- • Used test-runner agent if output verbose
Before claiming build succeeds:
- • Ran build command (not just linter)
- • Saw exit code 0
- • Checked for compilation errors
Before closing bd task:
- • Re-read success criteria from bd task
- • Ran verification for each criterion
- • Saw evidence all pass
- • THEN closed task
Before closing bd epic:
- • Ran
bd list --status open --parent bd-1 - • Saw 0 open tasks
- • Ran
bd dep tree bd-1 - • Confirmed all tasks closed
- • THEN closed epic
</verification_checklist>
<integration>This skill calls:
- •test-runner (for verbose verification commands)
This skill is called by:
- •test-driven-development (verify tests pass/fail)
- •executing-plans (verify task success criteria)
- •refactoring-safely (verify tests still pass)
- •ALL skills before completion claims
Agents used:
- •hyperpowers:test-runner (run tests, hooks, commits without output pollution)
When stuck:
- •Tempted to say "should work" → Run the verification
- •Agent reports success → Check VCS diff, verify independently
- •Partial verification → Run complete command
- •Tired and want to finish → Run verification anyway, no exceptions
Verification patterns:
- •Tests: Use test-runner agent, check 0 failures
- •Build: Run build command, check exit 0
- •bd task: Verify each success criterion
- •bd epic: Check all tasks closed with bd list/dep tree