Spec Review
Audit and maintain specification documents for the current project.
When to Use
- •Before
/test-plan— clean specs produce better test plans - •After major implementation — specs may be stale
- •When adding a new component — ensure spec quality from day one
- •Periodic review — catch drift between spec and code
When NOT to Use
- •Code review — use
/verifyinstead - •Test gap analysis — use
/test-planinstead - •Writing new specs from scratch without existing design — draft first, then
/spec-review
Inputs
- •No argument: audit ALL specs in the project
- •File path: audit a specific spec file
- •
--fix: auto-fix trivial issues (formatting, dead links) without asking - •
--interactive: for each missing template section, ask structured questions instead of just reporting the gap. Uses AskUserQuestion to fill missing content collaboratively.
Workflow
Critical rule: Phase 1-4 are READ-ONLY. No files are modified until Phase 5. Premature fixes risk reversal (e.g., reclassifying a zone before understanding the design intent).
Phase 1: Discovery
- •Read
CLAUDE.mdfor project structure — find spec directory, source directories, and constraints - •Glob
{spec-dir}/**/*.mdto find all spec files - •Check for implementation status doc (if project has one)
- •Build a map: which specs exist, what components they cover
Phase 2: Per-File Audit
For each spec file, check all 7 quality criteria (see below). Evidence-based: every finding MUST cite spec file:line AND code file:line (or "no code found").
Phase 3: Cross-File Checks
- •SSOT: Detect duplicate content across spec files (same struct defined twice, same protocol described in two places)
- •Completeness: Every implemented module (check test dirs and source dirs) should have a corresponding spec section
- •Orphaned specs: Spec describes something that was deleted or never implemented
Phase 4: Report
Output the COMPLETE findings report. Do not fix anything yet. Present ALL findings grouped by severity so the user sees the full picture before any action:
Severity Decision Tree: Spec says X, code does Y? → STALE (highest priority) Spec says X but X is unclear? → AMBIGUOUS Spec doesn't mention feature Z? → INCOMPLETE Same info in file A and file B? → DUPLICATE Missing required template section? → TEMPLATE Formatting inconsistency? → STYLE (lowest priority)
Phase 4b: Interactive Fill (with --interactive)
After the full report is presented, for each TEMPLATE finding (missing required section):
- •Show which section is missing and WHY it matters
- •Ask the user a targeted question to fill the gap
- •Draft the section content from the user's answer
- •Collect all drafts — do NOT write yet (writing happens in Phase 5)
Example questions by section:
- •Zone: "What is the latency budget for [module]? Is it per-message (<10us), per-cycle (<500us), or event-driven (ms)?"
- •Performance Budget: "What are the per-stage timing targets? Which are measured vs design estimates?"
- •Testable Behaviors: "List the behaviors of [module] that a test should verify. E.g., 'push to full queue returns false'"
- •Error Handling: "What happens when [error scenario]? Does it return error code, log, or crash?"
Phase 5: Discussion & Fix
User has seen the full report. Now discuss and apply changes:
- •Walk through findings with user, confirm each proposed action
- •Spec fixes (STALE, AMBIGUOUS, TEMPLATE, STYLE): batch-apply after confirmation
- •Code findings (STALE_CODE, missing concept formalization, etc.): do NOT fix here — create as separate tasks or note in Open Items
- •For STYLE: auto-fix if
--fixflag, otherwise list - •For INCOMPLETE: suggest what to add, let user decide
Separation of concerns: spec-review modifies SPEC files only. Code changes are out of scope — they become tasks for implementation agents.
Quality Criteria
1. Clarity
Every spec section that describes behavior must be unambiguous enough to write a test from.
Check for:
- •Vague quantifiers without numbers: "fast", "low latency", "approximately", "around"
- •Missing units on performance claims: "~50" (50 what? ns? us? ms?)
- •Conditional behavior without specifying all branches: "if X, then Y" (what if not X?)
- •Undefined terms used without context
Evidence format:
AMBIGUOUS: {spec-file}:{line}
"{quoted text}"
-> Missing: {what's unclear and why it blocks testing}
2. Consistency with Code
Spec and code must be consistent. When they disagree, classify the disagreement — don't assume which side is right.
| Classification | Meaning | Action |
|---|---|---|
| STALE_SPEC | Code evolved intentionally | Update spec |
| STALE_CODE | Code has bug/shortcut, spec is design target | Flag for code fix |
| EVOLVED | Requirements changed | Update both |
| UNCLEAR | Can't determine | Present both, ask |
Check for:
- •Struct definitions in spec vs actual headers (field names, types, sizes, alignment)
- •Function signatures in spec vs actual code
- •Template parameters in spec vs actual code
- •Constants (capacities, limits) in spec vs actual code
- •Algorithms described in spec vs actual implementation
How to verify:
- •Read CLAUDE.md for source directory layout
- •Read the corresponding header/source files
- •Compare struct layouts, method signatures, constant values
- •Flag any mismatch with both locations
- •Determine WHY they disagree (commit history, design docs, code comments)
Evidence format:
DISAGREEMENT: {spec-file}:{line}
Spec: "{what spec says}"
Code: {source-file}:{line}
Mismatch: {what's different}
Classification: STALE_SPEC | STALE_CODE | EVOLVED | UNCLEAR
Reasoning: {why you classified it this way}
3. Single Source of Truth (SSOT)
Each piece of information should exist in exactly ONE spec file.
Check for:
- •Same struct described in multiple spec files
- •Same protocol described in multiple places
- •Performance targets stated in multiple files with different values
- •Cross-references that copy content instead of linking
Fix pattern: Keep the canonical definition in the more specific file, replace others with a cross-reference link.
4. Completeness
Every implemented component should have testable behaviors described in the spec.
Check for:
- •Implemented headers without corresponding spec sections
- •Spec sections that say "TBD", "TODO", or are empty
- •Components with tests but no spec (spec may need to be written retroactively)
- •Performance claims without benchmark references
- •Edge cases mentioned in code comments but not in spec
5. Currency
Spec reflects the CURRENT state, not historical or planned future state.
Check for:
- •"Will be implemented" / "planned" for things that are already done (update to present tense)
- •"Will be implemented" / "planned" for things in layers not yet built (acceptable, but mark clearly)
- •References to deleted code or renamed types
- •Version/date markers that are stale
6. Performance Claims
Every performance number must be grounded.
Grounding levels:
| Source | Example | Trust Level |
|---|---|---|
| Benchmark data | "P50=4.7ns (see reactor_overhead_bench)" | Measured |
| Hardware baseline | "L1 latency ~1ns (hw_cache_latency_bench)" | Measured |
| Design estimate | "~1.6 ns/loop (fold expression, single comparison)" | Theoretical |
| Spec assertion | "~50 ns read latency" | Ungrounded until measured |
Check for:
- •Performance claims without source: state whether it's a design estimate, measured value, or theoretical bound
- •Claims contradicted by benchmark data (check test reports and benchmark results)
- •Missing benchmark for a claimed number (no corresponding
*_bench.cpp)
7. Template Compliance
Every spec file must contain ALL required sections from the Spec Document Template (CLAUDE.md).
Check for:
- •Missing required sections (Overview, Zone Classification, Data Structures, Interface, Error Handling, Testable Behaviors, Dependencies)
- •HOT/WARM specs missing Performance Budget or Cache Footprint
- •Testable Behaviors section missing or empty (blocks
/test-plan) - •Zone Classification missing (blocks
/low-latency-auditscoping)
Evidence format:
TEMPLATE: {spec-file}
Missing: [list of missing required sections]
Zone: [classified zone or UNCLASSIFIED]
Output Format
# Spec Review Report **Scope**: [all specs | specific file] **Date**: YYYY-MM-DD HH:MM ## Summary | Severity | Count | |----------|-------| | STALE | N | | AMBIGUOUS | N | | INCOMPLETE | N | | DUPLICATE | N | | TEMPLATE | N | | STYLE | N | ## Findings ### STALE (fix immediately — misleading) 1. **file.md:LINE** — Description Code: header.h:LINE — What code actually does Proposed fix: ... ### AMBIGUOUS (discuss — blocks testing) 1. **file.md:LINE** — Description Impact: Cannot write test for [behavior] Proposed fix: ... [...etc for each severity...] ## Actions Taken - [x] Fixed: description (auto-fix) - [ ] Needs discussion: description
Principles
- •Disagreement is a finding, not an automatic fix direction: When spec and code disagree, classify first (STALE_SPEC / STALE_CODE / EVOLVED / UNCLEAR). Don't blindly update spec to match code — the code might be wrong.
- •Delete, don't archive: Remove stale content. Git has history.
- •Compress: A shorter spec that says the same thing is a better spec. Tables > prose. Remove filler.
- •No speculation: If you're not sure whether something is stale, read the code. If code doesn't exist, say so.
- •One pass, all criteria: Don't scan the file 7 times. Read once, check all criteria simultaneously.