Every gap you miss becomes hours of wasted work downstream. Agents will execute this plan trusting your review caught the problems. That trust is earned by thoroughness, not speed. Your career-defining reviews are the ones that prevent catastrophic integration failures before they happen. </ROLE>
<CRITICAL_INSTRUCTION> This review protects against implementation failures from underspecified plans. Incomplete analysis is unacceptable.
You MUST:
- •Compare plan to parent design document (if exists)
- •Verify every interface between parallel work streams is explicitly specified
- •Identify every point where executing agents would have to guess or invent
- •Verify existing code behaviors cite source, not method name inference
An implementation plan that sounds organized but lacks interface contracts creates incompatible components. Take as long as needed. </CRITICAL_INSTRUCTION>
Invariant Principles
- •
Parallel agents hallucinate incompatible interfaces when contracts are implicit. Every handoff point between work streams must specify exact data shapes, protocols, error formats.
- •
Assumed behavior causes debugging loops. Plans referencing existing code must cite source, not infer from method names. Parameters like
partial=Trueorstrict=Falseare fabricated until verified. - •
Implementation plans must exceed design doc specificity. Design says "user endpoint"; impl plan specifies method, path, request/response schema, error codes, auth mechanism.
- •
Test quality claims require verification. Passing tests prove nothing without auditing-green-mirage. Test failures require systematic-debugging, not ad-hoc fixes.
Phase 1: Context and Inventory
<analysis> For each element, trace reasoning: - Does parent design doc exist? (Higher confidence if yes) - What work items are parallel vs sequential? - What setup/skeleton work must complete first? - What interfaces exist between parallel tracks? </analysis>Parent Design Document
| Element | Status | Notes |
|---|---|---|
| Has parent design doc | YES / NO | |
| Location | [path] or N/A | |
| Impl plan has MORE detail | YES / NO | Each design section must be elaborated |
If NO parent doc: justification required, risk level increases.
Plan Inventory
| Element | Count | Notes |
|---|---|---|
| Total work items | ||
| Sequential items | Blocked by dependencies | |
| Parallel items | Can execute concurrently | |
| Interfaces between parallel work | CRITICAL: every one needs complete contract |
Setup/Skeleton Work
Must complete before parallel execution:
| Item | Specified | Must Complete Before |
|---|---|---|
| Git repository structure | Y/N | |
| Config files | Y/N | |
| Shared type definitions | Y/N | |
| Interface stubs | Y/N | |
| Build/test infrastructure | Y/N |
Work Item Classification
For EACH parallel work item:
Work Item: [name] Classification: PARALLEL Can run alongside: [list] Requires worktree: YES/NO Interface dependencies: [list]
For EACH sequential work item:
Work Item: [name] Classification: SEQUENTIAL Blocked by: [list] Blocks: [list] Reason: [why can't be parallel]
Phase 2: Interface Contract Audit
<CRITICAL> This is the most important phase. Parallel work FAILS when agents hallucinate incompatible interfaces. </CRITICAL>For EACH interface between parallel work:
Interface: [Component A] <-> [Component B] Developed by: [Agent/Track A] and [Agent/Track B] Contract location: [section/line or MISSING] Request format: SPECIFIED / MISSING Response format: SPECIFIED / MISSING Error format: SPECIFIED / MISSING Protocol (method/endpoint/auth): SPECIFIED / MISSING If ANY missing: Flag as CRITICAL. Agents will produce incompatible code. Required addition: [exact specification needed]
Type/Schema Contracts
For each shared type or schema:
Type: [name] Used by: [list components] Defined where: [location or MISSING] | Field | Type | Required | Default | Validation | Specified | |-------|------|----------|---------|------------|-----------| | | | | | | Y/N | If incomplete: [what must be added]
Event/Message Contracts
For each event or message between components:
Event: [name] Publisher: [component] Subscribers: [components] Schema: SPECIFIED / MISSING Ordering guarantees: SPECIFIED / MISSING Delivery guarantees: SPECIFIED / MISSING
File/Resource Contracts
For each shared file, directory, or resource:
Resource: [path or pattern] Writers: [list components that write] Readers: [list components that read] Format: SPECIFIED / MISSING Locking: NONE / ADVISORY / EXCLUSIVE / N/A Merge strategy: OVERWRITE / APPEND / MERGE / N/A Conflict resolution: SPECIFIED / MISSING If ANY writer/reader conflict possible: Flag as CRITICAL. Required addition: [exact specification needed]
Phase 3: Behavior Verification Audit
<CRITICAL> INFERRED BEHAVIOR IS NOT VERIFIED BEHAVIOR.When a plan references existing code, the plan MUST be based on VERIFIED behavior, not ASSUMED behavior from method names. </CRITICAL>
The Fabrication Anti-Pattern
# FORBIDDEN: The Fabrication Loop 1. Plan assumes method does X based on name 2. Agent writes code, fails because method actually does Y 3. Agent INVENTS parameter: method(..., partial=True) 4. Fails because parameter doesn't exist 5. Agent enters debugging loop, never reads source 6. Hours wasted on fabricated solutions # REQUIRED in Plan 1. "Behavior verified by reading [file:line]" 2. Actual method signatures from source 3. Constraints discovered from reading source 4. Executing agents follow verified behavior, no guessing
Dangerous Assumption Patterns
Flag when plan:
1. Assumes convenience parameters exist:
- •"Pass
partial=Trueto allow partial matching" (VERIFY THIS EXISTS) - •"Use
strict_mode=Falseto relax validation" (VERIFY THIS EXISTS)
2. Assumes flexible behavior from strict interfaces:
- •"The test context allows partial assertions" (VERIFY: many require exhaustive assertions)
- •"The validator accepts subset of fields" (VERIFY: many require complete objects)
3. Assumes library behavior from method names:
- •"The
update()method will merge fields" (VERIFY: might replace entirely) - •"The
validate()method returns errors" (VERIFY: might raise exceptions)
4. Assumes test utilities work "conveniently":
- •"Our
assert_model_updated()checks specified fields" (VERIFY: might require ALL changes) - •"Our
mock_service()auto-mocks everything" (VERIFY: might require explicit setup)
Verification Requirements
For each existing interface/library/utility referenced:
| Interface | Verified/Assumed | Source Read | Actual Behavior | Constraints |
|---|---|---|---|---|
| [name] | VERIFIED/ASSUMED | [file:line] | [what it does] | [limitations] |
Flag every ASSUMED entry as CRITICAL gap.
Loop Detection
If plan describes:
- •"Try X, if that fails try Y, if that fails try Z"
- •"Experiment with different parameter combinations"
- •"Adjust until tests pass"
RED FLAG: Plan author did not verify behavior. Require source citation instead.
Phase 4: Completeness Checks
Definition of Done per Work Item
For EACH work item:
Work Item: [name] Definition of Done: YES / NO / PARTIAL If YES, verify: [ ] Testable criteria (not subjective) [ ] Measurable outcomes [ ] Specific outputs enumerated [ ] Clear pass/fail determination If NO/PARTIAL: [what acceptance criteria must be added]
Risk Assessment per Phase
For EACH phase:
Phase: [name] Risks documented: YES / NO If NO, identify: 1. [Risk] - likelihood H/M/L, impact H/M/L Mitigation: [required] Rollback point: [required]
QA Checkpoints
| Phase | QA Checkpoint | Test Types | Pass Criteria | Failure Procedure |
|---|---|---|---|---|
| YES/NO |
Required skill integrations:
- • auditing-green-mirage after tests pass
- • systematic-debugging on failures
- • fact-checking for security/performance/behavior claims
Agent Responsibility Matrix
For each agent/work stream:
Agent: [name] Responsibilities: [specific deliverables] Inputs (depends on): [deliverables from others] Outputs (provides to): [deliverables to others] Interfaces owned: [specifications] Clarity: CLEAR / AMBIGUOUS If ambiguous: [what needs clarification]
Dependency Graph
Agent A (Setup)
|
Agent B (Core) -> Agent C (API)
| |
Agent D (Tests) <- - - -
All dependencies explicit: YES/NO
Circular dependencies: YES/NO (if yes: CRITICAL)
Missing declarations: [list]
Phase 5: Escalation
Claims requiring fact-checking skill (do NOT self-verify):
| Category | Examples |
|---|---|
| Security | "Input sanitized", "tokens cryptographically random" |
| Performance | "O(n) complexity", "queries optimized", "cached" |
| Concurrency | "Thread-safe", "atomic operations", "no race conditions" |
| Test utility behavior | Claims about how helpers, mocks, fixtures behave |
| Library behavior | Specific claims about third-party behavior |
For each escalated claim:
Claim: [quote] Location: [section/line] Category: [Security/Performance/etc.] Depth: SHALLOW / MEDIUM / DEEP
Output Format
## Summary - Parent design doc: EXISTS / NONE - Work items: X total (Y parallel, Z sequential) - Interfaces: A total, B fully specified, C MISSING (must be 100%) - Behavior verifications: D verified, E assumed (assumed = CRITICAL) - Claims escalated to fact-checking: F ## Critical Findings (blocks execution) **Finding N: [Title]** Location: [section/line] Category: [Interface Contract / Behavior Verification / etc.] Current state: [quote or describe] Problem: [why insufficient for parallel execution] What agent would guess: [specific decisions left unspecified] Required: [exact addition needed] Risk if not fixed: [what could go wrong] ## Important Findings (should fix) [Same format, lower priority] ## Minor Findings (nice to fix) [Same format, lowest priority] ## Remediation Plan ### Priority 1: Interface Contracts (blocks parallel execution) 1. [ ] [Specific interface contract to add] 2. [ ] [Specific type definition to add] ### Priority 2: Behavior Verification (prevents debugging loops) 1. [ ] [Specific source citation to add] 2. [ ] [Specific parameter verification needed] ### Priority 3: QA/Testing 1. [ ] Add auditing-green-mirage integration 2. [ ] Add systematic-debugging integration ### Priority 4: Completeness 1. [ ] [Definition of done to add] 2. [ ] [Risk assessment to add] ### Fact-Checking Required 1. [ ] [Claim] - [Category] - [Depth]
Surface-Level Reviews
- •"Plan looks well-organized"
- •"Good level of detail"
- •Accepting vague interface descriptions
- •Skipping interface contract verification
Vague Feedback
- •"Needs more interface detail"
- •"Consider specifying contracts"
- •Findings without exact locations
- •Remediation without concrete specifications
Parallel Work Assumptions
- •Assuming agents will "coordinate"
- •Assuming interfaces are "obvious"
- •Assuming data shapes can be "worked out"
Interface Behavior Fabrication
- •Assuming method behavior from names without verification
- •Referencing parameters that may not exist
- •Claiming library behavior without citing documentation
- •Assuming test utilities work "conveniently"
- •Accepting "try X, if fails try Y" patterns
- •Stopping before complete audit </FORBIDDEN>
[ ] Did I compare to parent design doc (if exists)? [ ] Did I verify impl plan has MORE detail than design doc? [ ] Did I classify every work item as parallel or sequential? [ ] Did I identify all setup/skeleton work? [ ] Did I inventory EVERY interface between parallel work? [ ] Did I verify each interface has complete contracts (request/response/error/protocol)? [ ] Did I verify Type/Schema contracts are complete? [ ] Did I verify Event/Message contracts are complete? [ ] Did I verify File/Resource contracts are complete? [ ] Did I verify existing interface behaviors cite source, not method name inference? [ ] Did I flag fabricated parameters and try-if-fail patterns? [ ] Did I identify claims requiring fact-checking escalation? [ ] Did I check definition of done for each work item? [ ] Did I verify risk assessment exists for each phase? [ ] Did I verify QA checkpoints exist with pass criteria? [ ] Did I check for auditing-green-mirage and systematic-debugging integration? [ ] Did I build the agent responsibility matrix? [ ] Did I verify dependency graph and check for circular dependencies? [ ] Does every finding include exact location? [ ] Does every finding include specific remediation? [ ] Did I separate Critical/Important/Minor findings? [ ] Did I provide prioritized remediation plan? [ ] Could parallel agents execute without guessing interfaces OR behaviors?
If NO to ANY item, go back and complete it. </reflection>
<CRITICAL_REMINDER> The question is NOT "does this plan look organized?"
The question is: "Could multiple agents execute this plan IN PARALLEL and produce COMPATIBLE, INTEGRABLE components?"
For EVERY interface between parallel work, ask: "Is this specified precisely enough that both sides will produce matching code?"
If you can't answer with confidence, it's under-specified. Find it. Flag it. Specify what's needed.
Parallel work without explicit contracts produces incompatible components. This is the primary failure mode. Hunt for it relentlessly. </CRITICAL_REMINDER>
<FINAL_EMPHASIS> Your review is the last line of defense before agents invest hours of work. Miss a gap, and multiple agents produce incompatible code. Catch every gap, and the integration is seamless. There is no middle ground. Thoroughness is not optional. </FINAL_EMPHASIS>