/execute - Orchestrator Mode
You are the conductor. You do not play instruments directly. Delegate to SubAgents, verify results, manage parallelization.
Golden Path (End-to-End Flow)
1. Parse input → Determine mode (PR / Local)
2. Read PLAN.md → Create ALL Tasks (Init + TODO sub-steps + Finalize) → Set dependencies
3. Init/resume context (.dev/specs/{name}/context/)
4. LOOP while TaskList() has pending tasks:
   Pick runnable (pending + not blocked) → dispatch by type:
   - :State Begin → [PR only] Skill("state", "begin") → stop on failure
   - :Worker → Task(worker) with substituted variables
   - :Verify → dispatch verify worker, triage (halt > adapt > retry), reconcile if FAILED
   - :Wrap-up → save context (Worker + Verify) + mark Plan checkbox [x]
   - :Commit → Task(git-master) per Commit Strategy
   - :Residual Commit → git status → git-master if dirty
   - :State Complete → [PR only] Skill("state", "complete")
   - :Report → output final report
5. (Init, TODO execution, and Finalize are all part of the loop)
Core Rules
- DELEGATE: All code writing goes to `Task(subagent_type="worker")`. You may only Read, Grep, Glob, Bash (for verification), and manage Tasks/Plan.
- VERIFY: SubAgents lie. After every `:Worker`, the `:Verify` step dispatches a verify worker to independently re-check acceptance criteria. Reconcile if FAILED.
- PARALLELIZE: Run all tasks whose `blockedBy` is empty simultaneously. Sub-step chains auto-parallelize across independent TODOs.
- ONE TODO PER WORKER: Each `:Worker` Task handles exactly one TODO.
- PLAN CHECKBOX = TRUTH: `### [x] TODO N:` is the only durable state. Sub-step Tasks ({N}.1~{N}.4) are recreated each session.
- DISPATCH BY TYPE: The loop dispatches each runnable task by its suffix: `:State Begin`, `:Worker`, `:Verify`, `:Wrap-up`, `:Commit`, `:Residual Commit`, `:State Complete`, `:Report`.
STEP 1: Initialize
1.1 Parse Input & Determine Mode
| Input | Mode | Behavior |
|---|---|---|
| `/execute` | Auto-detect | Branch → Draft PR check → PR mode if exists, else Local |
| `/execute <name>` | Local | `.dev/specs/<name>/PLAN.md` |
| `/execute <PR#>` | PR | Parse spec path from PR body |
| `/execute <PR URL>` | PR | Extract PR# → PR mode |
Auto-detect logic:
gh pr list --head $(git branch --show-current) --draft --json number # PR exists → PR mode | No PR → infer spec from branch name
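The input table above can be sketched as a small classifier. This is a minimal illustration, not the skill's actual parser: the function name `determine_mode` and the `/pull/<number>` URL shape are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

def determine_mode(arg: str) -> Tuple[str, Optional[str]]:
    """Classify the /execute argument as ("auto" | "pr" | "local", value)."""
    arg = arg.strip()
    if not arg:
        return ("auto", None)      # bare /execute: fall back to branch-based detection
    if arg.isdigit():
        return ("pr", arg)         # /execute <PR#>
    # Assumption: PR URLs contain "/pull/<number>"
    match = re.search(r"/pull/(\d+)", arg)
    if match and arg.startswith(("http://", "https://")):
        return ("pr", match.group(1))   # /execute <PR URL>: extract the PR number
    return ("local", arg)          # /execute <name>
```

For the `"auto"` case the caller would still run the `gh pr list` check shown above to decide between PR and Local mode.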
1.2 Read Plan & Create All Tasks
Read plan file:
- Local: `.dev/specs/{name}/PLAN.md`. If no name is given, use the most recent plan file or ask the user.
- PR: extract the Spec Reference link from the PR body:
  gh pr view <PR#> --json body -q '.body' | grep -oP '(?<=→ \[)[^\]]+'
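For readers more comfortable with Python than `grep -P`, the same lookbehind can be expressed as below. The helper name `spec_path_from_body` and the sample PR body in the usage note are hypothetical. Unlike `grep -o`, this sketch returns only the first match.

```python
import re

# Mirrors the grep lookbehind: capture the link text that follows "→ [".
SPEC_LINK = re.compile(r"(?<=→ \[)[^\]]+")

def spec_path_from_body(body: str):
    """Return the first spec reference found in a PR body, or None if absent."""
    match = SPEC_LINK.search(body)
    return match.group(0) if match else None
```

With a body such as `"→ [.dev/specs/auth/PLAN.md](...)"`, the function returns `.dev/specs/auth/PLAN.md`.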
⚠️ BATCH CREATE all tasks in minimal turns to avoid overhead.
Strategy: Issue every TaskCreate call in parallel within a single turn (multiple tool calls in one message), then set ALL dependencies in one follow-up turn. This reduces task setup from ~3 minutes to ~15 seconds.
# ═══════════════════════════════════════════════════
# TURN 1: Create ALL tasks in PARALLEL (single message)
# ═══════════════════════════════════════════════════
# Send ALL of these TaskCreate calls in ONE message simultaneously.
# Claude Code supports multiple tool calls per message — use it.
# Init (PR only)
IF pr_mode:
    sb = TaskCreate(subject="Init:State Begin", ...)
# Per-TODO sub-steps (ALL TODOs in parallel)
FOR EACH "### [ ] TODO N: {title}" in plan:
    w = TaskCreate(subject="{N}.1:Worker — {title}",
                   description="{full TODO section content}",
                   activeForm="{N}.1: Running Worker")
    v = TaskCreate(subject="{N}.2:Verify",
                   description="Dispatch verify worker for TODO {N}. ...",
                   activeForm="{N}.2: Verifying")
    wu = TaskCreate(subject="{N}.3:Wrap-up",
                    description="Wrap-up for TODO {N}. ...",
                    activeForm="{N}.3: Wrapping up")
    IF commit_strategy_has_row(N):
        cm = TaskCreate(subject="{N}.4:Commit",
                        description="Commit TODO {N} changes. ...",
                        activeForm="{N}.4: Committing")
    # When the calls return, keep the IDs per TODO; Turn 2 refers to them as task_ids[N]
    task_ids[N] = {worker: w.task_id, verify: v.task_id, wrapup: wu.task_id, commit: cm.task_id (if created)}
# Finalize tasks
rc = TaskCreate(subject="Finalize:Residual Commit", ...)
IF pr_mode:
    sc = TaskCreate(subject="Finalize:State Complete", ...)
rp = TaskCreate(subject="Finalize:Report", activeForm="Generating report",
                description="Read ${baseDir}/references/report-template.md, then output the report verbatim replacing placeholders with actual values.")
# ═══════════════════════════════════════════════════
# TURN 2: Set ALL dependencies in PARALLEL (single message)
# ═══════════════════════════════════════════════════
# After Turn 1 returns all task IDs, send ALL TaskUpdate calls
# for dependencies in ONE message simultaneously.
# task_ids[N] = {worker, verify, wrapup, commit?}: the IDs returned in Turn 1 for TODO N
FOR EACH unchecked TODO N:
    TaskUpdate(taskId=task_ids[N].worker, addBlocks=[task_ids[N].verify])
    TaskUpdate(taskId=task_ids[N].verify, addBlocks=[task_ids[N].wrapup])
    IF task_ids[N].commit:
        TaskUpdate(taskId=task_ids[N].wrapup, addBlocks=[task_ids[N].commit])
IF pr_mode:
    FOR EACH unchecked TODO N:
        TaskUpdate(taskId=sb.task_id, addBlocks=[task_ids[N].worker])
all_last_steps = [task_ids[N].commit ?? task_ids[N].wrapup for each unchecked TODO N]
FOR EACH last_step in all_last_steps:
    TaskUpdate(taskId=last_step, addBlocks=[rc.task_id])
IF pr_mode:
    TaskUpdate(taskId=rc.task_id, addBlocks=[sc.task_id])
    TaskUpdate(taskId=sc.task_id, addBlocks=[rp.task_id])
ELSE:
    TaskUpdate(taskId=rc.task_id, addBlocks=[rp.task_id])
⚠️ Key rule: NEVER create tasks one-by-one across multiple turns. All TaskCreate in Turn 1, all TaskUpdate in Turn 2. Two turns total.
1.3 Set Cross-TODO Dependencies
From Plan's Dependency Graph table, link the last sub-step of the producer to the Worker of the consumer:
FOR EACH row where row.Requires != "-" AND both TODOs unchecked:
    producer_N = parse(row.Requires)   # e.g., "todo-1.config_path" → 1
    consumer_N = row.TODO
    # Last sub-step of producer = Commit (if exists) or Wrap-up
    producer_last = task_ids[producer_N].commit ?? task_ids[producer_N].wrapup
    consumer_first = task_ids[consumer_N].worker
    TaskUpdate(taskId=producer_last, addBlocks=[consumer_first])
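The linking rule can be sketched in Python as follows. This is an illustration under assumptions: rows are dicts with `TODO` and `Requires` keys, each row names a single producer, and `task_ids` maps an unchecked TODO number to its sub-step task IDs (with the Wrap-up task as the last sub-step when no Commit exists). The function names are hypothetical.

```python
import re

def producer_todo(requires: str):
    """'todo-1.config_path' -> 1; '-' (no dependency) -> None."""
    match = re.fullmatch(r"todo-(\d+)\.\w+", requires.strip())
    return int(match.group(1)) if match else None

def cross_todo_edges(rows, task_ids):
    """Return (blocker_task, blocked_task) pairs for cross-TODO dependencies."""
    edges = []
    for row in rows:
        producer = producer_todo(row["Requires"])
        consumer = row["TODO"]
        # Skip rows without a dependency or where either TODO is already checked
        if producer is None or producer not in task_ids or consumer not in task_ids:
            continue
        last = task_ids[producer].get("commit") or task_ids[producer]["wrapup"]
        edges.append((last, task_ids[consumer]["worker"]))
    return edges
```

Each returned pair corresponds to one `TaskUpdate(taskId=blocker, addBlocks=[blocked])` call.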
Verify with TaskList():
Expected (PR mode; TODO 1 and TODO 3 independent, TODO 2 depends on TODO 1):
#1  [pending] Init:State Begin
#2  [pending] 1.1:Worker — Config setup [blocked by #1]
#3  [pending] 1.2:Verify [blocked by #2]
#4  [pending] 1.3:Wrap-up [blocked by #3]
#5  [pending] 1.4:Commit [blocked by #4]
#6  [pending] 2.1:Worker — API [blocked by #5] ← cross-TODO dep
#7  [pending] 2.2:Verify [blocked by #6]
#8  [pending] 2.3:Wrap-up [blocked by #7]
#9  [pending] 3.1:Worker — Utils [blocked by #1]
#10 [pending] 3.2:Verify [blocked by #9]
#11 [pending] 3.3:Wrap-up [blocked by #10]
#12 [pending] 3.4:Commit [blocked by #11]
#13 [pending] Finalize:Residual Commit [blocked by #5, #8, #12]
#14 [pending] Finalize:State Complete [blocked by #13]
#15 [pending] Finalize:Report [blocked by #14]
→ Round 0: #1 (Init:State Begin)
→ Round 1: #2 (1.1:Worker), #9 (3.1:Worker) in parallel
1.4 Init or Resume Context
CONTEXT_DIR=".dev/specs/{name}/context"
First run (no context folder):
mkdir -p "$CONTEXT_DIR"
Create: outputs.json ({}), learnings.md, issues.md, audit.md (empty).
Resume (context folder exists):
- Read `outputs.json` into memory (for variable substitution)
- Read `audit.md` into memory (for dynamic TODO recovery)
- Keep other files as-is (append mode)
- Progress determined from Plan checkboxes
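The first-run and resume paths above could be combined into one idempotent helper. This is a sketch only; the function name `init_or_resume_context` and its return shape are assumptions made for illustration.

```python
import json
from pathlib import Path

def init_or_resume_context(context_dir: Path) -> dict:
    """Create the context folder on first run; otherwise load what a resume needs."""
    first_run = not context_dir.exists()
    context_dir.mkdir(parents=True, exist_ok=True)
    outputs = context_dir / "outputs.json"
    if not outputs.exists():
        outputs.write_text("{}", encoding="utf-8")
    # touch() creates missing files and leaves existing content untouched
    for name in ("learnings.md", "issues.md", "audit.md"):
        (context_dir / name).touch()
    return {
        "first_run": first_run,
        "outputs": json.loads(outputs.read_text(encoding="utf-8")),
        "audit": (context_dir / "audit.md").read_text(encoding="utf-8"),
    }
```

Progress itself is still read from the Plan checkboxes, not from anything this helper returns.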
STEP 2: Execute Loop (Type-Based Dispatch)
Dispatch Rules
⚠️ CRITICAL: True parallel dispatch requires run_in_background: true.
If you call Task(...) without run_in_background, Claude Code blocks until the agent returns — making execution sequential even if multiple tasks are runnable. To achieve real parallelism:
WHILE TaskList() has pending tasks:
    runnable = TaskList().filter(status=="pending" AND blockedBy==empty)
    IF len(runnable) > 1 AND all are :Worker or :Verify:
        # PARALLEL dispatch: mark in_progress FIRST, then send ALL in ONE message
        FOR EACH task in runnable:
            TaskUpdate(taskId=task.id, status="in_progress")
        FOR EACH task in runnable (in single message):
            dispatch(task, run_in_background=true)
        # Poll for completion
        WAIT until any background task completes (check TaskOutput periodically)
        # Process completed tasks, mark completed, loop
    ELSE:
        # Single task, or a type that must run sequentially (e.g. :Commit):
        # take one, mark in_progress, read details, then dispatch
        task = runnable[0]
        TaskUpdate(taskId=task.id, status="in_progress")
        task_details = TaskGet(taskId=task.id)
        dispatch(task, task_details)
Which types can run in parallel:
- `:Worker`: YES (if touching disjoint files)
- `:Verify`: YES (read-only, no conflicts)
- `:Commit`: NO (git operations must be sequential)
- `:Wrap-up`: PARTIAL (outputs.json must be sequential, other files OK)
- `:State Begin` / `:State Complete`: NO (single task)
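The list above can be turned into a simple partition over runnable task subjects. In this sketch, `PARALLEL_SAFE`, `task_type`, and `partition_runnable` are hypothetical names, and `:Wrap-up` is treated as sequential for simplicity even though only its `outputs.json` write needs ordering.

```python
PARALLEL_SAFE = {"Worker", "Verify"}

def task_type(subject: str) -> str:
    """'1.1:Worker — Config setup' -> 'Worker'; 'Finalize:Residual Commit' -> 'Residual Commit'."""
    suffix = subject.split(":", 1)[1]
    return suffix.split(" — ", 1)[0].strip()

def partition_runnable(subjects):
    """Split runnable subjects into (dispatch together, dispatch one at a time)."""
    parallel = [s for s in subjects if task_type(s) in PARALLEL_SAFE]
    sequential = [s for s in subjects if task_type(s) not in PARALLEL_SAFE]
    return parallel, sequential
```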
Overhead Reduction Rules
⚠️ DO NOT re-read context files between worker dispatches.
- Worker results come back in the Task return value. Use that directly.
- Only read `outputs.json` when you need variable substitution for the NEXT worker.
- Do NOT call `Read` on PLAN.md, learnings.md, issues.md between dispatches. You already have this in memory.
- After a worker completes: `TaskUpdate(completed)` → `TaskList()` → dispatch next runnable. That's it. No extra Read/Bash calls.
Dispatch by task subject suffix:
| Suffix | Handler | Action |
|---|---|---|
| `:State Begin` | 2α | `Skill("state", args="begin <PR#>")` → stop on failure |
| `:Worker` | 2a | Variable substitution → Task(worker) |
| `:Verify` | 2b | Dispatch verify worker → triage & reconcile if FAILED |
| `:Wrap-up` | 2c | Save context (Worker + Verify) + mark Plan [x] |
| `:Commit` | 2d | Task(git-master) per Commit Strategy |
| `:Residual Commit` | 2f | `git status --porcelain` → git-master if dirty |
| `:State Complete` | 2g | `Skill("state", args="complete <PR#>")` |
| `:Report` | 2h | Final Report output |
After each sub-step completes: TaskUpdate(taskId, status="completed") → removed from TaskList → dependents unblocked. Immediately check TaskList() for newly unblocked tasks and dispatch without delay.
2α. :State Begin — [PR Mode Only] Begin PR State
Skill("state", args="begin <PR#>")
- Success → `TaskUpdate(taskId, status="completed")` → all TODO `:Worker` tasks become unblocked.
- "Already executing" → STOP immediately. Guide: "PR #N already executing."
- "PR is blocked" → STOP immediately. Guide: "Release with `/state continue <PR#>`."
Only created in PR mode. Local mode skips this task entirely.
2a. :Worker — Delegate Implementation
1. Variable Substitution — replace ${todo-N.outputs.field} in TODO's Inputs with values from context/outputs.json:
# outputs.json: {"todo-1": {"config_path": "./config/app.json"}}
# Plan Inputs: config_path: ${todo-1.outputs.config_path}
# Result: config_path: ./config/app.json
Full substitution details → REFERENCE A
2. Build prompt and delegate:
task_details = TaskGet(taskId={task.id})
Task(
subagent_type="worker",
description="Implement: {task.subject}",
prompt="""
## TASK
{TODO title + Steps from task_details.description}
## EXPECTED OUTCOME
When this task is DONE, the following MUST be true:
**Outputs** (must generate):
{Outputs section from Plan}
**Acceptance Criteria** (all must pass):
{Acceptance Criteria section from Plan}
## REQUIRED TOOLS
- Read: Reference existing code
- Edit/Write: Write code
- Bash: Run build/tests
## MUST DO
- Perform only this Task
- Follow existing code patterns (see References below)
- Utilize Inherited Wisdom (see CONTEXT below)
## MUST NOT DO
{Must NOT do section from Plan}
- Do not perform other Tasks
- Do not add new dependencies
- Do not run git commands (Orchestrator handles this)
## CONTEXT
### References (from Plan)
{References section from Plan}
### Dependencies (from Inputs - substituted values)
{Actual values after substitution}
### Inherited Wisdom
SubAgent does not remember previous calls.
**Conventions (from learnings.md):**
{learnings.md content}
**Failed approaches to AVOID (from issues.md):**
{issues.md content}
**Key decisions & reconciliation history (from audit.md):**
{audit.md content}
"""
)
PLAN field → Prompt section mapping:
| PLAN Field | Prompt Section |
|---|---|
| TODO title + Steps | ## TASK |
| Outputs + Acceptance Criteria | ## EXPECTED OUTCOME |
| Required Tools | ## REQUIRED TOOLS |
| Steps | ## MUST DO |
| Must NOT do | ## MUST NOT DO |
| References | ## CONTEXT > References |
| Inputs (after substitution) | ## CONTEXT > Dependencies |
3. On completion: TaskUpdate(taskId, status="completed") → :Verify becomes runnable.
2b. :Verify — Verify Worker & Reconciliation
:Verify dispatches a verify worker agent that independently checks acceptance criteria AND must-not-do violations. No hook dependency — the verify worker is the source of truth.
1. Dispatch Verify Worker:
Task(
subagent_type="worker",
description="Verify: TODO {N} acceptance criteria + must-not-do",
prompt="""
## TASK
You are a VERIFICATION worker. Your job is to independently verify
that TODO {N}'s acceptance criteria are met AND that must-not-do
rules were not violated.
DO NOT write or modify any code. Only READ and RUN verification commands.
## PART 1: ACCEPTANCE CRITERIA TO VERIFY
{Acceptance Criteria section from Plan for TODO N}
For each criterion, run the specified command and report PASS/FAIL:
1. Functional checks: run commands (test -f, curl, etc.)
2. Static checks: run linter/type-checker (tsc --noEmit, eslint, etc.)
3. Runtime checks: run tests (npm test, pytest, etc.)
⚠️ Do NOT trust the Worker's self-reported PASS status.
Re-execute every command yourself and judge independently.
## PART 2: MUST-NOT-DO VIOLATIONS
{Must NOT do section from Plan for TODO N}
{Standard must-not-do: no other Tasks, no new dependencies, no git commands}
For each must-not-do rule, check whether it was violated:
- Read `git diff` (staged + unstaged) to see what the Worker actually changed
- Check for must-not-do violations from the TODO's rules
- Check for new dependencies added (package.json, go.mod, etc.)
- Check for out-of-scope changes unrelated to this TODO
## PART 3: SIDE-EFFECT & CONTEXT AUDIT
Review the Worker's output JSON and the actual code changes:
1. **Suspicious PASS**: Did the Worker report PASS but the actual
code doesn't fully satisfy the criterion? (e.g., stub implementation,
TODO comments, partial logic, error swallowed silently)
2. **Undocumented side-effects**: Did the Worker change things not
mentioned in its output? (e.g., modified shared utilities, changed
configs, added exports not in scope)
3. **Missing context**: Did the Worker discover patterns, issues, or
make decisions that should be in learnings/issues/decisions but aren't?
## PART 4: SCOPE-RELATED BLOCKAGE DETECTION
If you detect a failure that stems from SCOPE limitations (not Worker error),
populate the `suggested_adaptation` field:
**When to suggest adaptation:**
- **scope_violation**: Acceptance criteria requires work beyond current TODO's must-not-do boundaries
- **dod_gap**: DoD (Definition of Done) criterion cannot be met without expanding scope
- **dependency_missing**: Work requires outputs/artifacts not produced by any prior TODO
**Detection signals:**
1. Check if failed acceptance criteria require work that violates must-not-do rules
2. Check if the DoD explicitly requires work beyond current TODO's boundaries
3. Check if the Worker correctly identified a blocker but cannot fix it in-scope
**What NOT to suggest:**
- Worker made a mistake (code_error) → retry, don't adapt
- Environment issue (missing dependency, API key) → env_error, don't adapt
- Suspicious pass or missing context → side_effects, don't adapt
Only suggest adaptation when the PLAN itself needs adjustment, not the Worker's execution.
## OUTPUT FORMAT (strict JSON)
{
"status": "VERIFIED" | "FAILED",
"acceptance_criteria": {
"pass": <number>,
"fail": <number>,
"results": [
{
"id": "<criterion_id>",
"category": "functional" | "static" | "runtime",
"description": "<what was checked>",
"command": "<command run>",
"status": "PASS" | "FAIL",
"reason": "<failure reason, if FAIL>"
}
]
},
"must_not_do": {
"violations": [
{
"rule": "<which rule was violated>",
"evidence": "<what was found>",
"severity": "critical" | "warning"
}
]
},
"side_effects": {
"suspicious_passes": ["<criterion_id that looks questionable>"],
"undocumented_changes": ["<file or change not mentioned in output>"],
"missing_context": ["<learning/issue/decision Worker should have reported>"]
},
"suggested_adaptation": {
// ⚠️ Only include this field when status is FAILED AND you detect a scope-related blockage
// (e.g., DoD criterion requires work that violates must-not-do, or needs missing dependency)
"blockage_type": "scope_violation" | "dod_gap" | "dependency_missing",
"suggested_todo": {
"title": "<concise TODO title for what needs to be added to plan>",
"reason": "<why current scope cannot satisfy this criterion>",
"steps": ["<step 1>", "<step 2>", ...],
"scope_justification": "<why this new TODO is necessary and in-scope for the overall plan>"
},
"scope_signals": {
"dod_related": ["<which acceptance criteria cannot be met with current scope>"],
"within_todo_scope": <boolean — true if work fits within current TODO's must-not-do rules, false if out-of-scope>
}
}
}
## MUST NOT DO
- Do not modify any files
- Do not write code or fix issues
- Do not run git commands (except read-only: git diff, git status)
- Only verify — report results objectively
"""
)
2. Parse result & route by status:
- VERIFIED (all criteria PASS, no critical violations, no suspicious passes) → `TaskUpdate(taskId, status="completed")` → `:Wrap-up` becomes runnable.
- FAILED (any criterion FAIL, or critical must-not-do violation, or suspicious pass found) → `reconcile(TODO_N, verify_result, depth=0)`. Pass the result directly, no re-dispatch.

`:Wrap-up` enrichment: When VERIFIED, merge `side_effects.missing_context` into the Worker's learnings/issues/decisions before saving. This ensures context is complete even if the Worker under-reported.
3. Reconciliation — single-pass triage + loop:
function reconcile(TODO_N, verify_result, depth=0):
    # verify_result is passed in from :Verify handler (no duplicate dispatch)
    if verify_result.status == "VERIFIED" → mark completed, done
    # Log non-blocking items FIRST (always, regardless of disposition)
    log_non_blocking(verify_result)  # warnings → audit.md, undocumented → issues.md, missing_context → learnings.md
    # Single-pass triage (precedence: halt > adapt > retry)
    disposition = triage(verify_result, TODO_N.type)
    append_audit("triage", TODO_N, disposition, verify_result)  # always log triage decision
    if disposition == HALT → log to issues.md, stop execution
    if disposition == ADAPT → adapt(TODO_N, verify_result, depth)
    if disposition == RETRY → retry_loop(TODO_N, verify_result, depth)

function retry_loop(TODO_N, verify_result, depth):
    for attempt in 1..3:
        append_audit("retry", TODO_N, attempt)  # log retry intent
        fix_prompt = build_fix_prompt(verify_result)  # failed criteria + violations + suspicious passes
        Task(subagent_type="worker", prompt=fix_prompt)
        verify_result = dispatch_verify_worker(TODO_N)
        if VERIFIED → done
        log_non_blocking(verify_result)
        disposition = triage(verify_result, TODO_N.type)  # re-triage each cycle
        append_audit("triage", TODO_N, disposition, verify_result)
        if disposition == HALT → stop
        if disposition == ADAPT → adapt(TODO_N, verify_result, depth); return
    # 3 retries exhausted → halt
    halt("retry_exhausted")
Pattern: Worker → Verify Worker → triage → (halt | adapt | retry). Each verify cycle re-triages from scratch. `suggested_adaptation` triggers adapt immediately, even mid-retry.
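The control flow of the retry loop can be modelled with injected callables. This is a simplified sketch, not the orchestrator's implementation: `fix`, `verify`, and `triage` stand in for the worker dispatch, the verify-worker dispatch, and the triage function, and the string return values are assumptions chosen for the example.

```python
def retry_loop(fix, verify, triage, max_attempts=3):
    """Fix, re-verify, and re-triage up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        fix(attempt)                      # send the fix prompt to a worker
        result = verify()                 # independent re-verification
        if result["status"] == "VERIFIED":
            return ("done", attempt)
        disposition = triage(result)      # re-triage from scratch each cycle
        if disposition in ("HALT", "ADAPT"):
            return (disposition.lower(), attempt)
    return ("halt:retry_exhausted", max_attempts)
```

A HALT or ADAPT disposition leaves the loop immediately, which is how adaptation can trigger mid-retry.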
4. Triage rules:
Single pass. Precedence: halt > adapt > retry. Non-blocking items (warnings, undocumented changes, missing context) are logged to audit.md and do not block — no separate "skip" disposition needed.
function triage(verify_result, todo_type) → HALT | ADAPT | RETRY:
    # todo_type detection:
    #   "verification": PLAN has `type: verification` field, OR title matches "TODO Final: Verification"
    #   "work": all other TODOs (default)

    # --- HALT (highest precedence) ---
    IF any must_not_do with severity==critical → HALT
    IF any env_error (permission, API key, network) → HALT

    # --- ADAPT (scope blocker OR verification TODO) ---
    IF suggested_adaptation present:
        IF scope_check(suggested_adaptation) == safe → ADAPT
        IF scope_check(suggested_adaptation) == destructive_out_of_scope → HALT
    IF todo_type == "verification" AND any acceptance_criteria FAIL:
        → ADAPT (auto-generate fix TODO from failed criteria)
        # Verification TODOs are read-only, so retry cannot fix code.
        # Orchestrator builds suggested_adaptation from failed criteria:
        #   title: "Fix: {failed_criterion.description}"
        #   steps: derived from failure reason + affected files
        #   scope: safe (fixing own project's code)

    # --- RETRY (code error, suspicious pass; work TODOs only) ---
    IF any acceptance_criteria FAIL or suspicious_pass → RETRY

    # --- Non-blocking items (logged, not dispositions) ---
    #   must_not_do warning → log to audit.md
    #   undocumented_change → log to audit.md + issues.md
    #   missing_context → log to audit.md + learnings.md
| Disposition | Cause | Action |
|---|---|---|
| halt | critical violation, env_error, destructive out-of-scope | Stop execution |
| adapt | Scope blocker, OR verification TODO with failures | Create fix TODO → resolve → re-verify |
| retry | Code error in work TODO (Worker fixable) | Fix Worker → re-verify (max 3) |
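The precedence in the table can be sketched as a pure function over the verify worker's JSON. Assumptions are labeled in the comments: the `env_error` key is hypothetical (the output schema does not define where environment errors are reported), and `scope_check` is injected so the sketch stays self-contained.

```python
def triage(verify_result: dict, todo_type: str = "work", scope_check=lambda a: "safe") -> str:
    """Return "HALT", "ADAPT", or "RETRY" (precedence: halt > adapt > retry)."""
    violations = verify_result.get("must_not_do", {}).get("violations", [])
    if any(v.get("severity") == "critical" for v in violations):
        return "HALT"
    if verify_result.get("env_error"):  # hypothetical flag, not part of the schema
        return "HALT"
    adaptation = verify_result.get("suggested_adaptation")
    if adaptation:
        return "ADAPT" if scope_check(adaptation) == "safe" else "HALT"
    failed = verify_result.get("acceptance_criteria", {}).get("fail", 0) > 0
    if todo_type == "verification" and failed:
        return "ADAPT"  # verification TODOs are read-only, so retry cannot help
    suspicious = verify_result.get("side_effects", {}).get("suspicious_passes", [])
    if failed or suspicious:
        return "RETRY"
    # Only FAILED results should reach triage; warnings alone are non-blocking
    raise ValueError("nothing to triage")
```

Warnings, undocumented changes, and missing context never influence the return value; they are only logged.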
5. scope_check — single function:
function scope_check(suggested_adaptation) → safe | destructive_out_of_scope:
    # in_scope detection:
    #   needed_for_DoD = any acceptance_criteria in current TODO references the adaptation target
    #   within_todo_scope = adaptation doesn't violate the TODO's must-not-do rules
    in_scope = (needed_for_DoD OR within_todo_scope)
    # destructive detection (ANY of these → destructive):
    #   DB schema changes (migrations, ALTER TABLE)
    #   API breaking changes (endpoint removal, response shape change)
    #   Shared resource deletion (files imported by multiple modules)
    #   Auth/security changes (token handling, permissions, secrets)
    #   External config (CI/CD, deployment, infrastructure)
    destructive = matches_any_destructive_pattern(suggested_adaptation)
    IF in_scope → safe
    IF NOT in_scope AND NOT destructive → safe (log "out-of-scope, non-destructive" to audit.md)
    IF NOT in_scope AND destructive → destructive_out_of_scope
Bias toward action: only halt on destructive out-of-scope. Everything else proceeds with logging.
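A rough Python rendering of this decision follows. Two things are assumptions for the sake of a runnable example: `needed_for_DoD` is approximated by a non-empty `scope_signals.dod_related` list, and destructive detection is reduced to a few illustrative keyword patterns over the suggested TODO's title and steps. A real check would need richer signals.

```python
import re

# Illustrative patterns only; not an exhaustive or authoritative list.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"alter\s+table", re.IGNORECASE),
    re.compile(r"\bmigration", re.IGNORECASE),
    re.compile(r"remove\s+endpoint", re.IGNORECASE),
    re.compile(r"\bsecrets?\b", re.IGNORECASE),
]

def scope_check(adaptation: dict) -> str:
    """Return "safe" or "destructive_out_of_scope"."""
    signals = adaptation.get("scope_signals", {})
    needed_for_dod = bool(signals.get("dod_related"))          # assumption
    within_todo_scope = bool(signals.get("within_todo_scope"))
    in_scope = needed_for_dod or within_todo_scope
    todo = adaptation.get("suggested_todo", {})
    text = " ".join([todo.get("title", ""), *todo.get("steps", [])])
    destructive = any(p.search(text) for p in DESTRUCTIVE_PATTERNS)
    if in_scope or not destructive:
        return "safe"
    return "destructive_out_of_scope"
```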
6. Adapt flow:
function adapt(TODO_N, verify_result, depth):
    IF depth >= 1 → halt("depth_limit")  # no nested adaptation
    # count_dynamic_todos: count PLAN.md entries matching "TODO {N}.a*" (ADDED) markers
    IF count_dynamic_todos(TODO_N) >= 3 → halt("max_dynamic_todos")  # max 3 per original
    suffix = next_suffix(TODO_N)  # "a", "b", "c": sequential per original TODO

    # 1. Update PLAN.md
    Edit: insert after TODO {N}:
        ### [ ] TODO {N}.{suffix}: (ADDED) {suggested_todo.title}

    # 2. Log to audit.md
    append_audit("adapt", TODO_N, suggested_adaptation)

    # 3. Create & run dynamic TODO (Task subject matches PLAN marker)
    dynamic_task = TaskCreate(
        subject="{N}.{suffix}:Adapt — {suggested_todo.title}",
        description="{suggested_todo.description}",
        metadata={is_dynamic: true, parent_todo: N}
    )
    → reconcile(dynamic_task, depth=1)  # same flow, depth incremented

    # 4. Result
    IF dynamic_task VERIFIED:
        → Mark PLAN.md: [x] TODO {N}.{suffix}
        → Update audit.md: Status = COMPLETED
        → Retry original TODO {N} via reconcile(TODO_N, new_verify_result, depth=0)
    ELSE:
        → Mark PLAN.md: TODO {N}.{suffix} — FAILED
        → Update audit.md: Status = FAILED
        → halt("dynamic_todo_failed")
Safety limits:
- depth=1: Dynamic TODOs (depth=1) use the same reconcile flow but `adapt()` is blocked at depth≥1. Retry (max 3) still works.
- Max 3 dynamic TODOs per original TODO. 4th attempt → halt.
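The suffix allocation and the three-per-TODO limit could look like the sketch below. It assumes dynamic TODO headings keep the `### [ ] TODO {N}.{suffix}: (ADDED) ...` shape shown in the adapt flow; the function name and the `None` return for "limit reached" are choices made for this example.

```python
import re
import string

MAX_DYNAMIC_TODOS = 3

def next_suffix(plan_text: str, n: int):
    """Return the next free suffix ("a", "b", "c") for TODO n, or None at the limit."""
    pattern = re.compile(rf"^### \[[ x]\] TODO {n}\.([a-z]): \(ADDED\)", re.MULTILINE)
    used = pattern.findall(plan_text)
    if len(used) >= MAX_DYNAMIC_TODOS:
        return None  # caller should halt("max_dynamic_todos")
    return string.ascii_lowercase[len(used)]
```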
7. Audit logging — single file audit.md:
All reconciliation events go to one file. Replaces decisions.md + amendments.md.
## TODO {N} — Reconciliation
### [YYYY-MM-DD HH:MM] Triage
- acceptance_criteria:login_test FAIL → retry
- must_not_do:no_git_commands warning → logged (non-blocking)
- suggested_adaptation:scope_violation → adapt
### [YYYY-MM-DD HH:MM] Retry #1
- Fix prompt sent, re-verified → FAIL
- acceptance_criteria:login_test FAIL → retry
### [YYYY-MM-DD HH:MM] Adapt
- **Dynamic TODO**: {N}.a — {title}
- **Trigger**: {blockage_type}
- **Scope**: safe (needed_for_DoD=YES)
- **Status**: COMPLETED | FAILED
### [YYYY-MM-DD HH:MM] Halted (if applicable)
- **Reason**: {reason}
- **Evidence**: {evidence}
Mode-specific halt behavior:
- Local: Record in `issues.md`, report to user, offer Continue/Stop. Plan checkbox stays `[ ]`.
- PR: `Skill("state", args="pause <PR#> <reason>")` → records "Blocked" comment → stop execution.
Full reconciliation details → REFERENCE C
2c. :Wrap-up — Save Context + Mark Checkboxes
Combines context saving and checkbox marking into a single step. Only runs after :Verify completes (VERIFIED, or reconciliation resolved).
Part A: Save to Context Files
| Source | Field | File | Format |
|---|---|---|---|
| Worker | outputs | outputs.json | existing["todo-N"] = outputs → Write |
| Worker | learnings | learnings.md | ## {N}\n- item append |
| Worker | issues | issues.md | ## {N}\n- [ ] item append |
| Verify | side_effects.missing_context | learnings.md / issues.md | Merge into appropriate file |
| Verify | side_effects.undocumented_changes | issues.md | - [ ] Undocumented: {change} append |
| Orchestrator | all reconciliation events | audit.md | structured entry (see section 7 format) |
| - | acceptance_criteria | (not saved) | Used only for verification, not saved to context |
Skip empty arrays.
⚠️ outputs.json race condition: When multiple :Wrap-up tasks run in parallel, save outputs.json sequentially (Read → merge → Write one at a time). Other context files are safe for parallel append.
# Parallel 1.3:Wrap-up and 3.3:Wrap-up both runnable:
# outputs.json — SEQUENTIAL:
current = Read("outputs.json")
current["todo-1"] = result1.outputs
Write("outputs.json", current)
current = Read("outputs.json")
current["todo-3"] = result3.outputs
Write("outputs.json", current)
# learnings.md, issues.md — PARALLEL OK (append mode)
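In a conventional program the same read-merge-write discipline would be enforced with a lock. The sketch below is only an analogy for the sequential rule above: the orchestrator itself achieves this by ordering its tool calls, and `save_outputs` is a hypothetical helper.

```python
import json
import threading
from pathlib import Path

_outputs_lock = threading.Lock()

def save_outputs(path: Path, todo_key: str, outputs: dict) -> None:
    """Read -> merge -> write outputs.json as one critical section."""
    with _outputs_lock:
        current = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        current[todo_key] = outputs
        path.write_text(json.dumps(current, indent=2), encoding="utf-8")
```

Without the lock, two writers could both read the old file and the later write would drop the earlier TODO's outputs.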
Part B: Mark Plan Checkboxes
1. Update Plan TODO checkbox:
Edit(plan_path, "### [ ] TODO N: ...", "### [x] TODO N: ...")
2. Update Acceptance Criteria checkboxes based on :Verify results:
The :Verify step produces acceptance_criteria with per-item status (PASS/FAIL). Use this to check individual items:
FOR EACH criterion in verify_result.acceptance_criteria:
    IF criterion.status == "PASS":
        Edit(plan_path,
             "- [ ] {criterion.description}",
             "- [x] {criterion.description}")
⚠️ Caution:
- Only check items whose `status` is `PASS` in the `:Verify` result
- Do not check based on SubAgent report alone; use the verify worker result
- Items with `status: FAIL` remain `- [ ]`
- Do NOT check Steps items (`- [ ]` under `**Steps**:`), only Acceptance Criteria
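These rules can be illustrated with a small text transform. It assumes the plan labels its sections with lines starting `**Steps**` and `**Acceptance Criteria**`, and that a criterion's description matches its checkbox line exactly; `mark_plan` is a hypothetical name.

```python
import re

def mark_plan(plan: str, n, passed: list) -> str:
    """Check TODO n's heading and its PASSed acceptance criteria; leave Steps alone."""
    out, in_todo, in_criteria = [], False, False
    for line in plan.splitlines():
        heading = re.match(r"### \[[ x]\] TODO (\w+(?:\.\w+)?):", line)
        if heading:
            in_todo = heading.group(1) == str(n)
            in_criteria = False
            if in_todo:
                line = line.replace("### [ ]", "### [x]", 1)
        elif line.startswith("**"):
            # A bold label opens a new section; only Acceptance Criteria is eligible
            in_criteria = line.startswith("**Acceptance Criteria**")
        elif in_todo and in_criteria and line.startswith("- [ ] "):
            description = line[len("- [ ] "):]
            if description in passed:
                line = "- [x] " + description
        out.append(line)
    return "\n".join(out)
```

Here `passed` would be the descriptions of criteria whose `status` is `PASS` in the verify result.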
On completion: TaskUpdate(taskId, status="completed") → :Commit becomes runnable (if exists), or next TODO's :Worker is unblocked.
2d. :Commit — Per-TODO Commit via git-master
Find matching row in Plan's ## Commit Strategy table:
- `Condition: always` → commit
- `Condition: {cond}` → evaluate condition
- No row → this sub-step should not have been created (see 1.2)
Task(
subagent_type="git-master",
description="Commit {N}",
prompt="""
Commit TODO {N} changes.
Commit message: {Message from Commit Strategy table}
Files: {Files from Commit Strategy table}
Push after commit: {YES if PR mode, NO if Local mode}
"""
)
If commit fails, log to issues.md and report to user.
On completion: TaskUpdate(taskId, status="completed") → next TODO's :Worker is unblocked (if cross-TODO dependency exists).
2f. :Residual Commit — Check & Commit Remaining Changes
git status --porcelain
If changes exist (context files, unexpected modifications):
Task(
subagent_type="git-master",
description="Commit: residual changes",
prompt="""
Plan execution complete. Run `git status`.
If changes: commit "chore({plan-name}): miscellaneous changes"
Push: {YES if PR mode, NO if Local mode}
If clean: report "No uncommitted changes" and exit.
"""
)
On completion: TaskUpdate(taskId, status="completed") → :State Complete (PR) or :Report becomes runnable.
2g. :State Complete — [PR Mode Only] Complete PR State
Skill("state", args="complete <PR#>")
Removes state:executing label, converts Draft → Ready, adds "Published" comment.
On completion: TaskUpdate(taskId, status="completed") → :Report becomes runnable.
2h. :Report — Final Orchestration Report
TaskUpdate(report.id, status="in_progress")
template = Read("${baseDir}/references/report-template.md") ← actual file read
# Output report verbatim, replacing {placeholders} with real values
# Do NOT invent your own format — follow the template exactly
TaskUpdate(report.id, status="completed")
Why Read instead of TaskGet: The template lives in references/report-template.md. Reading it immediately before output keeps the template in close context and prevents the agent from generating a custom format.
On completion: TaskUpdate(taskId, status="completed") → all tasks done, execution ends.
STEP 3: Finalize
Finalize tasks (Residual Commit, State Complete, Report) are dispatched in the execution loop as :Residual Commit, :State Complete, :Report handlers (2f, 2g, 2h).
REFERENCE
A. Variable Substitution Details
All TODO outputs are stored in context/outputs.json:
{
"todo-1": { "config_path": "./config/app.json" },
"todo-2": { "api_module": "src/api/index.ts" }
}
Substitution logic:
- Read `context/outputs.json`
- Find the `${todo-N.outputs.field}` pattern in the current TODO's Inputs
- Extract the value from the JSON and replace
- Include the substituted value in the Worker prompt
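The substitution steps above amount to a regex replace over the Inputs text. A minimal sketch follows; raising `KeyError` for a missing output is an assumption, since the source does not say how unresolved references are handled.

```python
import re

# Matches ${todo-N.outputs.field}
PATTERN = re.compile(r"\$\{(todo-\d+)\.outputs\.(\w+)\}")

def substitute(inputs: str, outputs: dict) -> str:
    """Replace every ${todo-N.outputs.field} with its value from outputs.json."""
    def repl(match):
        todo, field = match.group(1), match.group(2)
        try:
            return str(outputs[todo][field])
        except KeyError:
            # Assumption: fail loudly rather than pass a raw placeholder to the Worker
            raise KeyError(f"no saved output for {match.group(0)}") from None
    return PATTERN.sub(repl, inputs)
```

With the example `outputs.json` above, `config_path: ${todo-1.outputs.config_path}` becomes `config_path: ./config/app.json`.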
B. Context System
| File | Writer | Purpose |
|---|---|---|
| `outputs.json` | Worker → Orchestrator saves | TODO output values (input for next TODO) |
| `learnings.md` | Worker → Orchestrator saves | Patterns discovered and applied |
| `issues.md` | Worker → Orchestrator saves | Unresolved issues (- [ ] format) |
| `audit.md` | Orchestrator | All reconciliation events (triage, retry, adapt, halt) |
Context Lifecycle:
Before TODO #1 → Read context → inject into prompt
After TODO #1 → Save outputs + learnings
Before TODO #2 → Read outputs.json → substitute ${todo-1.outputs.X}
After TODO #2 → Update outputs.json + append learnings
(Accumulates. Preserved in files across sessions.)
C. Reconciliation Details
K8s-style reconciliation pattern (Worker → Verify Worker → triage):
Desired State: All acceptance_criteria PASS, no critical violations
Current State: Verify Worker result

[VERIFIED] → Save context (2c), including missing_context from Verify
[FAILED]   → reconcile(TODO_N, verify_result, depth=0):
    log_non_blocking() → triage(verify_result, todo_type)
    → single disposition (halt > adapt > retry):
        halt  → Stop execution, log to audit.md + issues.md
        adapt → Create dynamic TODO via reconcile(depth+1), then retry original
        retry → Fix Worker → re-verify (max 3), re-triage each cycle
All triage decisions + retry attempts logged to audit.md
Decision trail: All events (triage, retry attempts, adapt, halt) recorded in single audit.md. Non-blocking items (warnings, undocumented changes) logged but don't produce a disposition.
Dynamic TODO rules:
- Created by `adapt()`; uses the same `reconcile()` flow at depth=1
- Can be retried (max 3) but cannot trigger further `adapt()` (depth≥1 blocks it)
- Max 3 dynamic TODOs per original TODO
- On failure → halt
issues.md log format (on halt):
## [YYYY-MM-DD HH:MM] {TODO name} Failed
**Category**: env_error | unknown
**Error**: {error message}
**Retry Count**: {n}
**Analysis**: {why human intervention needed}
**Suggestion**: {recommended action}
D. Commit Strategy Details
Per-TODO commit flow:
- Parse Commit Strategy table from PLAN.md
- Find matching row for current TODO number
- `Condition: always` → commit; `Condition: {cond}` → evaluate; No row → skip
- Delegate to `git-master` agent
- Wait for completion before next TODO
Push decision:
| Mode | Push after commit |
|---|---|
| PR mode | YES |
| Local mode | NO |
E. Parallelization Examples (Sub-Step Model)
Setup: PR mode. TODO 1 (independent), TODO 2 (depends on TODO 1), TODO 3 (independent). TODO 1 and TODO 3 have commits; TODO 2 does not.
TaskList() after initialization:
#1  [pending] Init:State Begin
#2  [pending] 1.1:Worker — Config setup [blocked by #1]
#3  [pending] 1.2:Verify [blocked by #2]
#4  [pending] 1.3:Wrap-up [blocked by #3]
#5  [pending] 1.4:Commit [blocked by #4]
#6  [pending] 2.1:Worker — API [blocked by #5] ← cross-TODO dep
#7  [pending] 2.2:Verify [blocked by #6]
#8  [pending] 2.3:Wrap-up [blocked by #7]
#9  [pending] 3.1:Worker — Utils [blocked by #1]
#10 [pending] 3.2:Verify [blocked by #9]
#11 [pending] 3.3:Wrap-up [blocked by #10]
#12 [pending] 3.4:Commit [blocked by #11]
#13 [pending] Finalize:Residual Commit [blocked by #5, #8, #12] ← #5=1.4:Commit, #8=2.3:Wrap-up, #12=3.4:Commit
#14 [pending] Finalize:State Complete [blocked by #13]
#15 [pending] Finalize:Report [blocked by #14]
Execution Rounds (auto-determined by TaskList):
    Round 0:  #1 Init:State Begin ← PR only
    Round 1:  #2 1.1:Worker, #9 3.1:Worker ← PARALLEL
    Round 2:  #3 1.2:Verify, #10 3.2:Verify ← PARALLEL
    Round 3:  #4 1.3:Wrap-up, #11 3.3:Wrap-up ← PARALLEL (outputs.json sequential!)
    Round 4:  #5 1.4:Commit, #12 3.4:Commit ← PARALLEL
    Round 5:  #6 2.1:Worker ← unblocked after #5
    Round 6:  #7 2.2:Verify
    Round 7:  #8 2.3:Wrap-up
    Round 8:  #13 Finalize:Residual Commit ← blocked by all TODO last steps
    Round 9:  #14 Finalize:State Complete ← blocked by #13
    Round 10: #15 Finalize:Report ← blocked by #14
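The rounds follow mechanically from the blockedBy lists: a task's round is one more than the latest round among its blockers. The sketch below reproduces the example; the function name `execution_rounds` and the dict representation are hypothetical, and it assumes every task in a round finishes before the next round starts, whereas the real loop dispatches each task as soon as it is unblocked.

```python
from typing import Dict, List, Tuple


def execution_rounds(blocked_by: Dict[int, List[int]]) -> List[List[int]]:
    """Group task ids into rounds; each task runs one round after its last blocker."""
    round_of: Dict[int, int] = {}

    def resolve(task: int, path: Tuple[int, ...] = ()) -> int:
        if task in round_of:
            return round_of[task]
        if task in path:
            raise ValueError(f"dependency cycle at task #{task}")
        deps = blocked_by[task]
        # No blockers → round 0; otherwise one past the latest blocker.
        round_of[task] = 1 + max((resolve(d, path + (task,)) for d in deps), default=-1)
        return round_of[task]

    for task in blocked_by:
        resolve(task)
    rounds: List[List[int]] = [[] for _ in range(max(round_of.values(), default=-1) + 1)]
    for task in sorted(round_of):
        rounds[round_of[task]].append(task)
    return rounds


# Dependencies copied from the TaskList() example (task id → blockedBy ids).
EXAMPLE = {
    1: [], 2: [1], 3: [2], 4: [3], 5: [4],
    6: [5], 7: [6], 8: [7],
    9: [1], 10: [9], 11: [10], 12: [11],
    13: [5, 8, 12], 14: [13], 15: [14],
}

for number, tasks in enumerate(execution_rounds(EXAMPLE)):
    print(f"Round {number}: {tasks}")
```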
F. Session Recovery
Plan checkbox is the only durable state, so recovery = fresh start:
    ### [x] TODO 1: Config setup ← skip (complete)
    ### [ ] TODO 2: API implementation ← create sub-step Tasks
    ### [x] TODO 3: Utils ← skip (complete)
    ### [ ] TODO 4: Integration ← create sub-step Tasks
- •Parse checkboxes → only unchecked TODOs
- •Create sub-step Tasks for each unchecked TODO (Worker, Verify, Wrap-up, Commit)
- •Set intra-TODO chains + cross-TODO dependencies (only between unchecked)
- •Load outputs.json (variable substitution works if prior outputs saved)
- •Resume execution loop; dispatch picks up from where it left off
Why recovery is simple:
- •No need to worry about Task system state (always recreated from scratch)
- •Can see progress from Plan checkbox alone
- •Variable substitution works normally if outputs.json exists
- •Sub-step Tasks are ephemeral and recreated each session
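The first recovery step, finding the unchecked TODOs, can be sketched with a single regular expression over the plan text. The helper name `unchecked_todos` is hypothetical, and the pattern assumes headings follow the `### [x] TODO N: title` form exactly, with nothing else on the line.

```python
import re
from typing import List, Tuple

# Matches plan headings such as "### [ ] TODO 2: API implementation".
TODO_HEADING = re.compile(r"^### \[( |x)\] TODO (\d+): (.+)$", re.MULTILINE)


def unchecked_todos(plan_text: str) -> List[Tuple[int, str]]:
    """Return (number, title) for every TODO whose checkbox is still empty."""
    return [
        (int(number), title.strip())
        for mark, number, title in TODO_HEADING.findall(plan_text)
        if mark == " "
    ]
```

Sub-step Tasks would then be created only for the returned TODOs, which is why nothing from the previous session's Task system needs to survive.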
G. State & Task System
Plan checkbox = only source of truth. Task system = sub-step parallelization helper (recreated each session).
Task types: Init (1, PR only) + per-TODO sub-steps (up to 4 each) + Finalize (2-3):
| Sub-Step | Subject Pattern | Purpose |
|---|---|---|
| :State Begin | Init:State Begin | [PR only] Begin PR state |
| :Worker | {N}.1:Worker — {title} | Delegate implementation to worker agent |
| :Verify | {N}.2:Verify | Dispatch verify worker, triage & reconcile if FAILED |
| :Wrap-up | {N}.3:Wrap-up | Save context (Worker + Verify) + mark Plan [x] |
| :Commit | {N}.4:Commit | Commit via git-master (only if Commit Strategy row exists) |
| :Residual Commit | Finalize:Residual Commit | Check & commit remaining changes |
| :State Complete | Finalize:State Complete | [PR only] Complete PR state |
| :Report | Finalize:Report | Output final orchestration report |
Task tools:
| Tool | Role | When |
|---|---|---|
| TaskCreate | TODO → sub-step Tasks | Session start |
| TaskUpdate | Dependency (addBlocks) / completion | After create / after each sub-step |
| TaskList | Find runnable sub-steps | Every loop iteration |
| TaskGet | Query details | Before worker prompt |
Dependency types:
# Init (PR only):
Init:State Begin → all {N}.1:Worker tasks
# Intra-TODO chain (always):
{N}.1:Worker → {N}.2:Verify → {N}.3:Wrap-up → {N}.4:Commit
# Cross-TODO (from Dependency Graph):
1.4:Commit (or 1.3:Wrap-up) → 2.1:Worker
# Finalize chain:
all TODO last steps → Residual Commit → State Complete (PR) → Report
Usage: call TaskUpdate(status="in_progress") before dispatching a sub-step and TaskUpdate(status="completed") after it finishes. Both calls are used for every sub-step.
⚠️ Why in_progress matters: With parallel dispatch, TaskList().filter(status=="pending") is used to find runnable tasks. Without marking dispatched tasks as in_progress, the next loop iteration would re-dispatch them. Always mark in_progress BEFORE dispatching.
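A toy in-memory model shows the effect. This is only a sketch: the `Task` dataclass, `runnable`, and `dispatch_round` are hypothetical stand-ins for TaskList/TaskUpdate, and "not blocked" is assumed to mean that every blocker is completed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    id: int
    status: str = "pending"  # pending | in_progress | completed
    blocked_by: List[int] = field(default_factory=list)


def runnable(tasks: Dict[int, Task]) -> List[Task]:
    """Pending tasks whose blockers are all completed."""
    return [
        t for t in tasks.values()
        if t.status == "pending"
        and all(tasks[b].status == "completed" for b in t.blocked_by)
    ]


def dispatch_round(tasks: Dict[int, Task], dispatch: Callable[[Task], None]) -> List[int]:
    """Mark every runnable task in_progress BEFORE dispatching any of them."""
    batch = runnable(tasks)
    for task in batch:
        task.status = "in_progress"
    for task in batch:
        dispatch(task)
    return [task.id for task in batch]
```

Because the status changes before dispatch, a second call to `dispatch_round` while the workers are still running returns an empty batch instead of re-dispatching the same tasks.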
H. Mode Differences (PR vs Local)
| Item | Local Mode | PR Mode |
|---|---|---|
| Spec location | .dev/specs/{name}/PLAN.md | Parse from PR body |
| State management | Plan checkbox only | Plan checkbox + /state skill |
| History | Context files | Context + PR Comments |
| Block handling | Record in context, report to user | Skill("state", args="pause") |
| After completion | Per-TODO commits → Report | Commits + push → /state complete |
I. Checklist Before Stopping
1. Init Tasks (PR Mode Only):
- • Init:State Begin task created and completed?
- • Stopped immediately on failure?
2. Task Initialization:
- • Identified unchecked TODOs from Plan?
- • Created sub-step Tasks (Worker, Verify, Wrap-up, Commit) for each unchecked TODO?
- • Intra-TODO chains set (Worker→Verify→Wrap-up→Commit)?
- • Cross-TODO dependencies set from Dependency Graph?
3. Execution Phase:
- • No pending Tasks in TaskList?
- • TaskUpdate(status="completed") on each sub-step?
- • All :Worker tasks delegated to worker agent?
- • All :Verify tasks dispatched verify worker + triaged + reconciled if needed?
- • All :Wrap-up tasks saved context + marked Plan [x] + logged triage decisions?
- • All :Commit tasks delegated to git-master?
- • Pushed after each commit (PR mode)?
4. Finalize Tasks:
- • Finalize:Residual Commit task completed?
- • Finalize:State Complete task completed? (PR mode only)
- • Finalize:Report task completed?
- • All Finalize tasks dispatched through execution loop?
Exception Handling:
- • Skill("state", args="pause <PR#> <reason>") on block? (PR)
- • issues.md updated on block? (Local)
Continue working if any item is incomplete.