Autonomous Claude Code Execution
Execute tasks using Claude Code CLI in autonomous mode. Handles both single prompts and multi-step TODO lists, with iterative review, commit, and push after each completion.
This command is invoked FROM Codex CLI. All execution logic is self-contained here.
Boy Scout Principle
- •Each Claude run must leave the repository cleaner: remove AI cruft uncovered during execution, resolve lint/test warnings encountered, and tighten docs touched by the automation
- •When a TODO file or issue reveals unrelated debt, clean it before continuing so the next invocation starts from a healthier baseline
- •Record every cleanup in the reported evidence (tests, diffs, CI logs) to prove incremental improvement alongside task completion
OPERATING MODES
Mode 1: Single Prompt Execution
User provides an explicit task as a string argument.
Example: /autonomous "Add error handling to the parser module"
The command will:
- •Construct
claudeCLI invocation with prompt - •Execute Claude Code non-interactively
- •Capture and parse output
- •Review changes against CLAUDE.md criteria
- •Apply fixes if needed
- •Stage files explicitly, commit, and push
- •Return result to user
Mode 2: TODO List Execution
User provides path to markdown file or GitHub issue with actionable TODO items.
Example: /autonomous tasks/refactor-module.md or /autonomous 42
The command will:
- •Parse all TODO items from source
- •For each item (loop until complete):
- •Execute item via Claude Code CLI
- •Review and apply fixes
- •Update TODO document with evidence
- •Commit and push changes
- •Report progress (2-3 sentences)
- •Proceed immediately to next item
- •Final comprehensive report
REQUIRED INPUTS
TASK_INPUT (required): One of:
- •Explicit prompt string (quoted)
- •Path to markdown TODO file
- •GitHub issue number or URL
MISSION_CLASS (optional): implementation, review, refactor, etc.
INPUT DETECTION
Automatically detect input type:
- •Check if input is a file path (exists on disk) -> Mode 2
- •Check if input is numeric or GitHub issue URL -> Mode 2
- •Otherwise treat as explicit prompt string -> Mode 1
ARCHITECTURAL PRINCIPLES
- •Maintain strict separation of concerns: this command orchestrates workflows, Claude Code performs generation, and post-run reviews enforce standards
- •Keep each execution focused on a single responsibility; split mixed-scope TODO items into discrete runs before invoking the command
- •Share only the minimal context (task text, evidence, constraints) between command and agent layers to preserve weak coupling and make independent upgrades safe
EXECUTION PROCEDURE
Mode 1: Single Prompt Flow
- •
Prepare Context
- •Repository root: current working directory
- •Current branch:
git rev-parse --abbrev-ref HEAD - •Workspace status:
git status --short
- •
Execute Claude CLI
bashclaude --print --dangerously-skip-permissions "{{prompt}}"Capture stdout, stderr, and exit code
Note:
--printenables non-interactive mode,--dangerously-skip-permissionsbypasses all permission checks - •
Parse Results
- •Extract file changes from output
- •Identify test commands run
- •Collect evidence (test output, build logs)
- •
Review Against CLAUDE.md
- •Apply all review criteria (see section below)
- •Check for violations
- •Identify necessary fixes
- •
Apply Fixes
- •Implement corrections based on review
- •Re-run tests if fixes applied
- •Ensure all evidence captured
- •
Git Operations
- •Stage each modified file explicitly
- •Never use
git add .orgit add -A - •Commit with clear message
- •Push to remote
- •
Report
- •2-3 sentence summary
- •Evidence references
- •Files changed and committed
Mode 2: TODO List Flow
- •
Initialize
- •Load TODO source (file or GitHub issue via
gh issue view) - •Parse actionable items (checkboxes, numbered lists)
- •Enumerate in execution order
- •Verify document is well-formed
- •Load TODO source (file or GitHub issue via
- •
For Each TODO Item (loop until complete)
a. Prepare Item Context
- •Current item objective
- •Completed items (for context)
- •Remaining items (for planning)
- •Repository state (branch, workspace)
b. Execute via Claude CLI
bashclaude --print --dangerously-skip-permissions "{{item_text}} Context: - Completed: {{completed_items}} - Remaining: {{remaining_items}} - Constraints: All CLAUDE.md standards apply "c. Capture and Validate
- •Parse Claude output
- •Extract file changes
- •Collect test/build evidence
- •Verify success claims with proof
d. Review Against CLAUDE.md
- •Apply all mandatory checks (see criteria below)
- •Identify violations or needed fixes
- •Assess if changes meet standards
e. Apply Fixes
- •Implement necessary corrections
- •Re-run verification if fixes applied
- •Ensure full compliance
f. Update TODO Document
- •Mark item as complete with checkbox
- •Add evidence references (test output, file paths)
- •Note any observations or follow-ups
- •Save changes to file or update GitHub issue
g. Git Operations
- •Stage modified files explicitly
- •Stage updated TODO document
- •Commit with message: "{{item_summary}} (step N/M)"
- •Push immediately
h. Report Progress
- •"Step N/M complete: {{summary}}. {{evidence}}. Proceeding to step N+1."
- •Do NOT wait for user acknowledgment
- •Continue immediately to next item
- •
Completion
- •Verify all items marked complete
- •Confirm all evidence captured
- •Final report: "All N items complete. N commits pushed. {{summary}}."
CLAUDE CLI INVOCATION PATTERNS
Basic execution:
claude --print --dangerously-skip-permissions "{{prompt}}"
With output format (JSON):
claude --print --output-format json --dangerously-skip-permissions "{{prompt}}"
With streaming JSON output:
claude --print --output-format stream-json --dangerously-skip-permissions "{{prompt}}"
With permission mode (alternative to --dangerously-skip-permissions):
claude --print --permission-mode bypassPermissions "{{prompt}}"
With specific model (if needed):
claude --print --dangerously-skip-permissions --model sonnet "{{prompt}}"
Continue previous conversation:
claude --print --dangerously-skip-permissions --continue "{{prompt}}"
CLAUDE.MD REVIEW CRITERIA
After every execution (single or per TODO item), review ALL changes:
Non-Negotiables
- • No stubs, placeholders, commented-out code, or suppressions
- • Success claims backed by evidence (CI logs, test output, command results)
- • Files staged explicitly (verify git commands used)
- • Repo build/test scripts executed
Code Quality
- • Modules <500 lines (hard limit 1000)
- • Functions <50 lines (hard limit 100)
- • Follow repo's established style and conventions
- • Run repo's formatter if applicable
- • Files end with newline
Design Principles
- • Reuse-first: existing code modified/extended
- • Nearby duplication eliminated
- • No shallow wrappers or over-abstraction
- • Input validation present
- • No hardcoded secrets/keys/passwords
- • Structure-of-arrays over array-of-structures where applicable
- • Inner loops over leftmost (fastest-varying) index
- • Obsolete/dead code removed
Testing and Build
- • MANDATORY: 100% test pass rate - NO EXCEPTIONS - ALL tests MUST pass
- • CRITICAL: ALL regressions are YOUR fault - main branch ALWAYS has 100% passing tests
- • FORBIDDEN: Never assume tests were failing on main - ALWAYS 100% pass on main
- • Test duration <=120s each
- • Build commands execute successfully
- • Linter/formatter applied if repo requires it
Git/GitHub Discipline
- • Explicit file staging only (check for
git add .- forbidden!) - • No emojis in commits
- • Commit messages clear and descriptive
- • Changes pushed to remote
FIX APPLICATION WORKFLOW
When review identifies issues:
- •
Categorize Issues
- •Critical: violates non-negotiables (stubs, blanket git adds)
- •Major: violates hard limits (file size, function size)
- •Minor: style violations (formatting, naming)
- •
Apply Fixes
- •Critical: fix immediately, re-run full test suite
- •Major: fix and re-verify affected functionality
- •Minor: apply and spot-check
- •
Re-validate
- •Run tests again if logic changed
- •Re-check all failed criteria
- •Ensure no new issues introduced
- •
Document
- •Note fixes applied in commit message
- •Add to TODO update if in Mode 2
GIT OPERATIONS PROTOCOL
Always follow this exact sequence:
- •
Check Status
bashgit status --short
- •
Stage Files Explicitly
bash# CORRECT: git add src/module.f90 test/test_module.f90 tasks/refactor.md # FORBIDDEN: git add . git add -A git add --all
- •
Commit with HEREDOC
bashgit commit -m "$(cat <<'EOF' Brief summary of changes Details: - Specific change 1 - Specific change 2 Evidence: tests pass (fpm test output) EOF )"
- •
Push Immediately
bashgit push
AUTONOMOUS EXECUTION GUARANTEES
- •No pausing between TODO items
- •No user approval prompts
- •Only stop for unresolvable blockers
- •All changes reviewed and committed automatically
- •Continuous progress reporting
ERROR HANDLING
If Claude CLI execution fails:
- •
Capture Full Context
- •Exit code
- •Stdout and stderr
- •Git status before/after
- •Attempted command
- •
Assess Resolvability
- •Can be fixed by adjusting prompt?
- •Missing dependencies or files?
- •Constraint violation?
- •
Attempt Fix (if possible)
- •Modify approach and retry
- •Add missing context
- •Adjust constraints
- •
Escalate (if unresolvable)
- •Mark TODO item as blocked
- •Report to user with:
- •Objective
- •Error description
- •Evidence of failure
- •Attempted resolutions
RELATIONSHIP WITH /codex COMMAND
Key distinction:
- •
/codex: Runs FROM Claude Code, uses Codex CLI, delegates to codex agent - •
/autonomous: Runs FROM Codex CLI, uses Claude Code CLI, self-contained logic
Both commands support dual modes (single prompt / TODO list) with identical autonomous execution guarantees and review standards.