examples-auto-run

Name: examples-auto-run
Rating: 92
Author: openai

What it does

•Runs pnpm build && pnpm -r build-check first.
•Runs pnpm examples:start-all in auto-input mode (interactive prompts are auto-answered, HITL/MCP/apply-patch are auto-approved).
•Executes starts in parallel (default concurrency 4) and pipes each start’s stdout/stderr into its own log file under .tmp/examples-start-logs/.
•Provides start/stop/status/logs/tail helpers via run.sh.
•If the Codex session ends (no disown/nohup), the child processes receive SIGHUP and exit; stop is also available to clean up manually.
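
The parallel-start behavior described above can be pictured with a small sketch. This is not the runner's implementation (the real work happens in scripts/run-example-starts.mjs); it only illustrates capped concurrency with one log file per task. The function name run_parallel and the task names are hypothetical, and xargs -P is assumed to be available (it is in GNU and BSD xargs).

```bash
# Sketch only: run hypothetical tasks in parallel, at most $2 at a time,
# sending each task's stdout and stderr to its own log file under $1.
run_parallel() {
  local log_dir=$1 concurrency=$2
  shift 2
  mkdir -p "$log_dir"
  # One task name per line on stdin; -P caps the number of concurrent jobs.
  # Inside the child shell, $0 is the log directory and $1 is the task name.
  printf '%s\n' "$@" |
    xargs -P "$concurrency" -I{} sh -c '
      exec >"$0/$1.log" 2>&1
      echo "stdout from $1"
      echo "stderr from $1" >&2
    ' "$log_dir" {}
}
```

For example, `run_parallel .tmp/demo-logs 4 alpha beta gamma` would leave alpha.log, beta.log, and gamma.log in .tmp/demo-logs, each holding both streams of its task.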

Usage

bash

# Start (auto mode, concurrency=4 by default)
.codex/skills/examples-auto-run/scripts/run.sh start [extra args to examples:start-all]
# If you invoke the skill name alone ($examples-auto-run):
#   - when `.tmp/examples-rerun.txt` exists and is non-empty, it will run `rerun` automatically
#   - otherwise it runs the default `start` command.

# Examples:
.codex/skills/examples-auto-run/scripts/run.sh start --filter basic
.codex/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio

# Check status
.codex/skills/examples-auto-run/scripts/run.sh status

# Stop running job (kills pid from .tmp/examples-auto-run.pid)
.codex/skills/examples-auto-run/scripts/run.sh stop

# List logs (per start script)
.codex/skills/examples-auto-run/scripts/run.sh logs

# Tail latest log
.codex/skills/examples-auto-run/scripts/run.sh tail
.codex/skills/examples-auto-run/scripts/run.sh tail basic__start_hello-world.log

# After a run, build a rerun list from the latest main log (auto-skip list is imported from `scripts/run-example-starts.mjs` and server/audio/external skips are honored)
.codex/skills/examples-auto-run/scripts/run.sh collect
# Rerun only the entries in .tmp/examples-rerun.txt
.codex/skills/examples-auto-run/scripts/run.sh rerun
# Show the current auto-skip list (env or defaults)
.codex/skills/examples-auto-run/scripts/run.sh start --print-auto-skip --dry-run
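
The rule for a bare skill invocation (run rerun when .tmp/examples-rerun.txt exists and is non-empty, otherwise start) can be sketched as a single file test. The helper name choose_default_command is hypothetical and is not part of run.sh.

```bash
# Hypothetical helper: decide which subcommand a bare skill invocation would run.
choose_default_command() {
  local rerun_file=${1:-.tmp/examples-rerun.txt}
  # -s is true only when the file exists and has a size greater than zero.
  if [ -s "$rerun_file" ]; then
    echo rerun
  else
    echo start
  fi
}
```

A missing file and an empty file both fall through to start, which matches the "exists and is non-empty" wording.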

Defaults (overridable via env)

•EXAMPLES_INTERACTIVE_MODE=auto
•AUTO_APPROVE_MCP=1, APPLY_PATCH_AUTO_APPROVE=1, AUTO_APPROVE_HITL=1 (set in runner)
•EXAMPLES_CONCURRENCY=4
•EXAMPLES_EXECA_TIMEOUT_MS=300000 (5m); financial-research-agent and computer-use use 10m inside the script.
•Includes interactive; excludes server/audio/external by default:
- EXAMPLES_INCLUDE_INTERACTIVE=1
- EXAMPLES_INCLUDE_SERVER=0
- EXAMPLES_INCLUDE_AUDIO=0
- EXAMPLES_INCLUDE_EXTERNAL=0
- This means realtime-* / nextjs (tagged as server/audio) are skipped unless you opt in with --include-server / --include-audio or the corresponding env flags.
•Auto-skip list: EXAMPLES_AUTO_SKIP (comma/space separated) overrides the built-in defaults used by both run.sh and run-example-starts.mjs. Defaults include agent-patterns:start:llm-as-a-judge, agent-patterns:start:routing, customer-service:start, connectors:start, mcp:start:hosted-mcp-on-approval, mcp:start:hosted-mcp-human-in-the-loop.
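
As a rough illustration of "overridable via env" and of a comma/space separated EXAMPLES_AUTO_SKIP, the sketch below applies a few of the documented defaults and splits a skip list into entries. The helpers parse_auto_skip and is_auto_skipped are hypothetical, and treating an empty variable the same as an unset one is an assumption, not something the runner is confirmed to do.

```bash
# Apply documented defaults only when the variable is unset or empty
# (assumption: the real scripts may treat an empty value differently).
: "${EXAMPLES_INTERACTIVE_MODE:=auto}"
: "${EXAMPLES_CONCURRENCY:=4}"
: "${EXAMPLES_EXECA_TIMEOUT_MS:=300000}"

# Hypothetical helper: split a comma/space separated list into one entry per line.
parse_auto_skip() {
  # Turn commas and spaces into newlines, then drop the empty lines left behind.
  printf '%s\n' "$1" | tr ', ' '\n\n' | sed '/^$/d'
}

# Hypothetical helper: succeed when entry $1 appears in skip list $2.
is_auto_skipped() {
  # -F literal match, -x whole line, -q quiet.
  parse_auto_skip "$2" | grep -Fxq -- "$1"
}
```

With this sketch, `is_auto_skipped connectors:start "customer-service:start, connectors:start"` succeeds, while an entry that is not in the list fails.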

Cancellation / cleanup

•Jobs are backgrounded but not disowned; if Codex suspends/ends the shell, the process group gets SIGHUP and stops.
•Manual cleanup: run.sh stop (removes stale pid if already exited).
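
The stop behavior (kill the pid recorded in .tmp/examples-auto-run.pid, or remove the file if that process has already exited) might look roughly like the following. This is a sketch under assumptions: the function name stop_job and its messages are hypothetical, and the real run.sh may signal the process differently.

```bash
# Hypothetical sketch of pid-file cleanup; the real run.sh may differ.
stop_job() {
  local pid_file=${1:-.tmp/examples-auto-run.pid}
  if [ ! -f "$pid_file" ]; then
    echo "no pid file"
    return 0
  fi
  local pid
  pid=$(cat "$pid_file")
  # kill -0 sends no signal; it only tests whether the process can be signalled.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    kill "$pid"
    echo "stopped $pid"
  else
    echo "stale pid file removed"
  fi
  rm -f "$pid_file"
}
```

Removing the pid file in both branches keeps a later status or start call from acting on a process that no longer exists.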

Log locations

•.tmp/examples-start-logs/<package>__<script>.log (per start)
•Main runner log path is printed when start is invoked.
•Rerun list (generated by collect): .tmp/examples-rerun.txt (one package:script per line).
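
To find the log for a given rerun entry, the <package>__<script>.log pattern can be derived from a package:script string. The mapping below is inferred from the basic__start_hello-world.log example in the usage section: it assumes the package is everything before the first colon and that colons inside the script name become underscores. The helper name log_name_for is hypothetical, and the runner's actual naming code may differ.

```bash
# Hypothetical helper: map a package:script entry to its per-start log file name.
log_name_for() {
  local entry=$1
  local package=${entry%%:*}   # text before the first colon
  local script=${entry#*:}     # text after the first colon
  # Assumption: colons in the script name are written as underscores.
  printf '%s__%s.log\n' "$package" "${script//:/_}"
}
```

Under these assumptions, `log_name_for basic:start:hello-world` prints basic__start_hello-world.log, which can then be passed to run.sh tail.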

Notes

•Auto-skip is centralized (same defaults as above) and can be overridden via EXAMPLES_AUTO_SKIP. Auto-skip entries are excluded from rerun collection and will be removed from rerun execution automatically.
•Auto-input map covers common interactive prompts; HITL/MCP/apply-patch auto-approve via env is enabled by the runner.
•Shell tool approvals are auto-approved in auto mode (SHELL_AUTO_APPROVE=1).
•rerun runs entries sequentially, continues after failures, and rewrites .tmp/examples-rerun.txt with only the remaining failures. Auto-skip entries are not re-added.
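
The rerun semantics (sequential, continue after failures, rewrite the list with only the remaining failures, never re-add auto-skip entries) can be sketched as a loop over the list file. Everything here is illustrative: rerun_entries is a hypothetical name, the skip list is assumed to be comma separated with no spaces, and the command that runs one entry is passed in by the caller rather than being the real runner.

```bash
# Hypothetical sketch: rerun each entry in order and keep only the ones that still fail.
# Usage: rerun_entries <list-file> <comma-separated-skip-list> <command...>
rerun_entries() {
  local list_file=$1 skip_list=$2
  shift 2                      # remaining args: the command used to run one entry
  local remaining entry
  remaining=$(mktemp)
  while IFS= read -r entry || [ -n "$entry" ]; do
    [ -z "$entry" ] && continue
    # Auto-skip entries are dropped here and never written back.
    case ",$skip_list," in *",$entry,"*) continue ;; esac
    # A failure is recorded but does not stop the loop.
    "$@" "$entry" </dev/null || printf '%s\n' "$entry" >>"$remaining"
  done <"$list_file"
  # Replace the list with whatever still fails (possibly nothing).
  mv "$remaining" "$list_file"
}
```

Because the list shrinks on every pass, repeating the call until the file is empty gives a simple "retry until clean" loop, and an empty file makes a bare skill invocation fall back to start.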
•Behavioral validation is not done in the runner, so Codex must immediately perform it after every start or rerun invocation without waiting for the user to ask. Required steps:
1. Read the example source to infer intended flow from code/comments (tools invoked, expected outputs, guards, approvals).
2. Read the matching log under .tmp/examples-start-logs/.
3. Compare intent vs. log: confirm key actions/results happened; flag omissions or divergences.
4. Do this for all exit-0 entries, not just samples.
5. Summarize findings right after the run completes; when “OK”, note what was checked (e.g., “tools called + final message emitted”).
6. When reporting, do not omit or ellipsize outputs that justify the validation; include the full relevant lines (keep it concise but untruncated).
•The runner prints a full table after the summary: one row per start script with status, package:script, info (reason/exit/skipped), and the log path. If the run stops before the table appears, point the analyzer at the latest main_*.log to reconstruct a table and validations.
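
Part of the intent-versus-log comparison can be mechanized once the expected markers have been read out of the example source. The sketch below is a hypothetical aid and not a substitute for reading the code and the full log: missing_markers is an invented name, and the marker strings are whatever the reviewer decides a correct run must have printed.

```bash
# Hypothetical helper: report which expected markers are absent from a log.
# Usage: missing_markers <log-file> <marker>...
missing_markers() {
  local log_file=$1
  shift
  local marker missing=0
  for marker in "$@"; do
    # -F matches the marker literally; -q suppresses grep's own output.
    if ! grep -Fq -- "$marker" "$log_file"; then
      printf 'MISSING: %s\n' "$marker"
      missing=1
    fi
  done
  # Exit status 0 means every marker was found.
  return "$missing"
}
```

A silent, successful return only shows that the markers appear somewhere in the log; the report should still quote the full relevant lines that justify the verdict.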