Purpose

Follow a project's testing runbook and execute its commands safely.

This skill offloads repetitive testing steps (often run remotely over SSH) to a cheaper model, keeping the main session focused on design and implementation.

Runbook discovery

Find the runbook in this priority order:

• If invoked with an argument, treat it as the runbook path.
• Otherwise, read CLAUDE.md in the project root and look for a line of the form: Test runbook: <path>
• If the path is still unclear, ask the user for it.
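
The priority order above can be sketched roughly as follows. This is only an illustration: the function name find_runbook is hypothetical, and two details are assumptions not stated in this document, namely that a relative path is resolved against the project root and that surrounding backticks are stripped from the path.

```python
import re
from pathlib import Path
from typing import Optional

# Line format taken from the text above: "Test runbook: <path>"
RUNBOOK_LINE = re.compile(r"^\s*Test runbook:\s*(?P<path>\S.*?)\s*$", re.MULTILINE)


def find_runbook(argument: Optional[str], project_root: Path) -> Optional[Path]:
    """Return the runbook path, or None when the user has to be asked."""
    # 1. An explicit argument always wins.
    if argument:
        return Path(argument)
    # 2. Otherwise look for a "Test runbook: <path>" line in CLAUDE.md.
    claude_md = project_root / "CLAUDE.md"
    if claude_md.is_file():
        match = RUNBOOK_LINE.search(claude_md.read_text(encoding="utf-8"))
        if match:
            # Assumption: strip Markdown backticks and resolve against the root.
            return project_root / match.group("path").strip("`")
    # 3. Nothing found: the caller should ask the user for the path.
    return None
```

A None result corresponds to the third bullet: nothing is guessed, and the question goes back to the user.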

What counts as an executable step

Only execute commands found in fenced code blocks.

• Prefer blocks fenced as bash (or sh).
• Do not execute prose steps.
• Do not execute non-shell code blocks.
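
As a minimal sketch of this rule, the following line-based parser keeps only blocks whose fence is labelled bash or sh and ignores prose and other languages. The name extract_shell_blocks is hypothetical, and skipping unlabelled blocks is an assumption (the text above only says to prefer bash or sh).

```python
from typing import List, Optional

FENCE = "`" * 3  # a Markdown code fence
SHELL_LANGS = {"bash", "sh"}


def extract_shell_blocks(markdown: str) -> List[str]:
    """Return the bodies of fenced blocks labelled bash or sh, in document order."""
    blocks: List[str] = []
    current: List[str] = []
    lang: Optional[str] = None  # None means "not inside a fenced block"
    for line in markdown.splitlines():
        stripped = line.strip()
        if lang is None:
            if stripped.startswith(FENCE):
                # Opening fence: the first word of the info string is the language.
                info = stripped[len(FENCE):].strip().split()
                lang = info[0].lower() if info else ""
                current = []
        elif stripped == FENCE:
            # Closing fence: keep the block only if it was labelled as shell.
            if lang in SHELL_LANGS:
                blocks.append("\n".join(current))
            lang = None
        else:
            current.append(line)
    # An unterminated block is dropped rather than guessed at.
    return blocks
```

Prose such as "then run make deploy" never reaches the result, because only lines between a shell-labelled opening fence and its closing fence are collected.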

If a runbook includes "recommended" commands that require adaptation (placeholders such as <host>, $VAR, or ..., or comments such as "edit as needed"), ask a clarifying question before running them.
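
One way to spot such commands is a small pattern check like the sketch below. The patterns and the name needs_clarification are illustrative assumptions, and the check is deliberately conservative: it also flags variables that are defined earlier in the same block, on the view that an unnecessary question is cheaper than a wrong command.

```python
import re

# Illustrative patterns for the placeholder styles named above.
PLACEHOLDER_PATTERNS = [
    re.compile(r"<[A-Za-z][\w-]*>"),                  # <host>
    re.compile(r"\$\{?[A-Za-z_]\w*\}?"),              # $VAR or ${VAR}
    re.compile(r"\.\.\."),                            # literal ellipsis
    re.compile(r"#.*edit as needed", re.IGNORECASE),  # adaptation comment
]


def needs_clarification(block: str) -> bool:
    """Return True if the command block appears to need adapting before it runs."""
    return any(pattern.search(block) for pattern in PLACEHOLDER_PATTERNS)
```

For example, needs_clarification("ssh <host> uptime") is true, while needs_clarification("make test") is false.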

Safety rules (non-negotiable)

• Read-only posture: do not edit project files as part of this skill.
• Do not run destructive commands that could render the host/VM unusable. If a runbook command looks destructive or high-risk, stop and ask.
• Only run commands explicitly present in executable fenced blocks.
• If a runbook requires secrets, ask for them; do not guess or fabricate.

Examples of commands to treat as destructive/high-risk:

• disk/filesystem: dd, mkfs, newfs, fsck -y, disklabel, gpt destroy, zpool destroy, wipefs
• system power: shutdown, reboot, halt, poweroff, init 0, init 6
• mass deletion: rm -rf /, or rm -rf with ambiguous paths
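
A rough screening pass over the examples above might look like this. It is a sketch, not an exhaustive filter: the patterns and the name risk_categories are hypothetical, any rm with both recursive and force short flags is flagged because judging whether a path is "ambiguous" is left to the follow-up question, and a match means "stop and ask", never "safe to skip".

```python
import re
from typing import List

# Patterns mirror the example list above; they are illustrative, not complete.
RISKY_PATTERNS = {
    "disk/filesystem": re.compile(
        r"\b(dd|mkfs(\.\w+)?|newfs|disklabel|wipefs)\b"
        r"|\bfsck\b.*\s-\w*y\b"
        r"|\b(gpt|zpool)\s+destroy\b"
    ),
    "system power": re.compile(r"\b(shutdown|reboot|halt|poweroff)\b|\binit\s+[06]\b"),
    # rm with both a recursive and a force short flag (-rf, -fr, -r -f, ...).
    "mass deletion": re.compile(r"\brm\b(?=.*\s-\w*[rR])(?=.*\s-\w*f)"),
}


def risk_categories(block: str) -> List[str]:
    """Return the risk categories matched by any line of the command block."""
    found: List[str] = []
    for line in block.splitlines():
        for label, pattern in RISKY_PATTERNS.items():
            if label not in found and pattern.search(line):
                found.append(label)
    return found
```

An empty result does not prove a command is harmless; it only means none of the listed examples matched.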

Execution workflow

• Identify the runbook and summarize what you will run.
• Extract executable command blocks in order.
• Execute commands step-by-step.
• Stop at the first failure unless the runbook explicitly labels a step as non-fatal.
• Report results in a detailed per-step format.
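
The execute-and-stop part of this workflow can be sketched as below. The Step and StepResult types, the non_fatal flag, and the use of /bin/sh are assumptions made for illustration; how a runbook actually labels a non-fatal step is not specified here.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    command: str
    non_fatal: bool = False  # hypothetical flag for steps labelled non-fatal


@dataclass
class StepResult:
    command: str
    exit_code: int
    output: str


def run_steps(steps: List[Step], shell: str = "/bin/sh") -> List[StepResult]:
    """Run steps in order and stop at the first fatal failure."""
    results: List[StepResult] = []
    for step in steps:
        # Capture stdout and stderr together so the report keeps their ordering.
        proc = subprocess.run(
            [shell, "-c", step.command],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        results.append(StepResult(step.command, proc.returncode, proc.stdout))
        if proc.returncode != 0 and not step.non_fatal:
            break  # later steps are intentionally not executed
    return results
```

Steps after the first fatal failure never appear in the result list, which makes "first failing step" easy to report.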

Output format

For each executed step, report:

• Step name (if present) and where it came from (runbook section heading).
• The exact command executed.
• Exit code.
• Key output (error lines and a small amount of surrounding context).

At the end, provide:

• Overall status: pass/fail
• First failing step (if any)
• Recommended next diagnostic command(s) (only if safe)
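
To make the format concrete, here is a minimal sketch of a report builder. Everything named in it (StepResult, key_output, format_report, the error keywords, the two-line context window) is an assumption for illustration, it treats every non-zero exit code as a failure, and it leaves out the recommended diagnostic commands, which need judgment rather than code.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class StepResult:
    name: str      # step name, or "" if the runbook gives none
    section: str   # runbook section heading the step came from
    command: str
    exit_code: int
    output: str


# Assumed keywords that mark a line as an error line.
ERROR_HINT = re.compile(r"error|fail|fatal|traceback", re.IGNORECASE)


def key_output(output: str, context: int = 2, max_tail: int = 5) -> List[str]:
    """Return error lines plus `context` lines around each, else the last lines."""
    lines = output.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_HINT.search(line):
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    if not keep:
        return lines[-max_tail:]  # no obvious error lines: fall back to the tail
    return [lines[i] for i in sorted(keep)]


def format_report(results: List[StepResult]) -> str:
    """Render the per-step details followed by the overall summary."""
    parts: List[str] = []
    for r in results:
        parts.append(f"Step: {r.name or '(unnamed)'} [{r.section}]")
        parts.append(f"  Command: {r.command}")
        parts.append(f"  Exit code: {r.exit_code}")
        parts.extend(f"  | {line}" for line in key_output(r.output))
    failed = next((r for r in results if r.exit_code != 0), None)
    parts.append(f"Overall status: {'fail' if failed else 'pass'}")
    if failed:
        parts.append(f"First failing step: {failed.name or failed.command}")
    return "\n".join(parts)
```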