Intent
Execute code or commands in a controlled manner, capturing output for verification. Unlike mutate, this capability is for operations that don't permanently change state (tests, queries, builds, analysis tools).
Success criteria:
- •Code/command executed successfully
- •Output captured completely
- •Exit code recorded
- •Errors properly surfaced
Compatible schemas:
- •schemas/output_schema.yaml
Inputs
| Parameter | Required | Type | Description |
|---|---|---|---|
| code | Yes | string | Code or command to execute |
| language | No | string | Programming language or shell (bash, python, ruby, etc.) |
| timeout | No | string | Maximum execution time (default: "60s") |
| environment | No | object | Environment variables to set |
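As an illustration of how these inputs might be validated before anything runs, here is a minimal Python sketch. The names `ExecuteRequest` and `parse_timeout` are hypothetical, and support for units other than seconds (`ms`, `m`, `h`) is an assumption; the table only shows values such as "60s".

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed unit suffixes; the spec itself only shows seconds ("60s").
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_timeout(value: str = "60s") -> float:
    """Convert a duration string such as "60s" into seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized timeout: {value!r}")
    seconds = float(match.group(1)) * _UNITS[match.group(2)]
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return seconds


@dataclass
class ExecuteRequest:
    """Hypothetical container mirroring the four input parameters."""
    code: str
    language: Optional[str] = None
    timeout: str = "60s"
    environment: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.code.strip():
            raise ValueError("code is required")
        parse_timeout(self.timeout)  # fail early on malformed values
```

Validating the timeout at construction time means a malformed value is rejected before any process is started.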
Procedure
- •Validate execution request: Ensure code is safe to run
  - •Check for mutation operations (if found, suggest mutate instead)
  - •Verify timeout is reasonable
  - •Confirm execution environment
- •Prepare environment: Set up execution context
  - •Set required environment variables
  - •Ensure dependencies are available
  - •Create isolated context if needed
- •Execute code: Run the code/command
  - •Capture stdout and stderr
  - •Record start time
  - •Monitor for timeout
- •Capture results: Collect execution output
  - •Record exit code
  - •Capture complete stdout
  - •Capture complete stderr
  - •Note execution duration
- •Analyze output: Interpret results
  - •Identify success/failure from exit code
  - •Extract key information from output
  - •Note warnings or anomalies
- •Return results: Structure output for consumption
  - •Include all captured data
  - •Provide execution summary
  - •Reference evidence for assertions
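The execute and capture steps above can be sketched with Python's standard `subprocess` module. This is one possible shape, not a reference implementation: the function name `run_command` is hypothetical, the command is passed as an argument list (so no shell interpretation takes place), and exit code 124 on timeout is an assumption borrowed from the convention of GNU `timeout`.

```python
import os
import subprocess
import sys
import time
from datetime import datetime, timezone
from typing import Dict, List, Optional


def _as_text(data) -> str:
    """TimeoutExpired may carry bytes or None; normalize to str."""
    if data is None:
        return ""
    return data.decode(errors="replace") if isinstance(data, bytes) else data


def run_command(argv: List[str], timeout_s: float = 60.0,
                environment: Optional[Dict[str, str]] = None,
                language: Optional[str] = None) -> dict:
    """Run argv, capturing stdout, stderr, exit code, and timing."""
    env = {**os.environ, **(environment or {})}  # add requested variables
    started_at = datetime.now(timezone.utc)      # record start time
    t0 = time.monotonic()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s, env=env)
        exit_code, stdout, stderr = proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired as exc:
        exit_code = 124  # assumed convention, see lead-in
        stdout = _as_text(exc.stdout)
        stderr = _as_text(exc.stderr) + f"\n[timed out after {timeout_s}s]"
    duration = time.monotonic() - t0
    completed_at = datetime.now(timezone.utc)
    return {
        "result": {
            "success": exit_code == 0,
            "exit_code": exit_code,
            "stdout": stdout,
            "stderr": stderr,
            "duration": f"{duration:.2f}s",
        },
        "execution": {
            "command": " ".join(argv),
            "language": language,
            "started_at": started_at.isoformat(),
            "completed_at": completed_at.isoformat(),
        },
    }


# Usage: run a short one-liner with the current interpreter.
demo = run_command([sys.executable, "-c", "print('hello')"], timeout_s=10)
print(demo["result"]["exit_code"], demo["result"]["stdout"].strip())
```

A shell string such as the ones in the examples could be turned into an argument list with `shlex.split`. Commands that rely on shell features (pipes, redirects) would need a different approach, which this sketch does not cover.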
Output Contract
Return a structured object:
result:
  success: boolean            # Exit code == 0
  exit_code: integer          # Process exit code
  stdout: string              # Standard output
  stderr: string              # Standard error
  duration: string            # Execution time
execution:
  command: string             # What was executed
  language: string            # Execution environment
  started_at: string          # ISO timestamp
  completed_at: string        # ISO timestamp
analysis:
  summary: string             # One-line result summary
  warnings: array[string]     # Notable warnings
  errors: array[string]       # Extracted error messages
evidence_anchors: ["command:output"]
Field Definitions
| Field | Type | Description |
|---|---|---|
| result.success | boolean | Whether execution succeeded |
| result.exit_code | integer | Process exit code |
| result.stdout | string | Standard output |
| result.stderr | string | Standard error |
| result.duration | string | How long execution took |
| execution | object | Execution metadata |
| analysis | object | Interpreted results |
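The `analysis` object could be derived from the captured streams in many ways. The sketch below shows one simple heuristic, assuming that error lines start with a name ending in `Error` or `Exception` and that warning lines contain the word "warn" or "warning". The function name `analyze` and both patterns are illustrative assumptions, not part of the contract.

```python
import re
from typing import Dict, List

# Heuristic patterns (assumptions): tune these for the tools actually in use.
_ERROR_LINE = re.compile(r"^\s*[\w.]*(?:Error|Exception)\b.*")
_WARNING_LINE = re.compile(r"\bwarn(?:ing)?\b", re.IGNORECASE)


def analyze(exit_code: int, stdout: str, stderr: str) -> Dict[str, object]:
    """Build the analysis section from an exit code and captured output."""
    lines: List[str] = (stdout + "\n" + stderr).splitlines()
    errors = [ln.strip() for ln in lines if _ERROR_LINE.match(ln)]
    warnings = [ln.strip() for ln in lines if _WARNING_LINE.search(ln)]
    if exit_code == 0:
        summary = "Command succeeded"
    elif errors:
        summary = errors[-1]  # the last error line is usually the most specific
    else:
        summary = f"Command failed with exit code {exit_code}"
    return {"summary": summary, "warnings": warnings, "errors": errors}
```

Success is decided by the exit code alone, matching the contract. The extracted lines only enrich the summary and never override that decision.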
Examples
Example 1: Run Tests
Input:
code: "npm test -- --grep 'UserService'"
language: "bash"
timeout: "120s"
Output:
result:
  success: true
  exit_code: 0
  stdout: |
    > project@1.0.0 test
    > jest --grep 'UserService'
    PASS src/services/__tests__/UserService.test.ts
      UserService
        ✓ creates user with valid data (45ms)
        ✓ validates email format (12ms)
        ✓ hashes password on save (23ms)
    Test Suites: 1 passed, 1 total
    Tests: 3 passed, 3 total
  stderr: ""
  duration: "2.3s"
execution:
  command: "npm test -- --grep 'UserService'"
  language: "bash"
  started_at: "2024-01-15T10:30:00Z"
  completed_at: "2024-01-15T10:30:02Z"
analysis:
  summary: "3 tests passed in UserService"
  warnings: []
  errors: []
evidence_anchors:
  - "command:npm test:output"
Example 2: Execute Query
Input:
code: "SELECT COUNT(*) as user_count FROM users WHERE created_at > '2024-01-01'"
language: "sql"
environment:
  DATABASE_URL: "${DATABASE_URL}"
Output:
result:
  success: true
  exit_code: 0
  stdout: |
    user_count
    ----------
    15423
    (1 row)
  stderr: ""
  duration: "0.15s"
execution:
  command: "psql -c \"SELECT COUNT(*)...\""
  language: "sql"
  started_at: "2024-01-15T10:35:00Z"
  completed_at: "2024-01-15T10:35:00Z"
analysis:
  summary: "Query returned 15423 users created in 2024"
  warnings: []
  errors: []
evidence_anchors:
  - "command:psql:query_result"
Example 3: Execution Failure
Input:
code: "python -c 'import nonexistent_module'"
language: "bash"
Output:
result:
  success: false
  exit_code: 1
  stdout: ""
  stderr: |
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'nonexistent_module'
  duration: "0.05s"
execution:
  command: "python -c 'import nonexistent_module'"
  language: "bash"
  started_at: "2024-01-15T10:40:00Z"
  completed_at: "2024-01-15T10:40:00Z"
analysis:
  summary: "Import failed: module 'nonexistent_module' not found"
  warnings: []
  errors:
    - "ModuleNotFoundError: No module named 'nonexistent_module'"
evidence_anchors:
  - "command:python:error"
Verification
- • Exit code is captured
- • Stdout and stderr are both recorded
- • Duration is reasonable
- • Analysis summary matches output
- • No mutations occurred
Verification tools: Bash (to verify command execution environment)
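The mechanical items on this checklist can be checked in code. The sketch below, with the hypothetical name `verify_result`, treats a "reasonable" duration as one that does not exceed the timeout, which is an assumption. It does not attempt the two judgment-based checks (summary matches output, no mutations occurred).

```python
from typing import List


def verify_result(output: dict, timeout_s: float = 60.0) -> List[str]:
    """Return the failed checks; an empty list means every check passed."""
    problems: List[str] = []
    result = output.get("result", {})

    if not isinstance(result.get("exit_code"), int):
        problems.append("exit code missing")

    for stream in ("stdout", "stderr"):
        if not isinstance(result.get(stream), str):
            problems.append(f"{stream} not recorded")

    # success must agree with the exit code (success means exit code 0)
    if result.get("success") != (result.get("exit_code") == 0):
        problems.append("success flag disagrees with exit code")

    # duration is a string such as "2.3s" in the examples
    try:
        seconds = float(str(result.get("duration", "")).rstrip("s"))
    except ValueError:
        problems.append("duration unparseable")
    else:
        if seconds > timeout_s:
            problems.append("duration exceeds timeout")

    return problems
```

Returning a list of problems rather than a single boolean keeps each failed check visible to whatever consumes the verification outcome.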
Safety Constraints
- •mutation: false
- •requires_checkpoint: false
- •requires_approval: true
- •risk: medium
Capability-specific rules:
- •Verify code does not mutate persistent state
- •Enforce timeout limits
- •Capture all output for audit trail
- •Do not execute code that requires elevated privileges without approval
- •Prefer mutate for state-changing operations
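As a rough illustration of the first rule, a pre-execution screen might flag obviously state-changing or privileged commands so they can be routed to mutate or to an approval step. The pattern list and the name `find_mutation_risks` below are hypothetical and deliberately small. A keyword scan like this produces both false positives and false negatives, so it complements sandboxing and approval rather than replacing them.

```python
import re
from typing import List

# Illustrative patterns only (assumptions); a real policy would be far richer.
_MUTATING_PATTERNS = {
    "file deletion": re.compile(r"\brm\s+-"),
    "sql write": re.compile(
        r"\b(?:INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE)\b", re.IGNORECASE
    ),
    "git push": re.compile(r"\bgit\s+push\b"),
    "privilege escalation": re.compile(r"\bsudo\b"),
}


def find_mutation_risks(code: str) -> List[str]:
    """Return the labels of every pattern found in the code string."""
    return [
        label
        for label, pattern in _MUTATING_PATTERNS.items()
        if pattern.search(code)
    ]


# A read-only query passes; a destructive one is flagged.
print(find_mutation_risks("SELECT COUNT(*) FROM users"))
print(find_mutation_risks("DELETE FROM users"))
```

An empty list means only that none of the listed patterns matched, not that the code is safe.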
Composition Patterns
Commonly follows:
- •plan - Execute planned actions
- •generate - Execute generated code
- •transform - Execute transformation scripts
Commonly precedes:
- •verify - Verify execution results
- •detect - Detect patterns in output
- •audit - Record execution for audit
Anti-patterns:
- •Never use execute for persistent state changes (use mutate)
- •Avoid execute without timeout limits
- •Never execute untrusted code without sandboxing
Workflow references:
- •See reference/workflow_catalog.yaml#debug_code_change for execute in testing
- •See reference/workflow_catalog.yaml#digital_twin_sync_loop for execute in sync loops