Test Scenario Skill
Description
Executes a sandbox scenario to test the agent's diagnostic capabilities and validates the output against the expected diagnosis.
Usage
/test-scenario <scenario-name>
Example: /test-scenario memory-leak
What This Skill Does
- Loads scenario from sandbox/scenarios/<scenario-name>.scenario.ts
- Injects mock tool responses
- Triggers Orchestrator agent with incident report
- Captures agent's diagnostic output
- Validates:
  - Root cause includes expected keywords
  - Confidence meets minimum threshold
  - Remediation suggestions align with expectations
- Reports pass/fail with detailed comparison
- If failed: suggests which agent prompts or tools to adjust
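The three validation checks listed above could be sketched roughly as follows. This is an illustration only, not the skill's actual implementation: the `Expectation`, `Diagnosis`, and `CheckResult` shapes and the `validateDiagnosis` function are hypothetical names, and the pass rules (at least one keyword found, every suggestion matching some expected strategy) are assumptions, since this document does not define them.

```typescript
// Hypothetical shapes: the real scenario and diagnosis types are not shown in this document.
interface Expectation {
  rootCauseKeywords: string[];
  minConfidence: number;
  remediationStrategies: string[];
}

interface Diagnosis {
  rootCause: string;
  confidence: number;
  remediation: string[];
}

interface CheckResult {
  name: string;
  passed: boolean;
  detail: string;
}

function validateDiagnosis(diagnosis: Diagnosis, expected: Expectation): CheckResult[] {
  // Case-insensitive substring search for each expected keyword in the root cause text.
  const rootCause = diagnosis.rootCause.toLowerCase();
  const found = expected.rootCauseKeywords.filter((k) =>
    rootCause.includes(k.toLowerCase()),
  );

  // Count suggestions that mention at least one expected strategy.
  const matched = diagnosis.remediation.filter((suggestion) =>
    expected.remediationStrategies.some((s) =>
      suggestion.toLowerCase().includes(s.toLowerCase()),
    ),
  );

  return [
    {
      name: "Root cause match",
      // Assumption: a single matching keyword is enough to pass.
      passed: found.length > 0,
      detail: `Found keywords [${found.join(", ")}]`,
    },
    {
      name: "Confidence threshold",
      passed: diagnosis.confidence >= expected.minConfidence,
      detail: `${diagnosis.confidence} ${
        diagnosis.confidence >= expected.minConfidence ? ">=" : "<"
      } ${expected.minConfidence}`,
    },
    {
      name: "Remediation alignment",
      // Assumption: every suggestion must match some expected strategy.
      passed:
        diagnosis.remediation.length > 0 &&
        matched.length === diagnosis.remediation.length,
      detail: `${matched.length}/${diagnosis.remediation.length} suggestions match expected strategies`,
    },
  ];
}
```

Returning one result per check, rather than a single boolean, keeps enough detail to report which expectation failed and to suggest where prompts or tools may need adjusting.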
Validation Output Format
Scenario: memory-leak
Status: PASSED ✅
Validation Results:
✅ Root cause match: Found keywords [OOMKilled, memory, heap]
✅ Confidence threshold: 0.89 >= 0.85
✅ Remediation alignment: 3/3 suggestions match expected strategies
Agent Diagnosis:
Root Cause: Memory leak in payment-service causing repeated OOMKills
Confidence: 0.89
Evidence:
- OutOfMemoryError in logs (47 occurrences)
- Memory usage trending upward to 98% of limit
- Pod restarted 5 times in last hour
Remediation:
- Rollback to v1.5.1 (stable version)
- Increase memory limit to 2Gi temporarily
- Profile memory usage to identify leak
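As a rough sketch of how the header and "Validation Results" lines of this report might be rendered from individual check results: the `CheckResult` shape and `formatReport` function are hypothetical, and the `FAILED ❌` wording is an assumption, since only a passing report is shown here.

```typescript
// Hypothetical result shape for a single validation check.
interface CheckResult {
  name: string;
  passed: boolean;
  detail: string;
}

function formatReport(scenario: string, checks: CheckResult[]): string {
  // A run with no checks is treated as a failure rather than a vacuous pass.
  const allPassed = checks.length > 0 && checks.every((c) => c.passed);
  const lines = [
    `Scenario: ${scenario}`,
    // Assumption: the failure wording mirrors the passing one.
    `Status: ${allPassed ? "PASSED ✅" : "FAILED ❌"}`,
    "Validation Results:",
    ...checks.map((c) => `${c.passed ? "✅" : "❌"} ${c.name}: ${c.detail}`),
  ];
  return lines.join("\n");
}
```

Calling `formatReport("memory-leak", checks)` with three passing checks would yield the first lines of the sample above; the "Agent Diagnosis" section would be appended separately from the agent's own output.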