Test Scenario Skill

Name: Test Scenario
Rating: 87
Author: darenhart

Description

Executes a sandbox scenario to test the agent's diagnostic capabilities and validates its output against the expected diagnosis.

Usage

/test-scenario <scenario-name>

Example: /test-scenario memory-leak

What This Skill Does

•Loads scenario from sandbox/scenarios/<scenario-name>.scenario.ts
•Injects mock tool responses
•Triggers Orchestrator agent with incident report
•Captures agent's diagnostic output
•Validates:
  - Root cause includes expected keywords
  - Confidence meets minimum threshold
  - Remediation suggestions align with expectations
•Reports pass/fail with detailed comparison
•If failed: suggests which agent prompts or tools to adjust
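
The validation step above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `Expectation`, `Diagnosis`, and `ValidationResult` shapes and the `validateDiagnosis` function are hypothetical names, and the matching rules (case-insensitive substring match, every keyword and every strategy required) are assumptions.

```typescript
// Hypothetical shapes; the real scenario and diagnosis types may differ.
interface Expectation {
  rootCauseKeywords: string[];
  minConfidence: number;
  remediationStrategies: string[];
}

interface Diagnosis {
  rootCause: string;
  confidence: number;
  remediation: string[];
}

interface ValidationResult {
  passed: boolean;
  matchedKeywords: string[];
  confidenceOk: boolean;
  matchedStrategies: number;
}

function validateDiagnosis(d: Diagnosis, e: Expectation): ValidationResult {
  // Check 1: expected keywords appear in the root cause (case-insensitive).
  const rootCause = d.rootCause.toLowerCase();
  const matchedKeywords = e.rootCauseKeywords.filter((k) =>
    rootCause.includes(k.toLowerCase())
  );

  // Check 2: confidence meets the minimum threshold.
  const confidenceOk = d.confidence >= e.minConfidence;

  // Check 3: count expected strategies mentioned by at least one suggestion.
  const suggestions = d.remediation.map((s) => s.toLowerCase());
  const matchedStrategies = e.remediationStrategies.filter((strategy) =>
    suggestions.some((s) => s.includes(strategy.toLowerCase()))
  ).length;

  // Assumption: all keywords and all strategies must match to pass.
  const passed =
    matchedKeywords.length === e.rootCauseKeywords.length &&
    confidenceOk &&
    matchedStrategies === e.remediationStrategies.length;

  return { passed, matchedKeywords, confidenceOk, matchedStrategies };
}
```

A real validator might accept partial keyword matches or use fuzzier comparison for remediation text; the strict rules here just keep the sketch short.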

Validation Output Format

Scenario: memory-leak
Status: PASSED ✅

Validation Results:
  ✅ Root cause match: Found keywords [OOMKilled, memory, heap]
  ✅ Confidence threshold: 0.89 >= 0.85
  ✅ Remediation alignment: 3/3 suggestions match expected strategies

Agent Diagnosis:
  Root Cause: Memory leak in payment-service causing repeated OOMKills
  Confidence: 0.89
  Evidence:
    - OutOfMemoryError in logs (47 occurrences)
    - Memory usage trending upward to 98% of limit
    - Pod restarted 5 times in last hour
  Remediation:
    - Rollback to v1.5.1 (stable version)
    - Increase memory limit to 2Gi temporarily
    - Profile memory usage to identify leak
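
The header and "Validation Results" portion of the report above could be produced by a small formatter like the one below. It is a sketch under assumptions: `CheckResult` and `renderReport` are hypothetical names, and the `FAILED ❌` status text is a guess, since only the passing form appears in the sample output.

```typescript
// Hypothetical per-check result; not taken from the skill's source.
interface CheckResult {
  label: string;
  passed: boolean;
  detail: string;
}

function renderReport(scenario: string, checks: CheckResult[]): string {
  // The scenario passes only if every individual check passed.
  const allPassed = checks.every((c) => c.passed);
  const lines = [
    `Scenario: ${scenario}`,
    // Assumption: the failing status mirrors the passing one.
    `Status: ${allPassed ? "PASSED ✅" : "FAILED ❌"}`,
    "",
    "Validation Results:",
    ...checks.map((c) => `  ${c.passed ? "✅" : "❌"} ${c.label}: ${c.detail}`),
  ];
  return lines.join("\n");
}

// Example usage with values from the sample output.
console.log(
  renderReport("memory-leak", [
    { label: "Confidence threshold", passed: true, detail: "0.89 >= 0.85" },
  ])
);
```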