Generate Tests
Use this skill when the user wants to create test cases for their AI agent or skill without writing YAML by hand.
Four approaches
1. Generate tests from a SKILL.md file
Use the generate_skill_tests MCP tool to auto-generate a test suite from a skill definition. This reads the SKILL.md and produces YAML test cases covering explicit triggers, implicit triggers, contextual triggers, and negative cases.
Steps:
- Ask the user which SKILL.md to generate tests for (or detect it from context).
- Call generate_skill_tests with:
  - skill_path: path to the SKILL.md file
  - output_path (optional): where to save the generated YAML
  - count (optional): number of test cases (default: 10)
- After generation, offer to run the tests with run_skill_test.
CLI equivalent:
    evalview skill generate-tests .claude/skills/my-skill/SKILL.md --auto
    evalview skill generate-tests .claude/skills/my-skill/SKILL.md -c 20 -o tests/my-skill-tests.yaml
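As a rough sketch of the tool call described in the steps above: MCP clients invoke a tool with a JSON-RPC tools/call request that carries the tool name and an arguments object. The Python below builds such a request for generate_skill_tests from the three parameters listed above. The helper names (generate_skill_tests_args, tool_call) are illustrative and not part of EvalView, and an MCP client will normally build this envelope for you.

```python
import json


def generate_skill_tests_args(skill_path, output_path=None, count=None):
    """Build the arguments object, leaving out optional fields that were not set."""
    args = {"skill_path": skill_path}
    if output_path is not None:
        args["output_path"] = output_path
    if count is not None:
        args["count"] = count  # the tool falls back to its default of 10 when omitted
    return args


def tool_call(name, arguments, request_id=1):
    """Wrap a tool name and its arguments in an MCP JSON-RPC tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(
    "generate_skill_tests",
    generate_skill_tests_args(
        ".claude/skills/my-skill/SKILL.md",
        output_path="tests/my-skill-tests.yaml",
        count=20,
    ),
)
print(json.dumps(request, indent=2))
```

Omitting output_path and count sends only skill_path, so the tool's own defaults apply.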
2. Create individual test cases manually
Use the create_test MCP tool to create a single test YAML file from a description.
Steps:
- Gather from the user: test name, query, expected tools, forbidden tools, expected output keywords, and minimum score.
- Call create_test with the parameters.
- After creating the test, call run_snapshot to establish the golden baseline.
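Before calling create_test, it can help to collect the gathered fields in one place and catch obvious contradictions. The sketch below is only an illustration: this page does not list the parameter names of create_test, so every key (name, query, expected_tools, forbidden_tools, expected_keywords, min_score) is a hypothetical placeholder, as are the example tool names. Treating name and query as required is likewise an assumption. Check the tool's schema for the real names.

```python
def build_create_test_args(name, query, expected_tools=(), forbidden_tools=(),
                           expected_keywords=(), min_score=None):
    """Collect the gathered fields into one dict. Key names are hypothetical."""
    if not name or not query:
        # Assumption: a test needs at least a name and a query.
        raise ValueError("test name and query are required")

    # A tool cannot sensibly be both expected and forbidden in the same test.
    overlap = set(expected_tools) & set(forbidden_tools)
    if overlap:
        raise ValueError(f"tools both expected and forbidden: {sorted(overlap)}")

    args = {
        "name": name,
        "query": query,
        "expected_tools": list(expected_tools),
        "forbidden_tools": list(forbidden_tools),
        "expected_keywords": list(expected_keywords),
    }
    if min_score is not None:
        args["min_score"] = min_score
    return args


# Hypothetical example values, not taken from a real agent.
args = build_create_test_args(
    name="weather-lookup",
    query="What is the weather in Paris?",
    expected_tools=["get_weather"],
    forbidden_tools=["send_email"],
    expected_keywords=["Paris"],
    min_score=80,
)
print(args)
```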
3. Capture real interactions
Use the evalview capture CLI command to proxy real agent traffic and automatically save each interaction as a test YAML. It records the query, output, and tool calls from live usage.
CLI usage:
    evalview capture --agent http://localhost:8080/execute --output-dir tests/test-cases
    evalview capture --multi-turn   # saves all turns as one multi-turn conversation test
4. Validate a skill before testing
Use validate_skill to check a SKILL.md for correct structure and completeness before generating tests from it.
Running generated tests
After generating tests, execute them with run_skill_test:
- test_file: path to the generated YAML
- no_rubric: true for fast deterministic-only checks (no LLM cost)
- verbose: true for detailed output on all tests
CLI equivalent:
    evalview skill test tests/my-skill-tests.yaml
    evalview skill test tests/my-skill-tests.yaml --no-rubric   # fast, $0
    evalview skill test tests/my-skill-tests.yaml --verbose --model claude-sonnet-4-20250514
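The run_skill_test parameters and the CLI flags above line up one to one (no_rubric with --no-rubric, verbose with --verbose). The following minimal sketch makes that mapping explicit. The helper names are illustrative, and it assumes both options are off unless set; it only builds the command line and does not run it.

```python
import shlex


def run_skill_test_args(test_file, no_rubric=False, verbose=False):
    """Build the run_skill_test arguments, including only the options that are switched on."""
    args = {"test_file": test_file}
    if no_rubric:
        args["no_rubric"] = True  # deterministic-only checks, no LLM cost
    if verbose:
        args["verbose"] = True    # detailed output on all tests
    return args


def to_cli(args):
    """Translate the tool arguments into the equivalent evalview command line."""
    argv = ["evalview", "skill", "test", args["test_file"]]
    if args.get("no_rubric"):
        argv.append("--no-rubric")
    if args.get("verbose"):
        argv.append("--verbose")
    return argv


args = run_skill_test_args("tests/my-skill-tests.yaml", no_rubric=True)
command = shlex.join(to_cli(args))
print(command)  # evalview skill test tests/my-skill-tests.yaml --no-rubric
```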