AgentSkillsCN

testing-patterns

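Agent-based declarative testing with YAML test specs. Tests run in sub-agents, which preserves the main context while many tests execute. Supports MCP servers, APIs, and browser automation. Use when: testing MCP servers, running integration tests, validating tool behavior after changes, or building regression test suites. Keywords: YAML tests, agent testing, MCP testing, integration tests.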

SKILL.md
--- frontmatter
name: testing-patterns
description: |
  Agent-based declarative testing with YAML test specs. Tests run in sub-agents to preserve
  main context while executing many tests. Supports MCP servers, APIs, and browser automation.

  Use when: testing MCP servers, running integration tests, validating tool behavior after changes,
  or creating regression test suites. Keywords: yaml tests, agent testing, mcp test, integration tests.
license: MIT

Testing Patterns

A pragmatic approach to testing that emphasises:

  • Live testing over mocks
  • Agent execution to preserve context
  • YAML specs as documentation and tests
  • Persistent results committed to git

Philosophy

This is not traditional TDD. Instead:

  1. Test in production/staging with good logging
  2. Use agents to run tests (keeps main context clean)
  3. Define tests declaratively in YAML (human-readable, version-controlled)
  4. Focus on integration (real servers, real data)

Why Agent-Based Testing?

Running 50 tests in the main conversation can consume most of your context window, because every tool call and response stays in the conversation. By delegating to a sub-agent:

  • Main context stays clean for development
  • Agent can run many tests without context pressure
  • Results come back as a summary
  • Failed tests get detailed investigation

Commands

| Command | Purpose |
| --- | --- |
| /create-tests | Discover project, generate test specs + testing agent |
| /run-tests | Execute tests via agent(s), report results |
| /coverage | Generate coverage report and identify uncovered code paths |

Quick workflow:

code
/create-tests        → Generates tests/specs/*.yaml + .claude/agents/test-runner.md
/run-tests           → Spawns agent, runs all tests, saves results
/run-tests api       → Run only specs matching "api"
/run-tests --failed  → Re-run only failed tests
/coverage            → Run tests with coverage, analyse gaps
/coverage --threshold 80  → Fail if below 80%

Getting Started in a New Project

This skill provides the pattern and format. Claude designs the actual tests based on your project context.

What happens when you ask "Create tests for this project":

  1. Discovery - Claude examines the project:

    • What MCP servers are configured?
    • What APIs or tools exist?
    • What does the code do?
  2. Test Design - Claude creates project-specific tests:

    • Test cases for the actual tools/endpoints
    • Expected values based on real behavior
    • Edge cases relevant to this domain
  3. Structure - Using patterns from this skill:

    • YAML specs in tests/ directory
    • Optional testing agent in .claude/agents/
    • Results saved to tests/results/

Example:

code
You: "Create tests for this MCP server"

Claude: [Discovers this is a Google Calendar MCP]
        [Sees tools: calendar_events, calendar_create, calendar_delete]
        [Designs test cases:]

        tests/calendar-events.yaml:
        - list_upcoming_events (expect: array, count_gte 0)
        - search_by_keyword (expect: contains search term)
        - invalid_date_range (expect: error status)

        tests/calendar-mutations.yaml:
        - create_event (expect: success, returns event_id)
        - delete_nonexistent (expect: error, contains "not found")

The skill teaches Claude:

  • How to structure YAML test specs
  • What validation rules are available
  • How to create testing agents
  • When to use parallel execution

Your project provides:

  • What to actually test
  • Expected values and behaviors
  • Domain-specific edge cases

YAML Test Spec Format

yaml
name: Feature Tests
description: What these tests validate

# Optional: defaults applied to all tests
defaults:
  tool: my_tool_name
  timeout: 5000

tests:
  - name: test_case_name
    description: Human-readable purpose
    tool: tool_name  # Override default if needed
    params:
      action: search
      query: "test input"
    expect:
      contains: "expected substring"
      not_contains: "should not appear"
      status: success
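
How `defaults` combine with individual tests can be sketched in Python. This is a minimal illustration, not part of the skill: it assumes that a key set on a test overrides the same key in `defaults` and that omitted keys are inherited. The function name `apply_defaults` is hypothetical, and the spec is written as a dict rather than loaded from YAML so the example has no dependencies.

```python
from copy import deepcopy


def apply_defaults(spec: dict) -> list[dict]:
    """Return the spec's tests with top-level defaults filled in.

    Assumption: a per-test key wins over the same key in `defaults`.
    """
    defaults = spec.get("defaults", {})
    resolved = []
    for test in spec.get("tests", []):
        merged = deepcopy(defaults)      # start from the shared defaults
        merged.update(deepcopy(test))    # per-test keys override them
        resolved.append(merged)
    return resolved


# Mirrors the spec format above, written as a dict for the sketch.
spec = {
    "name": "Feature Tests",
    "defaults": {"tool": "my_tool_name", "timeout": 5000},
    "tests": [
        {"name": "uses_default_tool", "params": {"query": "test input"}},
        {"name": "overrides_tool", "tool": "tool_name", "params": {}},
    ],
}

resolved = apply_defaults(spec)
print(resolved[0]["tool"], resolved[0]["timeout"])
print(resolved[1]["tool"], resolved[1]["timeout"])
```

The original spec is left unmodified, so the same file can be resolved again on a later run.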

Validation Rules

| Rule | Description | Example |
| --- | --- | --- |
| contains | Response contains string | contains: "from:john" |
| not_contains | Response doesn't contain | not_contains: "error" |
| matches | Regex pattern match | matches: "after:\\d{4}" |
| json_path | Check value at JSON path | json_path: "$.results[0].name" |
| equals | Exact value match | equals: "success" |
| status | Check success/error | status: success |
| count_gte | Array length >= N | count_gte: 1 |
| count_eq | Array length == N | count_eq: 5 |
| type | Value type check | type: array |

See references/validation-rules.md for complete documentation.
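
To make the rule semantics concrete, here is a rough Python sketch of how several of the rules could be evaluated. It is an assumption-laden illustration rather than the skill's implementation: in this pattern the agent applies the rules itself, and the authoritative definitions live in references/validation-rules.md. The function name `check` and the `TYPE_NAMES` mapping are hypothetical, and `json_path` and `status` are deliberately left out.

```python
import json
import re

# Hypothetical mapping from spec type names to Python types.
TYPE_NAMES = {"array": list, "object": dict, "string": str}


def check(value, expect: dict) -> list[str]:
    """Return failure messages; an empty list means every rule passed."""
    # Text rules run against the string form of the response.
    text = value if isinstance(value, str) else json.dumps(value)
    failures = []
    for rule, arg in expect.items():
        if rule == "contains":
            ok = arg in text
        elif rule == "not_contains":
            ok = arg not in text
        elif rule == "matches":
            ok = re.search(arg, text) is not None
        elif rule == "equals":
            ok = value == arg
        elif rule == "count_gte":
            ok = isinstance(value, list) and len(value) >= arg
        elif rule == "count_eq":
            ok = isinstance(value, list) and len(value) == arg
        elif rule == "type":
            expected = TYPE_NAMES.get(arg)
            ok = expected is not None and isinstance(value, expected)
        else:
            failures.append(f"unsupported rule: {rule}")
            continue
        if not ok:
            failures.append(f"{rule}: {arg!r} not satisfied")
    return failures


print(check("from:john after:2026", {"contains": "from:john", "matches": r"after:\d{4}"}))
print(check([], {"type": "array", "count_gte": 1}))
```

Returning a list of messages rather than a boolean keeps enough detail for the failed-test section of a results file.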

Creating a Testing Agent

Testing agents inherit MCP tools from the session. Create an agent that:

  1. Reads YAML test specs
  2. Executes tool calls with params
  3. Validates responses against expectations
  4. Reports results

Agent Template

CRITICAL: Do NOT specify a tools field if you need MCP access. When you specify ANY tools, it becomes an allowlist and "*" is interpreted literally (not as a wildcard). Omit tools entirely to inherit ALL tools from the parent session.

yaml
---
name: my-tester
description: |
  Tests [domain] functionality. Reads YAML test specs and validates responses.
  Use when: testing after changes, running regression tests.
# tools field OMITTED - inherits ALL tools from parent (including MCP)
model: sonnet
---

# [Domain] Tester

## How It Works

1. Find test specs: `tests/*.yaml`
2. Parse and execute each test
3. Validate responses
4. Report pass/fail summary

## Test Spec Location

tests/
├── feature-a.yaml
├── feature-b.yaml
└── results/
    └── YYYY-MM-DD-HHMMSS.md

## Execution

For each test:
1. Call tool with params
2. Capture response
3. Apply validation rules
4. Record PASS/FAIL

## Reporting

Save results to `tests/results/YYYY-MM-DD-HHMMSS.md`

See templates/test-agent.md for complete template.
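
The execution steps in the template (call the tool, capture the response, validate, record PASS/FAIL) amount to a simple loop. The agent performs this loop itself; the Python below only spells out the control flow under stated assumptions. `run_tests`, `fake_tool`, and `contains_only` are hypothetical stand-ins, and the sketch assumes error responses come back as values, so a raised exception is recorded as a failure.

```python
import time
from typing import Any, Callable


def run_tests(
    tests: list[dict],
    call_tool: Callable[[str, dict], Any],
    validate: Callable[[Any, dict], list[str]],
) -> list[dict]:
    """Run each test and record name, pass/fail, failures, and duration."""
    results = []
    for test in tests:
        started = time.perf_counter()
        try:
            response = call_tool(test["tool"], test.get("params", {}))
            failures = validate(response, test.get("expect", {}))
        except Exception as exc:  # assumption: an exception is a failed test
            failures = [f"tool call raised: {exc}"]
        results.append({
            "name": test["name"],
            "passed": not failures,
            "failures": failures,
            "seconds": round(time.perf_counter() - started, 3),
        })
    return results


def fake_tool(tool: str, params: dict) -> str:
    """Stand-in for a real tool call; echoes the query back."""
    return f"{tool} results for {params.get('query', '')}"


def contains_only(response: str, expect: dict) -> list[str]:
    """Stand-in validator that understands only the `contains` rule."""
    needle = expect.get("contains", "")
    return [] if needle in response else [f"missing {needle!r}"]


tests = [
    {"name": "basic_search", "tool": "my_search_tool",
     "params": {"query": "hello"}, "expect": {"contains": "hello"}},
    {"name": "edge_case", "tool": "my_search_tool",
     "params": {"query": ""}, "expect": {"contains": "expected value"}},
]

results = run_tests(tests, fake_tool, contains_only)
for result in results:
    print(result["name"], "PASS" if result["passed"] else "FAIL")
```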

Results Format

Test results are saved as markdown for git history:

markdown
# Test Results: feature-name
**Date**: 2026-02-02 14:30
**Commit**: abc1234
**Summary**: 8/9 passed (89%)

## Results

- test_basic_search - PASSED (0.3s)
- test_with_filter - PASSED (0.4s)
- test_edge_case - FAILED

## Failed Test Details

### test_edge_case
- **Expected**: Contains "expected value"
- **Actual**: Response was empty
- **Params**: `{ action: search, query: "" }`

Save to: tests/results/YYYY-MM-DD-HHMMSS.md
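
As an illustration of the format above, the following Python sketch builds the timestamped filename and renders a report from a list of results. The function names and the shape of the result dicts (`name`, `passed`, `seconds`, `expected`, `actual`) are assumptions made for the example; the agent normally writes this file directly.

```python
from datetime import datetime


def results_filename(now: datetime) -> str:
    """Build the tests/results/YYYY-MM-DD-HHMMSS.md path."""
    return f"tests/results/{now.strftime('%Y-%m-%d-%H%M%S')}.md"


def render_report(feature: str, commit: str, now: datetime, results: list[dict]) -> str:
    """Render results in the markdown layout shown above."""
    passed = sum(1 for r in results if r["passed"])
    total = len(results)
    percent = round(100 * passed / total) if total else 0
    lines = [
        f"# Test Results: {feature}",
        f"**Date**: {now.strftime('%Y-%m-%d %H:%M')}",
        f"**Commit**: {commit}",
        f"**Summary**: {passed}/{total} passed ({percent}%)",
        "",
        "## Results",
        "",
    ]
    for r in results:
        if r["passed"]:
            lines.append(f"- {r['name']} - PASSED ({r['seconds']:.1f}s)")
        else:
            lines.append(f"- {r['name']} - FAILED")
    failed = [r for r in results if not r["passed"]]
    if failed:
        lines += ["", "## Failed Test Details"]
        for r in failed:
            lines += [
                "",
                f"### {r['name']}",
                f"- **Expected**: {r['expected']}",
                f"- **Actual**: {r['actual']}",
            ]
    return "\n".join(lines) + "\n"


now = datetime(2026, 2, 2, 14, 30, 0)
results = [
    {"name": "test_basic_search", "passed": True, "seconds": 0.3},
    {"name": "test_with_filter", "passed": True, "seconds": 0.4},
    {"name": "test_edge_case", "passed": False,
     "expected": 'Contains "expected value"', "actual": "Response was empty"},
]
report = render_report("feature-name", "abc1234", now, results)
print(results_filename(now))
print(report)
```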

Workflow

1. Create Test Specs

yaml
# tests/search.yaml
name: Search Tests
defaults:
  tool: my_search_tool

tests:
  - name: basic_search
    params: { query: "hello" }
    expect: { status: success, count_gte: 0 }

  - name: filtered_search
    params: { query: "hello", filter: "recent" }
    expect: { contains: "results" }

2. Create Testing Agent

Copy templates/test-agent.md and customise for your domain.

3. Run Tests

code
"Run the search tests"
"Test the API after my changes"
"Run regression tests for gmail-mcp"

4. Review Results

Results are saved to tests/results/. Commit them to keep a history of test runs:

bash
git add tests/results/
git commit -m "Test results: 8/9 passed"

Parallel Test Execution

Run multiple test agents simultaneously to speed up large test suites:

code
"Run these test suites in parallel:
- Agent 1: tests/auth/*.yaml
- Agent 2: tests/search/*.yaml
- Agent 3: tests/api/*.yaml"

Each agent:

  • Has its own context (won't bloat main conversation)
  • Can run 10-50 tests independently
  • Returns a summary when done
  • Inherits MCP tools from parent session

Why parallel agents?

  • 50 tests in main context = context exhaustion
  • 50 tests across 5 agents = clean context + faster execution
  • Each agent reports pass/fail summary, not every test detail

Batching strategy:

  • Group tests by feature area or MCP server
  • 10-20 tests per agent is ideal
  • Too few = overhead of spawning not worth it
  • Too many = agent context fills up
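
One way to apply the batching strategy is sketched below in Python. It assumes you already know how many tests each spec file contains, groups files by feature directory, and packs each group greedily up to a per-agent limit. The function name `plan_batches`, the file names, and the counts are all hypothetical. The sketch does not merge small batches, so a feature area with very few tests still gets its own agent.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_batches(test_counts: dict[str, int], max_per_agent: int = 20) -> list[list[str]]:
    """Group spec files by feature directory, then pack up to max_per_agent tests."""
    by_area = defaultdict(list)
    for path in sorted(test_counts):
        by_area[PurePosixPath(path).parent.name].append(path)

    batches = []
    for area in sorted(by_area):
        current, size = [], 0
        for path in by_area[area]:
            count = test_counts[path]
            # Start a new batch when adding this file would exceed the limit.
            if current and size + count > max_per_agent:
                batches.append(current)
                current, size = [], 0
            current.append(path)
            size += count
        if current:
            batches.append(current)
    return batches


# Hypothetical spec files mapped to the number of tests in each.
counts = {
    "tests/auth/login.yaml": 8,
    "tests/auth/tokens.yaml": 7,
    "tests/search/basic.yaml": 12,
    "tests/search/filters.yaml": 11,
    "tests/api/users.yaml": 5,
}

batches = plan_batches(counts)
for number, batch in enumerate(batches, start=1):
    print(f"Agent {number}: {', '.join(batch)}")
```

Each resulting batch maps to one agent in a parallel run request.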

MCP Testing

For MCP servers, the testing agent inherits configured MCPs:

bash
# Configure MCP first
claude mcp add --transport http gmail https://gmail.mcp.example.com/mcp

# Then ask Claude to run the tests (a prompt, not a shell command):
#   "Run tests for gmail MCP"

Example MCP test spec:

yaml
name: Gmail Search Tests
defaults:
  tool: gmail_messages

tests:
  - name: search_from_person
    params: { action: search, searchQuery: "from John" }
    expect: { contains: "from:john" }

  - name: search_with_date
    params: { action: search, searchQuery: "emails from January 2026" }
    expect: { matches: "after:2026" }

API Testing

For REST APIs, use the Bash tool:

yaml
name: API Tests
defaults:
  timeout: 5000

tests:
  - name: health_check
    command: curl -s https://api.example.com/health
    expect: { contains: "ok" }

  - name: get_user
    command: curl -s https://api.example.com/users/1
    expect:
      json_path: "$.name"
      type: string
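
The `get_user` test combines `json_path` with `type`. A minimal Python sketch of that check follows. It supports only a small JSONPath subset (`$`, `.key`, and `[index]`), which is an assumption made for brevity and not a statement about what the skill supports. `resolve_path` is a hypothetical helper, and the response body is a canned string standing in for the `curl` output.

```python
import json
import re


def resolve_path(document, path: str):
    """Resolve a JSONPath subset: "$" followed by ".key" and "[index]" segments."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    # Each match is a (key, index) pair in which exactly one part is non-empty.
    for key, index in re.findall(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]", path[1:]):
        current = current[key] if key else current[int(index)]
    return current


# Canned body standing in for: curl -s https://api.example.com/users/1
body = '{"id": 1, "name": "Ada"}'

value = resolve_path(json.loads(body), "$.name")
print(value, isinstance(value, str))
```

A missing key or an out-of-range index raises `KeyError` or `IndexError`, which a caller could report as a failed expectation.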

Browser Testing

For browser automation, use Playwright tools:

yaml
name: UI Tests

tests:
  - name: login_page_loads
    steps:
      - navigate: https://app.example.com/login
      - snapshot: true
    expect: { contains: "Sign In" }

  - name: form_submission
    steps:
      - navigate: https://app.example.com/form
      - type: { ref: "#email", text: "test@example.com" }
      - click: { ref: "button[type=submit]" }
    expect: { contains: "Success" }

Tips

  1. Start with smoke tests: Basic connectivity and auth
  2. Test edge cases: Empty results, errors, special characters
  3. Use descriptive names: search_with_date_filter not test1
  4. Group related tests: One file per feature area
  5. Add tests after bugs: Every fixed bug gets a regression test
  6. Commit results: Create history of test runs

What This Is NOT

  • Not a Jest/Vitest replacement (use those for unit tests)
  • Not enforcing TDD (use what works for you)
  • Not a test runner library (the agent IS the runner)
  • Not about mocking (we test real systems)

When to Use

| Scenario | Use This | Use Traditional Testing |
| --- | --- | --- |
| MCP server validation | Yes | No |
| API integration | Yes | Complement with unit tests |
| Browser workflows | Yes | Complement with component tests |
| Unit testing | No | Yes (Jest/Vitest) |
| Component testing | No | Yes (Testing Library) |
| Type checking | No | Yes (TypeScript) |

Related Resources

  • templates/test-spec.yaml - Generic test spec template
  • templates/test-agent.md - Testing agent template
  • references/validation-rules.md - Complete validation rule reference