test-plan

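Generate and maintain test plans by cross-referencing specs with existing tests. Reads spec files to extract testable behaviors, scans existing test files to determine what is already covered, identifies gaps, and produces a prioritized test plan. Use after /spec-review to generate actionable test work, or periodically to track and correct coverage drift.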

SKILL.md
--- frontmatter
name: test-plan
description: >
  Generate and maintain test plans by cross-referencing specs with existing tests.
  Reads spec files to extract testable behaviors, scans existing test files to find
  what's covered, identifies gaps, and produces a prioritized test plan. Use after
  /spec-review to generate actionable test work, or periodically to catch coverage drift.

Test Plan

Generate a test plan by cross-referencing specs with existing tests.

When to Use

  • After /spec-review — clean specs are ready to derive tests from
  • After implementing new features — verify test coverage matches spec
  • Before a release — ensure no spec behaviors are untested
  • Periodic audit — catch coverage drift

When NOT to Use

  • Spec quality is poor — run /spec-review first
  • Writing tests — this skill produces the plan, not the test code
  • Benchmark analysis — use /analyze-report instead

Inputs

  • No argument: analyze ALL specs and tests in the project
  • File path: analyze a specific spec file's test coverage
  • --priority: sort by risk (default: sort by spec order)

Workflow

Phase 1: Discovery

  1. Read CLAUDE.md for project structure — find spec directory, test directories, source directories
  2. Glob {spec-dir}/**/*.md for all spec files
  3. Glob {test-dirs}/**/*test*.cpp for all test files (this pattern already matches *_test.cpp)
  4. Glob {test-dirs}/**/*_bench.cpp for all benchmark files
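
The glob steps above can be sketched in Python as follows. This is only an illustration: the function name `discover` and its parameters are hypothetical, and the directories are assumed to have already been read from CLAUDE.md.

```python
from pathlib import Path


def discover(spec_dir, test_dirs):
    """Collect spec, test, and benchmark files with the Phase 1 glob patterns.

    `spec_dir` and `test_dirs` are assumed inputs taken from CLAUDE.md.
    """
    # rglob("*.md") is the recursive form of the **/*.md pattern.
    specs = sorted(Path(spec_dir).rglob("*.md"))
    tests, benches = set(), set()
    for directory in test_dirs:
        root = Path(directory)
        # *test*.cpp also matches names ending in _test.cpp.
        tests.update(root.rglob("*test*.cpp"))
        benches.update(root.rglob("*_bench.cpp"))
    return specs, sorted(tests), sorted(benches)
```

Sets are used so that a file reached through two overlapping test directories is listed only once.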

Phase 2: Extract Testable Behaviors from Specs

For each spec file, extract:

  • Behavioral requirements: "X shall do Y when Z" — each maps to >= 1 test case
  • Boundary conditions: capacity limits, overflow handling, edge cases
  • Performance targets: latency budgets, throughput claims — each maps to a benchmark
  • Error handling: what happens on invalid input, timeout, disconnect
  • Integration points: how components interact across boundaries

Output as a structured list:

code
{spec-file}:{line} — {behavior description}
  Type: behavioral | boundary | performance | error | integration
  Testable: yes | no (reason)
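
One possible in-memory shape for these entries is sketched below. The `Behavior` class, its field names, and the example spec path are hypothetical; only the rendered layout follows the format above.

```python
from dataclasses import dataclass
from typing import Optional

TYPES = ("behavioral", "boundary", "performance", "error", "integration")
SEPARATOR = " \u2014 "  # the dash that separates location and description


@dataclass
class Behavior:
    spec_file: str
    line: int
    description: str
    type: str
    untestable_reason: Optional[str] = None  # None means the behavior is testable

    def __post_init__(self):
        # Reject types outside the five listed in the output format.
        if self.type not in TYPES:
            raise ValueError(f"unknown behavior type: {self.type}")

    def render(self) -> str:
        """Format the entry in the three-line layout shown above."""
        if self.untestable_reason is None:
            testable = "yes"
        else:
            testable = f"no ({self.untestable_reason})"
        return (
            f"{self.spec_file}:{self.line}{SEPARATOR}{self.description}\n"
            f"  Type: {self.type}\n"
            f"  Testable: {testable}"
        )


# Hypothetical example entry, not taken from a real spec.
example = Behavior("specs/queue.md", 12, "push shall fail when the queue is full", "boundary")
print(example.render())
```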

Phase 3: Map Existing Tests

For each test file, extract:

  • Test names (TEST, TEST_F, TEST_P macros)
  • What behavior each test verifies (read test body, match to spec behaviors)
  • Which spec section it corresponds to (if identifiable from naming or comments)
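
As a rough illustration of the first bullet, test names could be pulled out with a regular expression, assuming the files use the GoogleTest-style macro form `TEST(Suite, Name)`. The helper name `extract_tests` is hypothetical. The sketch does not skip macros inside comments or strings, and it says nothing about what a test verifies; that still requires reading the test body.

```python
import re

# Matches TEST(Suite, Name), TEST_F(Fixture, Name) and TEST_P(Suite, Name).
TEST_MACRO = re.compile(r"\b(TEST|TEST_F|TEST_P)\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)")


def extract_tests(source: str):
    """Return a (macro, suite, name) tuple for each test macro in the source text."""
    return [match.groups() for match in TEST_MACRO.finditer(source)]


# Hypothetical test file contents.
sample = """
TEST(Core, PollOnce) { /* ... */ }
TEST_F(TimerFixture, FiresOnce) { /* ... */ }
"""
for macro, suite, name in extract_tests(sample):
    print(f"{macro}: {suite}.{name}")
```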

For each benchmark file, extract:

  • What performance claim it measures
  • Which spec section it corresponds to

Phase 4: Gap Analysis

Cross-reference Phase 2 (what SHOULD be tested) with Phase 3 (what IS tested):

  • Covered: spec behavior has matching test(s) — report as covered
  • Partially covered: test exists but doesn't check all branches/boundaries
  • Uncovered: spec behavior has no corresponding test — GAP
  • Orphaned tests: test doesn't map to any spec behavior; the test may be stale, or the spec may be incomplete
  • Ungrounded perf claims: spec states a number, no benchmark exists
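
The bookkeeping behind these five categories might look like the sketch below. It assumes the matching itself (which test verifies which behavior, and whether it misses branches) has already been done by reading both sides. The function name, the identifier strings, and the `links` and `incomplete` inputs are all hypothetical.

```python
def gap_analysis(behaviors, tests, benchmarks, links, incomplete=frozenset()):
    """Classify each spec behavior and list orphaned tests.

    behaviors:  {behavior_id: type} from Phase 2
    tests:      set of test ids from Phase 3
    benchmarks: set of benchmark ids from Phase 3
    links:      {behavior_id: [ids of tests/benchmarks judged to verify it]}
    incomplete: behavior ids whose tests miss some branches or boundaries
    """
    status = {}
    for behavior_id, kind in behaviors.items():
        verifiers = set(links.get(behavior_id, ()))
        if kind == "performance":
            # Only a benchmark grounds a performance claim, not a unit test.
            verifiers &= set(benchmarks)
            missing = "ungrounded perf"
        else:
            verifiers &= set(tests)
            missing = "uncovered"
        if not verifiers:
            status[behavior_id] = missing
        elif behavior_id in incomplete:
            status[behavior_id] = "partially covered"
        else:
            status[behavior_id] = "covered"
    # A test that no behavior links to is orphaned.
    linked = {v for ids in links.values() for v in ids}
    orphaned = sorted(set(tests) - linked)
    return status, orphaned
```

Keeping the judgment calls in `links` and `incomplete` separates the mechanical classification from the reading step, so a name match alone never counts as coverage.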

Phase 5: Report

markdown
# Test Plan Report

**Scope**: [all specs | specific file]
**Date**: YYYY-MM-DD HH:MM

## Summary

| Status | Count |
|--------|-------|
| Covered | N |
| Partially covered | N |
| Uncovered (GAP) | N |
| Orphaned tests | N |
| Ungrounded perf | N |

## Coverage by Spec File

### {spec-file}

| Line | Behavior | Status | Test File | Notes |
|------|----------|--------|-----------|-------|
| 42 | Element poll_once returns... | Covered | core_test.cpp:TestPollOnce | |
| 58 | Timer fires within... | GAP | — | needs benchmark |

## Gaps (prioritized)

### High Priority (behavioral, no test at all)

1. **{spec-file}:{line}** — {behavior}
   Suggested test: {test name and what to assert}

### Medium Priority (boundary/error, no test)

1. ...

### Low Priority (partially covered)

1. ...

## Orphaned Tests

1. **{test-file}:{test-name}** — no matching spec behavior found

Principles

  • Spec drives tests: Every test should trace back to a spec behavior. Tests without spec backing may be testing implementation details.
  • AI reads both sides: Don't use hardcoded mappings. Read specs and test code dynamically.
  • Prioritize by risk: Untested behavioral requirements > untested boundaries and error handling > untested performance claims > partial coverage.
  • No false coverage: A test that shares a name with a spec section but doesn't actually verify the behavior is NOT coverage.
  • Benchmark = performance test: Performance claims in spec need benchmarks, not just unit tests.
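
The risk ordering in "Prioritize by risk" can be expressed as a sort key, as in the sketch below. The tuple layout and function name are hypothetical. Two choices are assumptions rather than something the skill states: error handling is ranked with boundaries (following the report's "boundary/error" tier), and integration behaviors, which the skill does not rank, are placed alongside performance.

```python
# Lower rank means higher priority.
RISK_RANK = {
    "behavioral": 0,
    "boundary": 1,
    "error": 1,        # assumption: grouped with boundaries, as in the report tiers
    "performance": 2,
    "integration": 2,  # assumption: the skill does not rank integration gaps
}
PARTIAL_RANK = 3


def prioritize(gaps):
    """Sort gaps by risk, breaking ties by spec order.

    gaps: list of (spec_file, line, type, status) tuples.
    """
    def key(gap):
        spec_file, line, kind, status = gap
        rank = PARTIAL_RANK if status == "partially covered" else RISK_RANK[kind]
        return (rank, spec_file, line)

    return sorted(gaps, key=key)
```

Using spec file and line as the tie-breaker keeps the default spec order within each risk tier.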