do-test-coverage-audit

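Conduct a forensic audit of test coverage: exhaustively identify the sources of complexity and assess whether tests exist at the right level. Use this skill when evaluating test quality, identifying testing gaps, or reviewing test strategy. It also produces a detailed accounting for the test-recommendations skill.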

SKILL.md
---
name: "do-test-coverage-audit"
description: "Forensic test coverage audit - exhaustive detection of complexity sources and assessment of whether tests exist at the right level. Use when reviewing test quality, identifying testing gaps, or auditing test strategy. Produces detailed accounting for test-recommendations skill."
---

Test Coverage Audit

Forensic analysis of test coverage quality. Not just "do you have tests" but "are you testing the right things at the right level?"

This skill produces an exhaustive audit report. For recommendations based on this report, use the test-recommendations skill. For implementation planning, use test-implementation-plan.

Philosophy

The Testing Pyramid

```
         ╱╲
        ╱E2E╲          Few, slow, high-confidence
       ╱──────╲
      ╱ Integ  ╲       Medium count, medium speed
     ╱──────────╲
    ╱    Unit    ╲     Many, fast, focused
   ╱──────────────╲
```

Comprehensive Testing Level Definitions: concepts/testing-levels.md

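| Level | Tests | Speed | Scope | Confidence |
|-------|-------|-------|-------|------------|
| Unit | Many | Fast | Single function/class | Logic correctness |
| Integration | Medium | Medium | Component boundaries | Pieces work together |
| E2E | Few | Slow | Full user journey | System actually works |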

Testing at the Right Level

Wrong level → wasted effort, false confidence, or fragile tests

| Symptom | Problem | Fix |
|---------|---------|-----|
| 500 unit tests, login broken | Missing e2e | Add e2e for critical paths |
| All e2e, CI takes 2 hours | Over-reliance on slow tests | Push more to unit/integration |
| Tests break on every refactor | Testing implementation, not behavior | Test contracts, not internals |
| High coverage, bugs slip through | Testing wrong things | Focus on user-facing behavior |
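
Some of the symptoms above can be spotted mechanically from how test files are distributed across levels. The following is a minimal sketch, not part of the skill itself. It assumes a hypothetical layout where test files sit under directories named `unit`, `integration`, or `e2e`, it counts files rather than individual tests, and the two warning rules are illustrative thresholds chosen for this example.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical directory names; real projects may use other conventions.
LEVELS = ("unit", "integration", "e2e")


def classify_test(path: str) -> str:
    """Return the test level implied by a directory name in the path."""
    for part in PurePosixPath(path).parts:
        if part in LEVELS:
            return part
    return "unknown"


def level_distribution(paths: list[str]) -> dict[str, float]:
    """Percentage of test files per level, rounded to one decimal place."""
    counts = Counter(classify_test(p) for p in paths)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


def pyramid_warnings(dist: dict[str, float]) -> list[str]:
    """Flag two of the symptoms from the table (illustrative rules only)."""
    warnings = []
    if dist.get("e2e", 0) == 0:
        warnings.append("no e2e tests: critical paths may be unverified")
    if dist.get("e2e", 0) > dist.get("unit", 0):
        warnings.append("more e2e than unit tests: suite is likely slow")
    return warnings
```

A distribution like this only hints at the shape of the pyramid. Whether the tests at each level check the right behavior still needs the quality assessment in the later phases.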

Common AI/LLM Testing Mistakes

When AI generates tests, it often makes systematic errors. Read: concepts/llm-testing-mistakes.md

| Mistake | What It Looks Like | Why It's Harmful |
|---------|--------------------|------------------|
| Tautological tests | `expect(mock).toHaveBeenCalled()` after `mock()` | Tests nothing real |
| Over-mocking | Every dependency mocked | Tests mocks, not code |
| Happy path only | No error/edge cases | Misses real failures |
| Testing implementation | Breaks on refactor | Fragile, not behavioral |
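
To make the first and third rows concrete, here is a small Python sketch contrasting a tautological test with behavioral ones. `apply_discount` is a hypothetical function invented for this example, and plain `assert` statements are used so the snippet runs without a test framework.

```python
from unittest.mock import Mock


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_tautological():
    # Anti-pattern: the test calls the mock itself, then asserts it was called.
    # It passes regardless of what apply_discount does.
    mock = Mock()
    mock()
    mock.assert_called()


def test_behavior():
    # Checks an observable result of the real function.
    assert apply_discount(200.0, 25) == 150.0


def test_error_path():
    # Covers a failure case, not just the happy path.
    try:
        apply_discount(100.0, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

Deleting the body of `apply_discount` would leave `test_tautological` green while the other two fail, which is the signal an audit looks for.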

Audit Process

Phase 1: Complexity Source Detection

Goal: Create an exhaustive inventory of everything that needs testing.

1.1 Architecture Detection

Is this a microservices/distributed system?

Read: detection/microservices.md

| Signal | Detection Method |
|--------|------------------|
| Docker Compose | `ls docker-compose*.yml` |
| Kubernetes | `grep -rl --include="*.yaml" "kind: Deployment" .` |
| Service URLs in env | `grep -e "_URL=" -e "_HOST=" .env*` |
| Multiple repos/services | Directory structure analysis |

Output:

```markdown
### Architecture Classification
- Type: [Monolith | Modular Monolith | Microservices | Serverless]
- Services detected: [list with protocols]
- Inter-service communication: [HTTP | gRPC | Message Queue | None]
- Contract testing present: [Yes/No]
```
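
The file-based signals in this step can also be gathered in one pass from a script. This is a rough sketch under stated assumptions: the function name and the result keys are hypothetical, and it covers only the first three signals (directory structure analysis still needs human judgment).

```python
import re
from pathlib import Path

# Matches entries such as BILLING_URL=... or DB_HOST=... in env files.
ENV_SERVICE = re.compile(r"_(URL|HOST)=")


def _read(path: Path) -> str:
    """Read a file as text, returning '' for directories."""
    return path.read_text(errors="ignore") if path.is_file() else ""


def detect_architecture_signals(root: str) -> dict[str, bool]:
    """Check for the file-based distributed-system signals under root."""
    base = Path(root)
    return {
        "docker_compose": any(base.glob("docker-compose*.yml")),
        "kubernetes": any(
            "kind: Deployment" in _read(p) for p in base.rglob("*.yaml")
        ),
        "service_urls_in_env": any(
            ENV_SERVICE.search(_read(p)) for p in base.glob(".env*")
        ),
    }
```

A positive signal is a prompt to look closer, not a classification: a single `docker-compose.yml` that only starts a database does not make a project a microservices system.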

1.2 Data Interaction Detection

What data does this system touch?

Read: detection/data-interactions.md

| Category | Detection |
|----------|-----------|
| Databases | Grep for ORM imports, connection strings |
| Caches | Grep for Redis/Memcached clients |
| File system | Grep for fs/pathlib operations |
| User config | Look for config loading patterns |
| Secrets | Check for secret manager integrations |

Output:

```markdown
### Data Interactions
| Category | Technology | Locations | Tested? |
|----------|------------|-----------|---------|
| Database | PostgreSQL/SQLAlchemy | models/*.py | ✅/❌ |
| Cache | Redis | services/cache.py | ✅/❌ |
| Files | S3/boto3 | storage/*.py | ✅/❌ |
| Config | pydantic/settings | config.py | ✅/❌ |
| Secrets | AWS SecretsManager | auth/*.py | ✅/❌ |
```
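
The "grep for imports" step can be sketched as a scan that yields the first three columns of the table above. The pattern list here is a hypothetical, deliberately short sample for Python sources, and the "Tested?" column is left for the later coverage mapping.

```python
import re

# Hypothetical, non-exhaustive import patterns per category (Python only).
PATTERNS = {
    "Database": re.compile(r"^\s*(?:from|import)\s+(sqlalchemy|psycopg2|django\.db)\b", re.M),
    "Cache": re.compile(r"^\s*(?:from|import)\s+(redis|pymemcache)\b", re.M),
    "Files": re.compile(r"^\s*(?:from|import)\s+(pathlib|shutil|boto3)\b", re.M),
}


def scan_sources(sources: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (category, technology, location) rows for matching imports.

    `sources` maps a file path to its text, so the caller decides how
    files are discovered and read.
    """
    rows = []
    for path, text in sorted(sources.items()):
        for category, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                rows.append((category, match.group(1), path))
    return rows
```

Import scanning misses dynamic imports and connection strings, so the resulting rows are a starting inventory rather than a complete one.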

1.3 External API Detection

What external services does this call?

Read: detection/external-apis.md

| Category | Detection |
|----------|-----------|
| HTTP clients | Grep for requests/axios/fetch |
| Payment SDKs | Grep for stripe/paypal |
| Auth providers | Grep for oauth/auth0/cognito |
| Cloud services | Grep for boto3/gcloud/azure |
| Webhooks | Grep for webhook endpoints |

Output:

```markdown
### External API Integrations
| Service | SDK/Client | Criticality | Error Handling? | Tested? |
|---------|------------|-------------|-----------------|---------|
| Stripe | stripe-python | Critical | ⚠️ Partial | ❌ |
| SendGrid | sendgrid | High | ❌ None | ❌ |
| Auth0 | auth0-python | Critical | ✅ Yes | ✅ |
```
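
For Python sources, a first pass at the "Error Handling?" column could use the standard `ast` module to find external calls that sit outside any `try` block with an `except` handler. This is only a sketch: `EXTERNAL_MODULES` is a hypothetical sample, and the check only sees direct `module.function(...)` calls, so deeper chains such as `client.resource.method(...)` or handling done by a caller are not detected.

```python
import ast

# Hypothetical sample of module names treated as external clients.
EXTERNAL_MODULES = {"requests", "stripe", "boto3"}


def unguarded_external_calls(source: str) -> list[int]:
    """Return line numbers of module.func(...) calls outside try/except."""
    tree = ast.parse(source)

    # Collect every node that lives in the body of a try with handlers.
    guarded = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Try) and node.handlers:
            for stmt in node.body:
                for inner in ast.walk(stmt):
                    guarded.add(id(inner))

    lines = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id in EXTERNAL_MODULES
            and id(node) not in guarded
        ):
            lines.append(node.lineno)
    return sorted(lines)
```

A result of "no unguarded calls" maps to at most "⚠️ Partial" without further review, since an `except` clause that swallows every error is not meaningful error handling.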

1.4 Interactive/User Input Detection

Does this require user interaction for testing?

Read: concepts/interactive-testing.md

| Pattern | Testing Approach |
|---------|------------------|
| CLI prompts | PTY/pexpect testing |
| Shell completions | Completion script testing |
| TUI (full-screen) | Virtual terminal (pyte) |
| Desktop GUI | Platform-specific (Playwright/XCTest) |
| Device-specific | Hardware test farms or mocks |

Output:

```markdown
### Interactive Components
| Component | Type | Can Test in CI? | Current Approach |
|-----------|------|-----------------|------------------|
| Setup wizard | CLI prompts | ✅ (pexpect) | ❌ Untested |
| Tab completion | Shell integration | ✅ (script) | ❌ Untested |
| Dashboard | Full-screen TUI | ⚠️ (with pyte) | ❌ Untested |
```

Phase 2: Detect Project Type & Language

2.1 Project Type Detection

Identify the scenario to set testing expectations:

| Signal | Project Type | Scenario Reference |
|--------|--------------|--------------------|
| bin/, CLI entry point, argparse | CLI Tool | scenarios/cli.md |
| React/Vue/Angular, pages/, components/ | Web Frontend | scenarios/web-frontend.md |
| Express/FastAPI/Rails, routes/ | Web Backend/API | scenarios/web-backend.md |
| Both frontend + backend | Full Stack | scenarios/fullstack.md |
| npm package, library exports | Library/SDK | scenarios/library.md |
| iOS/Android, mobile frameworks | Mobile App | scenarios/mobile.md |
| Dockerfile, k8s manifests, terraform | Infrastructure | scenarios/infrastructure.md |
| agents/, prompts/, LLM calls | AI/Agent System | scenarios/ai-agents.md |
| Airflow DAGs, Spark jobs, ETL | Data Pipeline | scenarios/data-pipelines.md |
| Kafka, WebSockets, real-time streams | Real-time System | scenarios/realtime-systems.md |
| Firmware, HAL, microcontrollers | Embedded/IoT | scenarios/embedded-iot.md |
| Electron, Qt, WPF, native GUI | Desktop App | scenarios/desktop-apps.md |
| manifest.json, Chrome/Firefox extension | Browser Extension | scenarios/browser-extensions.md |
| Unity, Unreal, game engine | Game Development | scenarios/game-development.md |
| Solidity, smart contracts, Web3 | Blockchain/Web3 | scenarios/blockchain.md |

Read the appropriate scenario file for testing expectations specific to that project type.
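
As an illustration of how the signal table might be applied, the sketch below maps a few filesystem markers to project types and scenario files. It covers only a subset of the rows, and treating a single directory or file name as sufficient evidence is a simplifying assumption made for this example. The "Full Stack" rule mirrors the "Both frontend + backend" row.

```python
from pathlib import Path

# Subset of the signal table: (marker path, project type, scenario reference).
# Using one marker per type is an assumption made for illustration.
SIGNALS = [
    ("bin", "CLI Tool", "scenarios/cli.md"),
    ("components", "Web Frontend", "scenarios/web-frontend.md"),
    ("routes", "Web Backend/API", "scenarios/web-backend.md"),
    ("Dockerfile", "Infrastructure", "scenarios/infrastructure.md"),
    ("agents", "AI/Agent System", "scenarios/ai-agents.md"),
]
FULL_STACK = ("Full Stack", "scenarios/fullstack.md")
WEB_PAIR = {"Web Frontend", "Web Backend/API"}


def detect_project_types(root: str) -> list[tuple[str, str]]:
    """Return (project type, scenario reference) pairs detected under root."""
    base = Path(root)
    found = [(ptype, ref) for marker, ptype, ref in SIGNALS if (base / marker).exists()]
    if WEB_PAIR <= {ptype for ptype, _ in found}:
        # Frontend and backend together are reported as Full Stack.
        rest = [item for item in found if item[0] not in WEB_PAIR]
        return [FULL_STACK] + rest
    return found
```

A project can match several rows at once (for example a CLI tool shipped with a Dockerfile), so the function returns a list and leaves the choice of which scenario files to read to the auditor.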

2.2 Language/Framework Detection

Phases 3-6: Forensic Analysis

For the detailed forensic analysis (test inventory, coverage mapping, quality assessment, gap analysis), use the Task tool to spawn the do:test-auditor agent with this prompt:

```
Execute forensic test coverage analysis.

Project: [current working directory]
Framework: [detected framework from Phase 2]
Intensity: [quick|medium|thorough]

Run phases 3-6:
- Phase 3: Test Inventory
- Phase 4: Coverage Mapping
- Phase 5: Quality Assessment
- Phase 6: Gap Analysis

Output: TEST-AUDIT-<timestamp>.md in .agent_planning/
```

The agent will complete the audit report with all remaining phases.


Output Format

The audit produces a comprehensive report:

```markdown
# Test Coverage Audit Report
**Project**: [name]
**Date**: [date]
**Auditor**: Claude

## Executive Summary
**Overall Health**: [Healthy | Needs Work | Critical Gaps]
**Architecture**: [type]
**Coverage Distribution**: Unit n% | Integration n% | E2E n%
**Critical Issues**: [count]

---

## 1. Architecture Analysis
[From Phase 1.1]

## 2. Complexity Source Inventory
### 2.1 Data Interactions
[From Phase 1.2]

### 2.2 External APIs
[From Phase 1.3]

### 2.3 Interactive Components
[From Phase 1.4]

---

## 3. Test Inventory
[From Phase 3]

---

## 4. Coverage Matrix
[From Phase 4]

---

## 5. Quality Assessment
### 5.1 Red Flags Detected
[From Phase 5.1]

### 5.2 Quality Checklist Results
[From Phase 5.2]

---

## 6. Gap Analysis

### P0 - Critical (Must Fix)
[From Phase 6]

### P1 - Significant (Should Fix)
[From Phase 6]

### P2 - Minor (Nice to Have)
[From Phase 6]

---

## 7. Risk Assessment

| Risk | Impact | Likelihood | Current Mitigation |
|------|--------|------------|-------------------|
| Payment failures undetected | High | Medium | ❌ None |
| Auth bypass possible | Critical | Low | ⚠️ Partial |

---

## 8. Appendix

### A. Files Analyzed
[List of all files examined]

### B. Test File Inventory
[Complete list of test files]

### C. Detection Commands Used
[Commands run during audit]
```

Intensity Levels

| Level | Scope | Duration |
|-------|-------|----------|
| Quick | Architecture + high-level gaps | 10-15 min |
| Medium | + Quality assessment + coverage matrix | 30-45 min |
| Thorough | + Test-by-test review + risk analysis | 60-90 min |
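
The leading "+" in the scope entries means each level includes everything from the level before it. A short sketch of that cumulative reading follows. The scope items are copied from the table, while the function and variable names are hypothetical.

```python
# Each level adds its items to the scope of the previous level.
LEVELS = [
    ("quick", ["architecture", "high-level gaps"]),
    ("medium", ["quality assessment", "coverage matrix"]),
    ("thorough", ["test-by-test review", "risk analysis"]),
]


def scope_for(intensity: str) -> list[str]:
    """Return the cumulative scope for an intensity level."""
    scope = []
    for name, added in LEVELS:
        scope.extend(added)
        if name == intensity:
            return scope
    raise ValueError(f"unknown intensity: {intensity!r}")
```

For example, `scope_for("medium")` contains the two quick items followed by the two medium items.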

Reference Documents

Concepts

| Topic | Reference |
|-------|-----------|
| Testing levels defined | concepts/testing-levels.md |
| AI/LLM testing mistakes | concepts/llm-testing-mistakes.md |
| Interactive system testing | concepts/interactive-testing.md |
| Unknown UI testing | concepts/unknown-ui-testing.md |

Detection

| Area | Reference |
|------|-----------|
| Microservices detection | detection/microservices.md |
| Data interaction detection | detection/data-interactions.md |
| External API detection | detection/external-apis.md |

Scenarios (15)

Languages (6)


Integration

This skill is invoked as a dimension of /do:plan audit:

- Trigger: "audit tests", "test coverage audit", "testing audit"
- Can run alongside other audit dimensions

Related Skills

| Skill | Purpose |
|-------|---------|
| test-recommendations | Generate strategic test plan from audit |
| test-implementation-plan | Create execution plan with testability refactoring |
| do:add-tests | Write specific tests |
| do:setup-testing | Set up test framework |
| do:tdd-workflow | Test-first development |