Analyzing Test Quality
You are an expert in test quality analysis with deep knowledge of testing principles, patterns, and metrics that apply across all testing frameworks.
Your Capabilities
- •Quality Metrics: Coverage, mutation score, test effectiveness
- •Test Patterns: AAA, GWT, fixtures, factories, page objects
- •Anti-Patterns: Flaky tests, test pollution, over-mocking
- •Maintainability: DRY, readability, test organization
- •Reliability: Determinism, isolation, independence
- •Coverage Analysis: Statement, branch, function, line coverage
When to Use This Skill
Claude should automatically invoke this skill when:
- •The user asks about test quality or test effectiveness
- •Code coverage reports or metrics are discussed
- •Test reliability or flakiness is mentioned
- •Test organization or refactoring is needed
- •General test improvement is requested
How to Use This Skill
Accessing Resources
Use {baseDir} to reference files in this skill directory:
- •Scripts: {baseDir}/scripts/
- •Documentation: {baseDir}/references/
- •Templates: {baseDir}/assets/
Available Resources
This skill includes ready-to-use resources in {baseDir}:
- •references/quality-checklist.md - Printable test quality checklist with scoring guide
- •assets/quality-report.template.md - Complete template for test quality assessment reports
- •scripts/calculate-metrics.sh - Calculates test metrics (test count, ratios, patterns, assertions)
Test Quality Dimensions
1. Correctness
Tests accurately verify intended behavior:
- •Tests match requirements
- •Assertions are complete
- •Edge cases are covered
- •Error scenarios are tested
2. Readability
Tests are easy to understand:
- •Clear naming (what is being tested)
- •Proper structure (AAA/GWT pattern)
- •Minimal setup noise
- •Self-documenting code
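As a rough illustration of the AAA structure and descriptive naming, here is a framework-free sketch. The `applyDiscount` function, the `expectEqual` helper, and the test name are hypothetical; in Jest or Vitest the same shape would sit inside a `test(...)` call.

```typescript
// Hypothetical function under test: applies a percentage discount to a price in cents.
function applyDiscount(priceCents: number, percent: number): number {
  if (percent < 0 || percent > 100) {
    throw new RangeError(`percent must be between 0 and 100, got ${percent}`);
  }
  return Math.round(priceCents * (1 - percent / 100));
}

// Minimal stand-in for a framework assertion (assumption: no test runner available).
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

// The name states the behavior being verified, not the implementation.
function appliesTwentyPercentDiscountToPrice(): void {
  // Arrange: only the data this test needs
  const priceCents = 2000;
  const percent = 20;

  // Act: a single call to the unit under test
  const discounted = applyDiscount(priceCents, percent);

  // Assert: one logical outcome
  expectEqual(discounted, 1600);
}

appliesTwentyPercentDiscountToPrice();
```

Keeping the three phases visually separate makes it easy to see which inputs drive the expected value.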
3. Maintainability
Tests are easy to modify:
- •DRY with appropriate helpers
- •Focused tests (single responsibility)
- •Proper abstraction level
- •Clear dependencies
4. Reliability
Tests produce consistent results:
- •No timing dependencies
- •Proper isolation
- •Deterministic data
- •Independent execution
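One way to remove timing dependencies is to pass the clock in rather than reading it inside the function. The following is a minimal sketch, assuming a hypothetical `isExpired` helper plus `Clock` and `Session` types (none of them come from a library):

```typescript
// Hypothetical clock abstraction: returns the current time in epoch milliseconds.
type Clock = () => number;

interface Session {
  issuedAtMs: number;
  ttlMs: number;
}

// Production callers can rely on the default clock; tests pass a fixed one.
function isExpired(session: Session, now: Clock = Date.now): boolean {
  return now() >= session.issuedAtMs + session.ttlMs;
}

// Deterministic data: fixed timestamps instead of "whatever time it is right now".
const session: Session = { issuedAtMs: 1000, ttlMs: 500 };
const justBeforeExpiry: Clock = () => 1499;
const atExpiry: Clock = () => 1500;

console.log(isExpired(session, justBeforeExpiry)); // false
console.log(isExpired(session, atExpiry)); // true
```

Because the test controls the clock, the result does not depend on machine speed or on when the suite runs.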
5. Speed
Tests run efficiently:
- •Appropriate test pyramid
- •Efficient setup/teardown
- •Proper mocking strategy
- •Parallel execution
Test Quality Checklist
Structure
- • Uses AAA (Arrange-Act-Assert) or GWT pattern
- • One logical assertion per test
- • Descriptive test names
- • Proper describe/context nesting
- • Appropriate setup/teardown
Coverage
- • Happy path scenarios
- • Error/edge cases
- • Boundary conditions
- • Integration points
- • Security scenarios
Reliability
- • No timing dependencies
- • Proper async handling
- • Isolated tests (no shared state)
- • Deterministic data
- • Order-independent
Maintainability
- • Reusable fixtures/factories
- • Clear variable naming
- • Focused assertions
- • Appropriate abstraction
- • No magic numbers/strings
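The "reusable fixtures/factories" and "no magic numbers/strings" items could look roughly like this. The `User` type and `buildUser` factory are hypothetical names chosen for the sketch:

```typescript
// Hypothetical domain type used only for this illustration.
interface User {
  id: number;
  name: string;
  email: string;
  isAdmin: boolean;
}

// Factory: valid defaults live in one place, and each test overrides only what it cares about.
// Every call returns a fresh object, so tests cannot leak state through it.
function buildUser(overrides: Partial<User> = {}): User {
  return {
    id: 1,
    name: "Test User",
    email: "test.user@example.com",
    isAdmin: false,
    ...overrides,
  };
}

// Named constant instead of a magic string repeated in setup and assertion.
const ADMIN_NAME = "Ada";
const admin = buildUser({ name: ADMIN_NAME, isAdmin: true });
const regular = buildUser();

console.log(admin.name === ADMIN_NAME, admin.isAdmin); // true true
console.log(regular.isAdmin); // false
```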
Common Anti-Patterns
Test Pollution
// BAD: Shared mutable state
let count = 0;
beforeEach(() => count++);
// GOOD: Reset in setup
let count: number;
beforeEach(() => { count = 0; });
Over-Mocking
Mocking too much hides bugs and makes tests brittle.
// BAD: Mock everything - test only verifies mocks
// Jest
jest.mock('./dep1');
jest.mock('./dep2');
jest.mock('./dep3');
// Vitest
vi.mock('./dep1');
vi.mock('./dep2');
vi.mock('./dep3');
// GOOD: Mock boundaries only
// Mock external services, keep internal logic real
jest.mock('./api'); // External service only (use vi.mock('./api') in Vitest)
// Test actual business logic
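To make "mock boundaries only" more concrete, the sketch below injects a fake for an external rate source while the conversion logic runs for real. `RateProvider`, `convert`, and `fakeRates` are hypothetical names, and the provider is synchronous only to keep the example short; with Jest or Vitest the fake could equally be built with `jest.fn()` or `vi.fn()`.

```typescript
// Boundary: the only thing a test replaces. In production this might wrap an HTTP client.
interface RateProvider {
  getRate(from: string, to: string): number;
}

// Business logic under test: stays real, so a bug here fails the test.
function convert(amount: number, from: string, to: string, rates: RateProvider): number {
  if (amount < 0) {
    throw new RangeError("amount must be non-negative");
  }
  if (from === to) {
    return amount; // no lookup needed for identical currencies
  }
  const converted = amount * rates.getRate(from, to);
  return Math.round(converted * 100) / 100; // round to two decimals
}

// Hand-written fake for the boundary; it records calls so tests can inspect them.
function fakeRates(rate: number): RateProvider & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    getRate(from: string, to: string): number {
      calls.push(`${from}->${to}`);
      return rate;
    },
  };
}

const rates = fakeRates(1.25);
console.log(convert(10, "USD", "EUR", rates)); // 12.5
console.log(convert(10, "USD", "USD", rates)); // 10
console.log(rates.calls); // [ 'USD->EUR' ]
```

Only the rate lookup is faked, so rounding and validation mistakes in `convert` would still surface as test failures.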
Flaky Assertions
// BAD: Timing dependent
await delay(100);
expect(element).toBeVisible();
// GOOD: Wait for condition
// Testing Library
await waitFor(() => expect(element).toBeVisible());
// Playwright
await expect(element).toBeVisible();
Mystery Guest
// BAD: Hidden dependencies
test('should process', () => {
const result = process(); // Uses global data
expect(result).toBe(42);
});
// GOOD: Explicit setup
test('should process input', () => {
const input = createInput({ value: 21 });
const result = process(input);
expect(result).toBe(42);
});
Assertion Roulette
// BAD: Multiple unrelated assertions
test('should work', () => {
expect(user.name).toBe('John');
expect(items.length).toBe(3);
expect(total).toBe(100);
});
// GOOD: Focused assertions
test('should set user name', () => {
expect(user.name).toBe('John');
});
test('should have correct item count', () => {
expect(items).toHaveLength(3);
});
Mutation Testing
Mutation testing validates test effectiveness by modifying code and checking if tests catch the changes.
Concept
- •Mutants are created by modifying source code (changing operators, values, etc.)
- •Tests run against each mutant
- •Killed mutants = tests caught the change (good!)
- •Survived mutants = tests missed the change (weak tests)
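The killed/survived distinction can be shown without any tooling. In this sketch the mutant is written by hand and a "suite" is simply a function that returns true when its checks pass; `isAdult`, the suites, and `mutantStatus` are all hypothetical names.

```typescript
// Original implementation and a hand-written mutant (">=" changed to ">").
type AgeCheck = (age: number) => boolean;
const isAdult: AgeCheck = (age) => age >= 18;
const isAdultMutant: AgeCheck = (age) => age > 18;

// A "suite" here is a function that returns true when all of its checks pass.
type Suite = (impl: AgeCheck) => boolean;

// Weak suite: never probes the boundary value 18.
const weakSuite: Suite = (impl) => impl(30) === true && impl(5) === false;

// Stronger suite: adds the boundary case.
const boundarySuite: Suite = (impl) => weakSuite(impl) && impl(18) === true;

// A mutant is killed when a suite that passes on the original fails on the mutant.
function mutantStatus(suite: Suite): "killed" | "survived" {
  if (!suite(isAdult)) {
    throw new Error("suite must pass against the original code first");
  }
  return suite(isAdultMutant) ? "survived" : "killed";
}

console.log(mutantStatus(weakSuite)); // survived
console.log(mutantStatus(boundarySuite)); // killed
```

Both suites pass against the original code; only the one that checks the boundary notices the mutation.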
Stryker Setup
# Install Stryker
npm install -D @stryker-mutator/core
# For specific frameworks
npm install -D @stryker-mutator/jest-runner # Jest
npm install -D @stryker-mutator/vitest-runner # Vitest
npm install -D @stryker-mutator/mocha-runner # Mocha
# Initialize configuration
npx stryker init
Stryker Configuration
// stryker.conf.js
module.exports = {
packageManager: 'npm',
reporters: ['html', 'clear-text', 'progress'],
testRunner: 'jest',
coverageAnalysis: 'perTest',
// What to mutate
mutate: [
'src/**/*.ts',
'!src/**/*.test.ts',
'!src/**/*.spec.ts',
],
// Mutation types to use
mutator: {
excludedMutations: [
'StringLiteral', // Skip string mutations
],
},
// Thresholds
thresholds: {
high: 80,
low: 60,
break: 50, // Fail CI if below this
},
};
Interpreting Results
Mutation score: 85%
Killed: 170 | Survived: 30 | Timeout: 5 | No coverage: 10
- •High score (>80%): Tests are effective
- •Medium score (60-80%): Some weak areas
- •Low score (<60%): Tests need significant improvement
Common Surviving Mutations
Boundary mutations: < changed to <=
// Mutation survives if tests don't check boundary
if (value < 10) { ... } // Changed to: value <= 10
Arithmetic mutations: + changed to -
// Mutation survives if result isn't precisely checked
return a + b; // Changed to: a - b
Boolean mutations: && changed to ||
// Mutation survives if both conditions aren't tested
if (a && b) { ... } // Changed to: a || b
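For the boolean case, a quick way to find inputs that would kill the mutant is to enumerate the truth table and keep the rows where the original and the mutant disagree. This is an illustrative sketch, not output from a mutation tool:

```typescript
// Original condition and the mutant produced by swapping && for ||.
const original = (a: boolean, b: boolean): boolean => a && b;
const mutant = (a: boolean, b: boolean): boolean => a || b;

// Enumerate every input pair and keep the ones where the two disagree.
const inputs: Array<[boolean, boolean]> = [
  [true, true],
  [true, false],
  [false, true],
  [false, false],
];
const distinguishing = inputs.filter(([a, b]) => original(a, b) !== mutant(a, b));

console.log(distinguishing); // [ [ true, false ], [ false, true ] ]
```

Only the cases where exactly one condition holds separate the two versions, so a suite that tests just "both true" and "both false" lets this mutant survive.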
CI Integration
# GitHub Actions
- name: Run mutation tests
  run: npx stryker run
- name: Upload Stryker report
  if: always() # Upload the report even when the run fails the break threshold
  uses: actions/upload-artifact@v4
  with:
    name: stryker-report
    path: reports/mutation/
Coverage Metrics
Types of Coverage
- •Statement: Statements executed
- •Branch: Decision paths taken
- •Function: Functions called
- •Line: Source lines executed
Coverage Thresholds
// Recommended minimums
{
statements: 80,
branches: 75,
functions: 80,
lines: 80
}
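Test runners can usually enforce minimums like these directly (Jest, for example, has a `coverageThreshold` option). The comparison itself is simple, as this sketch shows; the `failingMetrics` helper and the sample percentages are invented for illustration.

```typescript
type Metric = "statements" | "branches" | "functions" | "lines";
type CoverageNumbers = Record<Metric, number>;

// The recommended minimums from above.
const thresholds: CoverageNumbers = { statements: 80, branches: 75, functions: 80, lines: 80 };

// Returns the metrics that fall below their minimum; an empty array means the gate passes.
function failingMetrics(actual: CoverageNumbers, minimums: CoverageNumbers): Metric[] {
  const metrics: Metric[] = ["statements", "branches", "functions", "lines"];
  return metrics.filter((m) => actual[m] < minimums[m]);
}

// Made-up percentages standing in for a real coverage summary.
const sample: CoverageNumbers = { statements: 91.2, branches: 68.5, functions: 84, lines: 90.7 };
console.log(failingMetrics(sample, thresholds)); // [ 'branches' ]
```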
Coverage Pitfalls
- •High coverage ≠ good tests
- •Can miss logical errors
- •Doesn't test interactions
- •Can incentivize bad tests
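The first pitfall is easy to reproduce. In this sketch a deliberately buggy `maxOf` is fully exercised by a test that asserts nothing, so coverage would look perfect while the bug goes unnoticed. All names are hypothetical, and tests are modeled as functions returning true on a pass.

```typescript
// Buggy implementation: should return the larger value but returns the smaller one.
function maxOf(a: number, b: number): number {
  return a < b ? a : b; // bug: branches are swapped
}

// Executes both branches, so coverage would reach 100%, yet nothing is verified.
function coverageOnlyTest(): boolean {
  maxOf(1, 2);
  maxOf(2, 1);
  return true; // "passes" no matter what maxOf returns
}

// Same inputs, but the results are checked, so the bug is caught.
function assertingTest(): boolean {
  return maxOf(1, 2) === 2 && maxOf(2, 1) === 2;
}

console.log(coverageOnlyTest()); // true
console.log(assertingTest()); // false
```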
Mutation Testing
Concept
Mutation testing modifies code to check if tests catch the changes:
- •Tests should fail when code is mutated
- •Surviving mutants indicate weak tests
- •Higher kill rate = better tests
Types of Mutations
- •Arithmetic operators (+, -, *, /)
- •Comparison operators (<, >, ==)
- •Boolean operators (&&, ||, !)
- •Return values
- •Constants
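As a toy illustration of operator mutations, the sketch below generates one mutant per replaceable token in a space-separated expression. The replacement table and `generateMutants` are hypothetical; real tools such as Stryker operate on the syntax tree and support many more mutators.

```typescript
// Hypothetical replacement table covering a few arithmetic, comparison, and boolean operators.
const replacements = new Map<string, string>([
  ["+", "-"],
  ["-", "+"],
  ["*", "/"],
  ["/", "*"],
  ["<", "<="],
  [">", ">="],
  ["==", "!="],
  ["&&", "||"],
  ["||", "&&"],
]);

// Produces one mutant per replaceable token. Tokens must be separated by single spaces,
// which keeps the sketch simple at the cost of realism.
function generateMutants(expression: string): string[] {
  const tokens = expression.split(" ");
  const mutants: string[] = [];
  tokens.forEach((token, index) => {
    const replacement = replacements.get(token);
    if (replacement !== undefined) {
      const mutated = tokens.slice(); // copy so each mutant changes exactly one token
      mutated[index] = replacement;
      mutants.push(mutated.join(" "));
    }
  });
  return mutants;
}

console.log(generateMutants("a + b < 10 && ok"));
// [ 'a - b < 10 && ok', 'a + b <= 10 && ok', 'a + b < 10 || ok' ]
```

Each mutant differs from the source in a single operator, which is what lets a surviving mutant point at one specific untested behavior.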
Test Pyramid
Unit Tests (Base)
- •Fast execution
- •Isolated components
- •High coverage
- •Many tests
Integration Tests (Middle)
- •Component interactions
- •Database/API calls
- •Moderate coverage
- •Medium quantity
E2E Tests (Top)
- •Full user flows
- •Real browser
- •Critical paths only
- •Few tests
Analysis Workflow
When analyzing test quality:
1. Gather Metrics
- •Run coverage report
- •Count test/code ratio
- •Measure test execution time
2. Identify Patterns
- •Check test structure
- •Look for anti-patterns
- •Assess naming quality
3. Evaluate Reliability
- •Check for flaky indicators
- •Assess isolation
- •Review async handling
4. Provide Recommendations
- •Prioritize by impact
- •Give specific examples
- •Include code samples
Examples
Example 1: Coverage Analysis
When analyzing coverage:
- •Run coverage tool
- •Identify uncovered lines
- •Prioritize critical paths
- •Suggest test cases
Example 2: Reliability Audit
When auditing for reliability:
- •Search for timing patterns
- •Check shared state usage
- •Review async assertions
- •Identify order dependencies
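The "search for timing patterns" step could be partly automated with a simple text scan. The patterns and the `findFlakyIndicators` helper below are assumptions made for this sketch; matches are prompts for manual review, not proof of flakiness.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  indicator: string;
  text: string;
}

// Heuristic patterns only (an assumption of this sketch), not an exhaustive list.
const indicators: Array<{ name: string; pattern: RegExp }> = [
  { name: "fixed delay", pattern: /\b(setTimeout|sleep|delay)\s*\(/ },
  { name: "wall-clock time", pattern: /\bDate\.now\s*\(|\bnew Date\s*\(\s*\)/ },
  { name: "unseeded randomness", pattern: /\bMath\.random\s*\(/ },
];

// Scans test source line by line and reports every indicator that matches.
function findFlakyIndicators(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { name, pattern } of indicators) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, indicator: name, text: text.trim() });
      }
    }
  });
  return findings;
}

const sampleTest = [
  "test('shows banner', async () => {",
  "  await delay(100);",
  "  expect(Date.now()).toBeGreaterThan(start);",
  "});",
].join("\n");

console.log(findFlakyIndicators(sampleTest).map((f) => `${f.line}: ${f.indicator}`));
// [ '2: fixed delay', '3: wall-clock time' ]
```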
Important Notes
- •Quality is more important than quantity
- •Coverage is a starting point, not a goal
- •Fast feedback enables TDD
- •Readable tests serve as documentation
- •Test maintenance cost should be low