testing-strategies

Systematic testing that builds confidence without over-testing: the right test at the right level.

SKILL.md
--- frontmatter
name: "testing-strategies"
description: "Systematic testing for confidence without over-testing — the right test at the right level"
applyTo: "**/*test*,**/*spec*,**/*.test.*,**/*.spec.*"

Testing Strategies Skill

Test the behavior, not the implementation. Test the boundaries, not just the happy path.

Testing Pyramid

| Level | Volume | Speed | Cost to Maintain | What It Catches |
|---|---|---|---|---|
| Unit | Many (70%) | < 10 ms each | Low | Logic errors, edge cases, regressions |
| Integration | Some (20%) | < 1 s each | Medium | Wiring bugs, API contracts, data flow |
| E2E | Few (10%) | 5-30 s each | High | User journey failures, deployment issues |

Anti-pattern: Inverted pyramid (too many E2E tests, too few unit tests) → slow CI, flaky tests, failures that are hard to debug.

Anti-pattern: Ice cream cone (manual testing on top of everything) → doesn't scale.

Unit Test Pattern (AAA)

typescript
test('should calculate discount when order exceeds $100', () => {
    // Arrange
    const order = createOrder({ subtotal: 150, customerTier: 'gold' });
    
    // Act
    const discount = calculateDiscount(order);
    
    // Assert
    expect(discount).toBe(15); // 10% for gold tier
});

Naming convention: should [expected behavior] when [condition] — reads as a specification.
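
The test above calls `createOrder` and `calculateDiscount` without showing them. A minimal sketch of what they might look like follows. Only the 10% gold-tier rate comes from the test itself. The other tiers, the rates in `DISCOUNT_PERCENT`, and the `DISCOUNT_THRESHOLD` value are assumptions made for illustration.

```typescript
// Hypothetical types and helpers; only the gold-tier rate (10%) comes from the test above.
type CustomerTier = 'standard' | 'silver' | 'gold';

interface Order {
    subtotal: number;
    customerTier: CustomerTier;
}

// Assumed rates in whole percent, so the arithmetic stays exact for whole-dollar subtotals.
const DISCOUNT_PERCENT: Record<CustomerTier, number> = {
    standard: 0,
    silver: 5,
    gold: 10,
};

// Assumed rule: the discount applies only when the subtotal is strictly above $100.
const DISCOUNT_THRESHOLD = 100;

// Test-data builder: sensible defaults, overridden per test.
function createOrder(overrides: Partial<Order> = {}): Order {
    return { subtotal: 0, customerTier: 'standard', ...overrides };
}

function calculateDiscount(order: Order): number {
    if (order.subtotal <= DISCOUNT_THRESHOLD) {
        return 0;
    }
    return (order.subtotal * DISCOUNT_PERCENT[order.customerTier]) / 100;
}
```

Under these assumptions, an order with `subtotal: 100` gets no discount, which makes it a natural boundary case to sit next to the `subtotal: 150` test.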

Test Types Beyond the Pyramid

| Type | Purpose | When to Use | Example |
|---|---|---|---|
| Snapshot | Detect unexpected output changes | UI components, serialized data | `expect(render(<Button/>)).toMatchSnapshot()` |
| Contract | Verify API shape between services | Microservices, public APIs | Pact, OpenAPI validation |
| Property-based | Find edge cases humans miss | Pure functions, parsers, serializers | `fc.assert(fc.property(fc.string(), s => decode(encode(s)) === s))` |
| Mutation | Verify tests actually catch bugs | Critical business logic | Stryker, pitest |
| Performance | Catch regressions in speed/memory | Hot paths, API endpoints | Benchmark before/after |
| Smoke | Verify deployment didn't break basics | Post-deploy, staging | Hit health endpoint + key pages |
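
To make the property-based row concrete, here is a hand-rolled sketch of a round-trip property, written without any library so it stays self-contained. It is not fast-check and has none of its shrinking or generator features. The run-length `encode`/`decode` pair, the seeded generator, and `findCounterexample` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical run-length codec: "aaab" becomes [["a", 3], ["b", 1]].
function encode(s: string): Array<[string, number]> {
    const runs: Array<[string, number]> = [];
    for (let i = 0; i < s.length; i++) {
        const ch = s.charAt(i);
        const last = runs[runs.length - 1];
        if (last !== undefined && last[0] === ch) {
            last[1] += 1;
        } else {
            runs.push([ch, 1]);
        }
    }
    return runs;
}

function decode(runs: Array<[string, number]>): string {
    let out = '';
    for (const [ch, count] of runs) {
        for (let i = 0; i < count; i++) {
            out += ch;
        }
    }
    return out;
}

// Seeded Park-Miller generator so every run sees the same inputs.
// The seed must be an integer between 1 and 2147483646.
function makeRng(seed: number): () => number {
    let state = seed;
    return () => {
        state = (state * 48271) % 2147483647;
        return state / 2147483647;
    };
}

// Small alphabet on purpose: repeated characters are what a run-length codec must handle.
function randomString(rng: () => number, maxLength: number): string {
    const alphabet = 'ab ';
    const length = Math.floor(rng() * (maxLength + 1));
    let out = '';
    for (let i = 0; i < length; i++) {
        out += alphabet.charAt(Math.floor(rng() * alphabet.length));
    }
    return out;
}

// Property: decode(encode(s)) === s. Returns the first failing input, or null if none is found.
function findCounterexample(trials: number, seed: number): string | null {
    const rng = makeRng(seed);
    for (let i = 0; i < trials; i++) {
        const input = randomString(rng, 12);
        if (decode(encode(input)) !== input) {
            return input;
        }
    }
    return null;
}
```

A test would assert that `findCounterexample(200, 42)` returns `null`. In a real project a library such as fast-check is likely the better choice, since it also shrinks failing inputs to a minimal case.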

What to Mock (and What Not To)

| Mock This | Why | Don't Mock This | Why |
|---|---|---|---|
| External HTTP APIs | Unreliable, slow, costly | Your own business logic | You'd be testing your mocks |
| Database in unit tests | Slow, stateful | Database in integration tests | That's the whole point |
| Time (`Date.now`) | Non-deterministic | Pure functions | Already deterministic |
| File system | Side effects | In-memory equivalents | Faster than mocking |
| Random/UUID | Non-deterministic | Framework internals | Not your responsibility |
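
One way to control time without a mocking library is to pass the clock in as a parameter. The sketch below assumes a hypothetical `isExpired` function. The `Clock` type and `fixedClock` are likewise illustrative names, not part of any framework.

```typescript
// A clock is any function that returns the current time in milliseconds.
type Clock = () => number;

// Hypothetical function under test: the clock is a parameter that defaults to the real one.
function isExpired(expiresAtMs: number, now: Clock = Date.now): boolean {
    return now() >= expiresAtMs;
}

// In a test, a fixed clock makes the result repeatable.
const fixedClock: Clock = () => 1000;
const expiredAtBoundary = isExpired(1000, fixedClock); // true: the boundary counts as expired
const notYetExpired = isExpired(1001, fixedClock);     // false
```

Production callers omit the second argument and get `Date.now`, while tests pin the time and can probe the exact boundary.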

Coverage Philosophy

| Range | Interpretation | Action |
|---|---|---|
| < 50% | Probably missing critical paths | Increase |
| 50-70% | Reasonable for most projects | Focus on changed code |
| 70-85% | Good, diminishing returns starting | Maintain, don't chase |
| 85-100% | Often wasteful unless safety-critical | Review if the effort is worth it |

The real metric: Coverage of changed code in each PR, not overall percentage.

What coverage doesn't tell you: That your tests assert the right things. 100% coverage with no assertions is useless.
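
The changed-code metric is simple arithmetic: of the lines a PR touches, what fraction did the tests execute? The sketch below assumes the changed line numbers and the set of covered line numbers have already been extracted from a diff and a coverage report. `changedLineCoverage` is a hypothetical helper, not part of any coverage tool.

```typescript
// Fraction of changed lines that were executed by tests, for a single file.
function changedLineCoverage(changedLines: number[], coveredLines: Set<number>): number {
    if (changedLines.length === 0) {
        return 1; // nothing changed, so nothing is uncovered
    }
    const hits = changedLines.filter((line) => coveredLines.has(line)).length;
    return hits / changedLines.length;
}

// Lines 10-13 were changed; the tests executed lines 1, 2, 10, 11 and 12.
const ratio = changedLineCoverage([10, 11, 12, 13], new Set([1, 2, 10, 11, 12])); // 0.75
```

Like any coverage number, this only shows that the lines ran. Whether the tests assert anything useful about them is a separate question.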

What NOT to Test

| Don't Test | Why | Instead |
|---|---|---|
| Third-party library internals | Not your code | Trust it (or pick a different library) |
| Framework behavior | Already tested upstream | Test your code that uses the framework |
| Private implementation details | Breaks on refactor | Test the public interface |
| Trivial getters/setters | No logic to break | Only if they have side effects |
| Generated code | Changes on regeneration | Test the generator, not the output |
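
As an illustration of testing the public interface rather than private details, consider the hypothetical `Cart` class below. The check goes only through `add` and `total`, so it would keep passing if the private `items` array were later replaced by a map or a running sum.

```typescript
// Hypothetical class used only for illustration.
class Cart {
    // Private detail: free to change without breaking behavior-focused tests.
    private items: Array<{ price: number; quantity: number }> = [];

    add(price: number, quantity: number): void {
        this.items.push({ price, quantity });
    }

    total(): number {
        let sum = 0;
        for (const item of this.items) {
            sum += item.price * item.quantity;
        }
        return sum;
    }
}

// Behavior-focused check: uses the public interface only.
function cartTotalsLineItems(): boolean {
    const cart = new Cart();
    cart.add(10, 2);
    cart.add(5, 1);
    return cart.total() === 25;
}
```

A check that inspected the length or shape of `items` instead would fail on a harmless refactor while saying nothing about whether totals are still correct.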

Test Quality Signals

| Good Test | Bad Test |
|---|---|
| Fails when behavior breaks | Fails when implementation changes |
| One clear reason to fail | Multiple assertions testing different things |
| Self-contained | Depends on other test order |
| Fast (< 100 ms unit) | Slow due to unnecessary setup |
| Readable as documentation | Requires reading source to understand |
| Deterministic | Flaky (passes sometimes) |

TDD Cycle

| Step | Action | Common Mistake |
|---|---|---|
| Red | Write a failing test | Writing too much test at once (test only the next small behavior) |
| Green | Make it pass with minimal code | Over-engineering the solution |
| Refactor | Clean up while green | Skipping this step (accumulates debt) |

TDD is not always the right choice: It works best for well-understood requirements. For exploratory code, write tests after the design stabilizes.

Flaky Test Triage

| Pattern | Likely Cause | Fix |
|---|---|---|
| Fails 1 in 10 runs | Timing/race condition | Add proper waits, remove shared state |
| Fails only in CI | Environment difference | Pin versions, use containers |
| Fails after another test | Test pollution | Isolate setup/teardown |
| Fails on slow machines | Hardcoded timeouts | Use retry with backoff or event-based waits |
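
For the hardcoded-timeout row, one common alternative to a fixed sleep is to poll a condition until it holds or a deadline passes. The sketch below is a minimal version with a constant polling interval. `waitFor`, `WaitOptions`, and `sleep` are hypothetical helpers written for this example, although several test frameworks ship utilities with a similar shape.

```typescript
interface WaitOptions {
    timeoutMs: number;  // give up after this long
    intervalMs: number; // pause between checks
}

const sleep = (ms: number): Promise<void> =>
    new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves as soon as `condition` returns true; rejects once the deadline has passed.
async function waitFor(condition: () => boolean, options: WaitOptions): Promise<void> {
    const deadline = Date.now() + options.timeoutMs;
    while (!condition()) {
        if (Date.now() >= deadline) {
            throw new Error(`condition not met within ${options.timeoutMs}ms`);
        }
        await sleep(options.intervalMs);
    }
}
```

A test then awaits something like `waitFor(() => queue.isEmpty(), { timeoutMs: 2000, intervalMs: 20 })`, where `queue` is again a placeholder. On a fast machine the wait ends almost immediately, and on a slow one it uses as much of the budget as it needs, so the timeout becomes an upper bound rather than a fixed delay.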

Synapses

See synapses.json for connections.