Test Planning Skill
Guidelines for planning, implementing, and fixing tests at component and container level within container boundaries.
Overview
This skill provides guidance for the test-focused pipeline:
- •/plan-test-setup → Analyze specs, plan tests
- •/implement-test-setup → Implement planned tests
- •/fix-test-setup → Fix failing tests
All tests are scoped to container boundaries only - no cross-container or external service tests.
Test Prioritization Philosophy
CRITICAL: Tests are prioritized by how directly they validate that the container fulfills its business logic:
Priority Hierarchy (Highest to Lowest)
| Priority | Test Type | Purpose |
|---|---|---|
| P0 | Container Responsibility Tests | Validate container fulfills its assigned responsibilities from input to expected output |
| P0 | End-to-End Flow Tests | Test complete data paths through the container (entry point → processing → output) |
| P1 | Business Logic Integration Tests | Tests covering large chunks of business functionality aligned with implementation |
| P1 | Critical Path Tests | Core business logic that must work for the container to be useful |
| P2 | Error Handling Tests | Graceful failure handling and error propagation |
| P3 | Component Unit Tests | Individual class tests (can be OMITTED at container level when covered by integration) |
Design Principles
- •Black Box Validation: Tests should validate the container works as a black box - external testers should NOT worry about internal integration issues
- •Input-to-Output Testing: Prioritize tests that exercise the complete flow from data entry to expected output
- •Business Logic First: Tests should be structured around business responsibilities, not technical components
- •Integration Over Unit: At container level, integration tests that cover multiple components are more valuable than isolated unit tests
- •Unit Tests Optional: Unit tests can be omitted at container level if the same logic is covered by integration tests
Core Principle: Container Boundary Scope
All tests must be runnable within a single container, with every external dependency replaced by a mock.
In Scope (Prioritized)
| Priority | Test Type | Description | Allowed Dependencies |
|---|---|---|---|
| P0 | Container Responsibility Tests | Validates each container responsibility end-to-end | Mocked external services |
| P0 | Input-to-Output Flow Tests | Complete data path from entry to output | Mocked external services |
| P1 | Business Logic Integration | Tests across components within container | Same container only |
| P1 | Critical Path Tests | Core functionality tests | Same container only |
| P2 | Error Handling Tests | Graceful failure handling | Same container only |
| P3 | Unit Tests | Individual class/function tests | Mocked internal deps only |
| P3 | Infrastructure Tests | Startup, health checks | Container only |
Out of Scope (→ /e2e-test)
| Test Type | Why Out of Scope |
|---|---|
| Cross-Container | Requires other containers |
| Database Integration | Requires real database |
| External API | Requires live services |
| E2E Tests | Requires full system |
Test ID Conventions
Use consistent IDs for traceability:
| Type | Format | Example | Priority | Used In |
|---|---|---|---|---|
| Responsibility Test | RT-{nnn} | RT-001 | P0 | Container responsibility validation |
| Flow Test | FT-{nnn} | FT-001 | P0 | Input-to-output flow tests |
| Business Logic Test | BL-{nnn} | BL-001 | P1 | Business logic integration |
| Error Handling Test | EH-{nnn} | EH-001 | P2 | Error scenario tests |
| Unit Test | UT-{nnn} | UT-001 | P3 | Component tests |
| Infrastructure Test | INF-{nnn} | INF-001 | P3 | Startup, health checks |
| Fixture | FIX-{nnn} | FIX-001 | - | Test infrastructure |
| Integration Test (legacy) | IT-{nnn} | IT-001 | P1 | Component integration |
| Container Integration (legacy) | CIT-{nnn} | CIT-001 | varies | Container tests |
| External Integration | EIT-{nnn} | EIT-001 | - | Out of scope marker |
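As a rough illustration of how these IDs support traceability, the sketch below parses an ID and looks up the priority from the table. The helper name `parse_test_id` and the reading of `{nnn}` as exactly three digits are assumptions made for this example, not part of the skill.

```python
import re

# Prefix -> priority, copied from the ID conventions table.
# None means the table gives no fixed priority ("-" or "varies").
ID_PRIORITIES = {
    "RT": "P0", "FT": "P0", "BL": "P1", "EH": "P2",
    "UT": "P3", "INF": "P3", "FIX": None,
    "IT": "P1", "CIT": None, "EIT": None,
}

# Assumption: {nnn} means exactly three digits, as in RT-001.
_ID_PATTERN = re.compile(r"^([A-Z]+)-(\d{3})$")


def parse_test_id(test_id):
    """Return (prefix, number, priority); raise ValueError for unknown IDs."""
    match = _ID_PATTERN.match(test_id)
    if match is None or match.group(1) not in ID_PRIORITIES:
        raise ValueError(f"Unrecognized test ID: {test_id!r}")
    prefix, number = match.group(1), int(match.group(2))
    return prefix, number, ID_PRIORITIES[prefix]
```

A plan linter could call this on every ID column to catch typos such as `RT-1` or an unknown prefix before tests are implemented.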
Test Plan Structure (.test-plan.md)
Component Test Plan
markdown
# Test Plan - {Component Name}
**Generated**: {timestamp}
**Component**: {container}/{component}
**Status**: {Not Started | In Progress | Complete}
## Summary
| Metric | Count |
|--------|-------|
| Specified Unit Tests | {n} |
| Existing Unit Tests | {n} |
| Missing Unit Tests | {n} |
| Coverage Target | {%} |
## 1. Unit Tests
### 1.1 {ClassName} Tests
| ID | Test Case | Status | Priority | Mocks Needed |
|----|-----------|--------|----------|--------------|
| UT-001 | {description} | [ ] | P0 | {dependencies} |
## 2. Integration Tests
| ID | Scenario | Status | Priority |
|----|----------|--------|----------|
| IT-001 | {description} | [ ] | P1 |
## 3. Fixtures & Mocks Required
| ID | Fixture/Mock | Purpose | Status |
|----|--------------|---------|--------|
| FIX-001 | {name} | {purpose} | [ ] |
## 4. Implementation Order
1. Phase 1: Fixtures
2. Phase 2: Core Unit Tests
3. Phase 3: Error Case Tests
4. Phase 4: Integration Tests
## 5. Out of Scope
| Test | Dependency | Reason |
|------|------------|--------|
| {test} | {external dep} | Cross-container |
Container Test Plan
markdown
# Test Plan - {Container Name} Integration
**Generated**: {timestamp}
**Container**: {container}
**Status**: {Not Started | In Progress | Complete}
## Container Responsibilities
From `.arch-registry/containers/{container}.md`:
| ID | Responsibility | Description |
|----|----------------|-------------|
| RESP-001 | {Responsibility 1} | {What this responsibility delivers} |
| RESP-002 | {Responsibility 2} | {What this responsibility delivers} |
## Summary
| Metric | Count |
|--------|-------|
| Container Responsibilities | {n} |
| P0 Responsibility Tests | {n} |
| P0 Flow Tests | {n} |
| P1 Business Logic Tests | {n} |
| P2 Error Handling Tests | {n} |
| P3 Unit/Infrastructure Tests | {n} |
## 1. Container Responsibility Tests (P0)
**Purpose**: Validate the container fulfills each assigned responsibility end-to-end.
| ID | Responsibility | Test Case | Input | Expected Output | Status |
|----|----------------|-----------|-------|-----------------|--------|
| RT-001 | RESP-001 | {Test description} | {Sample input} | {Expected output} | [ ] |
**Black Box Guarantee**: When these tests pass, external testers can trust the container fulfills its responsibilities.
## 2. Input-to-Output Flow Tests (P0)
**Purpose**: Test complete data flow from entry point to expected output.
| ID | Entry Point | Flow Description | Input | Expected Output | Status |
|----|-------------|------------------|-------|-----------------|--------|
| FT-001 | {API endpoint/handler} | {Complete flow description} | {Input data} | {Output data} | [ ] |
## 3. Business Logic Integration Tests (P1)
| ID | Business Logic | Test Case | Components Involved | Status |
|----|----------------|-----------|---------------------|--------|
| BL-001 | {Business rule} | {Test description} | {list} | [ ] |
## 4. Error Handling Tests (P2)
| ID | Error Scenario | Expected Behavior | Status |
|----|----------------|-------------------|--------|
| EH-001 | {Invalid input} | {Error response} | [ ] |
## 5. Infrastructure Tests (P3)
| ID | Test Case | Status |
|----|-----------|--------|
| INF-001 | Container starts successfully | [ ] |
| INF-002 | Health endpoint responds 200 | [ ] |
## 6. Unit Tests (P3 - Optional)
**Note**: Unit tests can be OMITTED if the same logic is covered by P0/P1 integration tests.
| ID | Class | Test Case | Covered By Integration? | Status |
|----|-------|-----------|------------------------|--------|
| UT-001 | {Class} | {Test} | Yes - RT-001 | [SKIP] |
| UT-002 | {Class} | {Test} | No | [ ] |
## 7. Test Infrastructure
| ID | Fixture/Mock | Purpose | Status |
|----|--------------|---------|--------|
| FIX-001 | mock_{external_service} | Mock external API | [ ] |
## Black Box Guarantee
**When all P0 tests pass, the following is guaranteed**:
- ✓ Container fulfills ALL assigned responsibilities
- ✓ Data flows correctly from input to expected output
- ✓ External testers can trust the container works without worrying about internal integration
Priority Levels
| Level | Meaning | Test Types | Business Impact |
|---|---|---|---|
| P0 | Critical - validates container responsibilities | Responsibility tests, Flow tests | Container trust depends on these |
| P1 | High - critical business logic | Business logic integration, Critical path | Core functionality coverage |
| P2 | Normal - error scenarios | Error handling, Edge cases | Graceful failure handling |
| P3 | Low - can be omitted | Unit tests, Infrastructure | Often redundant with P0/P1 |
P0 Tests - Black Box Guarantee
P0 tests are special:
- •They validate the container fulfills its assigned responsibilities
- •They test complete flows from input to output
- •When P0 tests pass, external testers can TRUST the container works
- •P0 failures are CRITICAL - container may not be fulfilling its purpose
Unit Tests at Container Level
Important: Unit tests (P3) can be OMITTED at container level if any of the following holds:
- •The same logic is covered by P0 responsibility tests
- •The same logic is covered by P0 flow tests
- •The same logic is covered by P1 integration tests
Mark as [SKIP] with reference to the covering integration test.
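One possible way to carry the [SKIP] marker into code is a skip decorator whose reason names the covering test. The sketch below uses the standard-library `unittest.skip` so that it runs without extra packages; `apply_discount` and the test names are hypothetical.

```python
import io
import unittest


def apply_discount(total, percent):
    """Hypothetical unit under test."""
    return round(total * (1 - percent / 100), 2)


class TestApplyDiscount(unittest.TestCase):
    @unittest.skip("UT-001 [SKIP]: covered by RT-001")
    def test_ut001_valid_discount_returns_reduced_total(self):
        self.assertEqual(apply_discount(100.0, 10), 90.0)

    def test_ut002_zero_discount_returns_same_total(self):
        # Not covered by any integration test, so it stays active.
        self.assertEqual(apply_discount(100.0, 0), 100.0)


def run_suite():
    """Run the test case quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestApplyDiscount)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

Keeping the covering test ID in the skip reason means the test report itself shows why the unit test was omitted.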
Test Implementation Patterns
P0: Responsibility Test Pattern (HIGHEST PRIORITY)
code
// Generic pattern - adapt to language
// Tests that the container fulfills a specific responsibility
// Responsibility: "Process incoming orders and produce shipping labels"
// Setup - configure container with mocked external dependencies
container = TestContainer()
container.mock_database(InMemoryDatabase())
container.mock_external_api(MockShippingProvider())
container.wire_components()
// Input - data enters the container
input = create_valid_order(items=[...], address={...})
// Execute - exercise the COMPLETE flow through the container
result = container.process_order(input)
// Verify - the RESPONSIBILITY is fulfilled
assert result.shipping_label is not None
assert result.status == "ready_for_shipment"
// BLACK BOX: Internal implementation details don't matter
// Only the outcome (responsibility fulfilled) matters
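For readers who want a concrete starting point, here is a minimal Python sketch of the same pattern. `OrderContainer`, `InMemoryOrderStore`, and `MockShippingProvider` are hypothetical stand-ins invented for this example; a real container exposes its own entry points and wiring.

```python
class InMemoryOrderStore:
    """Stands in for the real database."""

    def __init__(self):
        self.orders = {}

    def save(self, order):
        self.orders[order["id"]] = order


class MockShippingProvider:
    """Stands in for the external shipping API; returns a canned label."""

    def create_label(self, address):
        return f"LABEL-{address['zip']}"


class OrderContainer:
    """Hypothetical container: validates, labels, and stores an order."""

    def __init__(self, store, shipping):
        self._store = store
        self._shipping = shipping

    def process_order(self, order):
        if not order["items"]:
            raise ValueError("order has no items")
        label = self._shipping.create_label(order["address"])
        result = {**order, "shipping_label": label, "status": "ready_for_shipment"}
        self._store.save(result)
        return result


def test_rt001_valid_order_produces_shipping_label():
    # Setup: only the container's external dependencies are replaced.
    container = OrderContainer(InMemoryOrderStore(), MockShippingProvider())
    order = {"id": "o-1", "items": ["widget"], "address": {"zip": "12345"}}
    # Execute: go through the container's public entry point.
    result = container.process_order(order)
    # Verify: the responsibility is fulfilled; internals are not inspected.
    assert result["shipping_label"] is not None
    assert result["status"] == "ready_for_shipment"
```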
P0: Input-to-Output Flow Test Pattern
code
// Generic pattern - adapt to language
// Tests complete data path from entry point to output
// Setup - container with mocked external dependencies only
container = TestContainer()
container.mock_external_services()
container.wire_components()
// Input - data enters at the container's entry point
request = create_api_request(
    endpoint="/api/orders",
    method="POST",
    body={"items": [...], "customer_id": "123"}
)
// Execute - follow complete flow through container
response = container.handle_request(request)
// Verify OUTPUT - expected result from the flow
assert response.status_code == 201
assert response.body["order_id"] is not None
// Verify SIDE EFFECTS - expected changes
assert container.database.orders.count() == 1
P3: Arrange-Act-Assert Pattern (Unit Tests)
code
// Arrange - Set up test conditions
dependency_mock = create_mock()
subject = ClassUnderTest(dependency_mock)
input_data = valid_input()
// Act - Execute the behavior
result = subject.method(input_data)
// Assert - Verify the outcome
assert result == expected
dependency_mock.verify_called()
Fixture Patterns
python
# Python pytest fixtures
import pytest
from unittest.mock import Mock


@pytest.fixture
def mock_repository():
    """Mock repository with test data."""
    mock = Mock()
    mock.find.return_value = {"id": "1", "name": "test"}
    return mock


@pytest.fixture
def service(mock_repository):
    """Service under test (MyService is the class being tested)."""
    return MyService(repository=mock_repository)
Mock Patterns
python
# Mock external dependencies
class MockDatabase:
    def __init__(self, data=None):
        self._data = data or {}

    def find(self, id):
        return self._data.get(id)

    def save(self, entity):
        self._data[entity["id"]] = entity
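To show how such a mock might be used, the sketch below injects it into a component and asserts on the stored state. `UserService` and its `rename` method are hypothetical; `MockDatabase` is repeated from the pattern above only so the example runs on its own.

```python
class MockDatabase:
    """Same shape as the mock above, repeated so the sketch is self-contained."""

    def __init__(self, data=None):
        self._data = data or {}

    def find(self, id):
        return self._data.get(id)

    def save(self, entity):
        self._data[entity["id"]] = entity


class UserService:
    """Hypothetical component that depends on a database-like object."""

    def __init__(self, database):
        self._db = database

    def rename(self, user_id, new_name):
        user = self._db.find(user_id)
        if user is None:
            raise KeyError(user_id)
        updated = {**user, "name": new_name}
        self._db.save(updated)
        return updated


def test_rename_existing_user_persists_new_name():
    # Arrange: seed the mock with known data instead of a real database.
    db = MockDatabase({"1": {"id": "1", "name": "old"}})
    service = UserService(db)
    # Act
    service.rename("1", "new")
    # Assert: verify through the mock's public interface.
    assert db.find("1") == {"id": "1", "name": "new"}
```

Because the mock is passed in through the constructor, the same service code can run against the real database outside the test suite.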
Mocking Requirements
Always Mock
| Dependency | Mock Strategy |
|---|---|
| Database | In-memory dict or mock |
| HTTP clients | Predefined responses |
| Message queues | In-memory queue |
| External APIs | Stub implementations |
| File system | Temp directory |
| Time/dates | Fixed values |
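The last two rows can be sketched with the standard library alone: a clock injected as a callable gives fixed time values, and `tempfile.TemporaryDirectory` isolates file output. `ReportWriter` and its `now` parameter are hypothetical names chosen for this example.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Fixed instant used instead of the real clock.
FIXED_NOW = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)


class ReportWriter:
    """Hypothetical component that timestamps and writes a report file."""

    def __init__(self, output_dir, now=lambda: datetime.now(timezone.utc)):
        self._output_dir = Path(output_dir)
        self._now = now  # injected clock; tests pass a fixed one

    def write(self, name, body):
        stamp = self._now().strftime("%Y%m%dT%H%M%SZ")
        path = self._output_dir / f"{name}-{stamp}.txt"
        path.write_text(body, encoding="utf-8")
        return path


def test_write_uses_fixed_time_and_temp_directory():
    # The temp directory is removed when the with-block exits.
    with tempfile.TemporaryDirectory() as tmp:
        writer = ReportWriter(tmp, now=lambda: FIXED_NOW)
        path = writer.write("daily", "ok")
        assert path.name == "daily-20240101T120000Z.txt"
        assert path.read_text(encoding="utf-8") == "ok"
```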
Never Call Real
| Resource | Why |
|---|---|
| Real database | External service |
| Real HTTP endpoints | Network dependency |
| Other containers | Cross-container |
| Cloud services | External service |
Error Classification for Fixing
Syntax-Level (Fixable)
| Error | Example | Fix |
|---|---|---|
| Import | ModuleNotFoundError | Fix path |
| Syntax | SyntaxError | Fix code |
| Mock | AttributeError: Mock has no... | Add mock |
| Fixture | fixture not found | Create fixture |
| Type | TypeError: wrong type | Fix types |
Logic-Level (Report Only)
| Error | Example | Action | Priority Impact |
|---|---|---|---|
| Wrong value | assert 3 == 5 | Report | P0 = CRITICAL |
| Missing impl | no attribute 'method' | Report | P0 = CRITICAL |
| Spec mismatch | Behavior differs | Report | P0 = CRITICAL |
P0 Logic Errors are CRITICAL
If a P0 test (Responsibility or Flow test) has a logic error:
- •Mark as CRITICAL - container may not be fulfilling its responsibilities
- •External testers DEPEND on P0 tests passing to trust the container
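- •Report P0 logic errors and get them resolved BEFORE proceeding with other tests (logic-level errors are report-only, so do not patch the test to make it pass)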
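The classification above could be approximated in code roughly as follows. This is a heuristic sketch: the function names, the string matching on error messages, and the tuple shape are all assumptions, and a real fixer would need to inspect the traceback more carefully.

```python
# Error types the tables list as syntax-level, with their suggested fix.
FIXABLE_BY_TYPE = {
    "ModuleNotFoundError": "Fix path",
    "SyntaxError": "Fix code",
    "TypeError": "Fix types",
}


def classify_failure(error_type, message, priority):
    """Return (level, action, critical) for one failing test.

    level is "syntax" (fixable) or "logic" (report only).
    """
    if error_type in FIXABLE_BY_TYPE:
        return "syntax", FIXABLE_BY_TYPE[error_type], False
    if "fixture" in message and "not found" in message:
        return "syntax", "Create fixture", False
    if error_type == "AttributeError" and "Mock" in message:
        return "syntax", "Add mock", False
    # Wrong value, missing implementation, spec mismatch: report only.
    # A logic error in a P0 test is flagged as critical.
    return "logic", "Report", priority == "P0"


def fix_order(failures):
    """Sort failures so P0 is handled first ("P0" < "P1" < "P2" < "P3")."""
    return sorted(failures, key=lambda failure: failure["priority"])
```

For example, `classify_failure("AssertionError", "assert 3 == 5", "P0")` yields a report-only result flagged as critical, while the same assertion failure in a P3 test is reported without the critical flag.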
Test Directory Structure
Component
code
{container}/{component}/
└── tests/
    ├── conftest.{ext}              # Shared fixtures
    ├── mocks/
    │   └── mock_{dep}.{ext}        # Mock implementations
    ├── integration/                # P1 - Business logic integration tests
    │   └── test_{scenario}.{ext}
    ├── errors/                     # P2 - Error handling tests
    │   └── test_{error_case}.{ext}
    └── unit/                       # P3 - Unit tests (if not covered by integration)
        └── test_{class}.{ext}
Container
code
{container}/
└── tests/
    ├── conftest.{ext}                  # Shared fixtures
    ├── mocks/
    │   └── mock_{external}.{ext}       # Mock external services
    ├── responsibility/                 # P0 - Container responsibility tests
    │   └── test_{responsibility}.{ext}
    ├── flows/                          # P0 - Input-to-output flow tests
    │   └── test_{flow}.{ext}
    ├── integration/                    # P1 - Business logic integration tests
    │   └── test_{scenario}.{ext}
    ├── errors/                         # P2 - Error handling tests
    │   └── test_{error_case}.{ext}
    ├── unit/                           # P3 - Unit tests (OPTIONAL)
    │   └── test_{class}.{ext}
    └── infrastructure/                 # P3 - Infrastructure tests
        ├── test_startup.{ext}
        └── test_health.{ext}
Test Naming Conventions
Files
| Type | Pattern |
|---|---|
| Unit tests | test_{class_name}.{ext} |
| Integration | test_{scenario}.{ext} |
| Fixtures | conftest.{ext} |
| Mocks | mock_{dependency}.{ext} |
Functions
code
# Pattern: test_{what}_{condition}_{expected}
test_process_valid_input_returns_success
test_process_invalid_input_raises_error
# Or with ID: test_{id}_{description}
test_ut001_valid_input_success
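If it helps to enforce these names automatically, a small checker along these lines might work. The regular expressions are one interpretation of the two patterns (lowercase snake_case, at least three segments for the plain form, a known ID prefix with three digits for the ID form), and `is_valid_test_name` is a hypothetical helper.

```python
import re

# test_{id}_{description}, e.g. test_ut001_valid_input_success
_ID_STYLE = re.compile(r"^test_(rt|ft|bl|eh|ut|inf|it|cit)\d{3}_[a-z0-9_]+$")

# test_{what}_{condition}_{expected}: at least three lowercase segments
_PLAIN_STYLE = re.compile(r"^test(_[a-z0-9]+){3,}$")


def is_valid_test_name(name):
    """Return True if the name matches either naming convention."""
    return bool(_ID_STYLE.match(name) or _PLAIN_STYLE.match(name))
```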
Integration with Main Pipeline
code
/architect → Specs created
↓
/plan → Implementation plans
↓
/develop → Code implemented
↓
/plan-test-setup → Test plans created (THIS PIPELINE)
↓
/implement-test-setup → Tests implemented
↓
/fix-test-setup → Tests fixed
↓
/e2e-test → Cross-container tests (separate)
Quality Checklist
Test Plan Quality
- • All spec test cases have action items
- • Existing tests marked [x]
- • Missing tests marked [ ]
- • Fixtures identified
- • Out of scope clearly marked
- • Implementation order defined
Test Implementation Quality
- • Arrange-Act-Assert pattern
- • Clear test names
- • Spec IDs referenced
- • Dependencies mocked
- • Tests independent
- • Clean up after tests
Test Fix Quality
- • Minimal changes only
- • No logic changes
- • No expected value changes
- • Verified after fix
- • Logic errors reported
Common Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Real DB calls | External dependency | Mock database |
| Real HTTP | Network dependency | Mock HTTP client |
| Cross-container | Out of scope | Mark out of scope |
| Changing expected values | Masks bugs | Report instead |
| Order-dependent tests | Flaky | Make independent |
| No assertions | Tests don't verify | Always assert |
| Missing responsibility tests | Container trust broken | P0 test for each responsibility |
| Unit tests without checking coverage | Redundant tests | Mark as [SKIP] if covered by integration |
| Treating all failures equally | Missing business impact | P0 failures are CRITICAL |
| Fixing P3 before P0 | Wrong priorities | Always fix P0 first |
| Not reading .arch-registry first | Missing responsibilities | Always read container responsibilities first |