Test Planning Skill

Guidelines for planning, implementing, and fixing tests at the component and container level. Every test stays within the boundary of a single container.

Overview

This skill provides guidance for the test-focused pipeline:

•/plan-test-setup → Analyze specs, plan tests
•/implement-test-setup → Implement planned tests
•/fix-test-setup → Fix failing tests

All tests are scoped to container boundaries only - no cross-container or external service tests.

Test Prioritization Philosophy

CRITICAL: Tests are prioritized by how directly they validate that the container fulfills its business logic:

Priority Hierarchy (Highest to Lowest)

Priority	Test Type	Purpose
P0	Container Responsibility Tests	Validate container fulfills its assigned responsibilities from input to expected output
P0	Input-to-Output Flow Tests	Test complete data paths through the container (entry point → processing → output)
P1	Business Logic Integration Tests	Cover large slices of business functionality across multiple components, aligned with the implementation
P1	Critical Path Tests	Core business logic that must work for the container to be useful
P2	Error Handling Tests	Graceful failure handling and error propagation
P3	Component Unit Tests	Individual class tests (can be OMITTED at container level when covered by integration)

Design Principles

•Black Box Validation: Tests should validate the container works as a black box - external testers should NOT worry about internal integration issues
•Input-to-Output Testing: Prioritize tests that exercise the complete flow from data entry to expected output
•Business Logic First: Tests should be structured around business responsibilities, not technical components
•Integration Over Unit: At container level, integration tests that cover multiple components are more valuable than isolated unit tests
•Unit Tests Optional: Unit tests can be omitted at container level if the same logic is covered by integration tests

Core Principle: Container Boundary Scope

All tests must be runnable within a single container without real external dependencies; anything outside the container is mocked.

In Scope (Prioritized)

Priority	Test Type	Description	Allowed Dependencies
P0	Container Responsibility Tests	Validates each container responsibility end-to-end	Mocked external services
P0	Input-to-Output Flow Tests	Complete data path from entry to output	Mocked external services
P1	Business Logic Integration	Tests across components within container	Same container only
P1	Critical Path Tests	Core functionality tests	Same container only
P2	Error Handling Tests	Graceful failure handling	Same container only
P3	Unit Tests	Individual class/function tests	Mocked internal deps only
P3	Infrastructure Tests	Startup, health checks	Container only

Out of Scope (→ /e2e-test)

Test Type	Why Out of Scope
Cross-Container	Requires other containers
Database Integration	Requires real database
External API	Requires live services
E2E Tests	Requires full system

Test ID Conventions

Use consistent IDs for traceability:

Type	Format	Example	Priority	Used In
Responsibility Test	`RT-{nnn}`	RT-001	P0	Container responsibility validation
Flow Test	`FT-{nnn}`	FT-001	P0	Input-to-output flow tests
Business Logic Test	`BL-{nnn}`	BL-001	P1	Business logic integration
Error Handling Test	`EH-{nnn}`	EH-001	P2	Error scenario tests
Unit Test	`UT-{nnn}`	UT-001	P3	Component tests
Infrastructure Test	`INF-{nnn}`	INF-001	P3	Startup, health checks
Fixture	`FIX-{nnn}`	FIX-001	-	Test infrastructure
Integration Test (legacy)	`IT-{nnn}`	IT-001	P1	Component integration
Container Integration (legacy)	`CIT-{nnn}`	CIT-001	varies	Container tests
External Integration	`EIT-{nnn}`	EIT-001	-	Out of scope marker
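
The ID formats in the table above lend themselves to a mechanical check, for example when validating a test plan. The following is a minimal sketch, assuming `{nnn}` is always three digits as in the examples; `parse_test_id` and `PREFIX_PRIORITY` are hypothetical names and not part of the pipeline.

```python
import re

# Prefix -> priority, taken from the ID convention table.
# None means the prefix carries no fixed priority (fixtures, "varies", markers).
PREFIX_PRIORITY = {
    "RT": "P0", "FT": "P0", "BL": "P1", "EH": "P2",
    "UT": "P3", "INF": "P3", "IT": "P1",
    "FIX": None, "CIT": None, "EIT": None,
}

# Assumption: {nnn} is exactly three digits, as in RT-001.
_ID_PATTERN = re.compile(r"^([A-Z]+)-(\d{3})$")


def parse_test_id(test_id):
    """Return (prefix, number, priority) or raise ValueError for unknown IDs."""
    match = _ID_PATTERN.match(test_id)
    if match is None or match.group(1) not in PREFIX_PRIORITY:
        raise ValueError(f"Not a recognized test ID: {test_id!r}")
    prefix, number = match.group(1), int(match.group(2))
    return prefix, number, PREFIX_PRIORITY[prefix]
```

Rejecting unknown prefixes early keeps typos such as `TR-001` from silently dropping out of priority-based reports.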

Test Plan Structure (`.test-plan.md`)

Component Test Plan

markdown

# Test Plan - {Component Name}

**Generated**: {timestamp}
**Component**: {container}/{component}
**Status**: {Not Started | In Progress | Complete}

## Summary

| Metric | Count |
|--------|-------|
| Specified Unit Tests | {n} |
| Existing Unit Tests | {n} |
| Missing Unit Tests | {n} |
| Coverage Target | {%} |

## 1. Unit Tests

### 1.1 {ClassName} Tests

| ID | Test Case | Status | Priority | Mocks Needed |
|----|-----------|--------|----------|--------------|
| UT-001 | {description} | [ ] | P0 | {dependencies} |

## 2. Integration Tests

| ID | Scenario | Status | Priority |
|----|----------|--------|----------|
| IT-001 | {description} | [ ] | P1 |

## 3. Fixtures & Mocks Required

| ID | Fixture/Mock | Purpose | Status |
|----|--------------|---------|--------|
| FIX-001 | {name} | {purpose} | [ ] |

## 4. Implementation Order

1. Phase 1: Fixtures
2. Phase 2: Core Unit Tests
3. Phase 3: Error Case Tests
4. Phase 4: Integration Tests

## 5. Out of Scope

| Test | Dependency | Reason |
|------|------------|--------|
| {test} | {external dep} | Cross-container |

Container Test Plan

markdown

# Test Plan - {Container Name} Integration

**Generated**: {timestamp}
**Container**: {container}
**Status**: {Not Started | In Progress | Complete}

## Container Responsibilities

From `.arch-registry/containers/{container}.md`:

| ID | Responsibility | Description |
|----|----------------|-------------|
| RESP-001 | {Responsibility 1} | {What this responsibility delivers} |
| RESP-002 | {Responsibility 2} | {What this responsibility delivers} |

## Summary

| Metric | Count |
|--------|-------|
| Container Responsibilities | {n} |
| P0 Responsibility Tests | {n} |
| P0 Flow Tests | {n} |
| P1 Business Logic Tests | {n} |
| P2 Error Handling Tests | {n} |
| P3 Unit/Infrastructure Tests | {n} |

## 1. Container Responsibility Tests (P0)

**Purpose**: Validate the container fulfills each assigned responsibility end-to-end.

| ID | Responsibility | Test Case | Input | Expected Output | Status |
|----|----------------|-----------|-------|-----------------|--------|
| RT-001 | RESP-001 | {Test description} | {Sample input} | {Expected output} | [ ] |

**Black Box Guarantee**: When these tests pass, external testers can trust the container fulfills its responsibilities.

## 2. Input-to-Output Flow Tests (P0)

**Purpose**: Test complete data flow from entry point to expected output.

| ID | Entry Point | Flow Description | Input | Expected Output | Status |
|----|-------------|------------------|-------|-----------------|--------|
| FT-001 | {API endpoint/handler} | {Complete flow description} | {Input data} | {Output data} | [ ] |

## 3. Business Logic Integration Tests (P1)

| ID | Business Logic | Test Case | Components Involved | Status |
|----|----------------|-----------|---------------------|--------|
| BL-001 | {Business rule} | {Test description} | {list} | [ ] |

## 4. Error Handling Tests (P2)

| ID | Error Scenario | Expected Behavior | Status |
|----|----------------|-------------------|--------|
| EH-001 | {Invalid input} | {Error response} | [ ] |

## 5. Infrastructure Tests (P3)

| ID | Test Case | Status |
|----|-----------|--------|
| INF-001 | Container starts successfully | [ ] |
| INF-002 | Health endpoint responds 200 | [ ] |

## 6. Unit Tests (P3 - Optional)

**Note**: Unit tests can be OMITTED if the same logic is covered by P0/P1 integration tests.

| ID | Class | Test Case | Covered By Integration? | Status |
|----|-------|-----------|------------------------|--------|
| UT-001 | {Class} | {Test} | Yes - RT-001 | [SKIP] |
| UT-002 | {Class} | {Test} | No | [ ] |

## 7. Test Infrastructure

| ID | Fixture/Mock | Purpose | Status |
|----|--------------|---------|--------|
| FIX-001 | mock_{external_service} | Mock external API | [ ] |

## Black Box Guarantee

**When all P0 tests pass, the following is guaranteed**:
- ✓ Container fulfills ALL assigned responsibilities
- ✓ Data flows correctly from input to expected output
- ✓ External testers can trust the container works without worrying about internal integration

Priority Levels

Level	Meaning	Test Types	Business Impact
P0	Critical - validates container responsibilities	Responsibility tests, Flow tests	Container trust depends on these
P1	High - critical business logic	Business logic integration, Critical path	Core functionality coverage
P2	Normal - error scenarios	Error handling, Edge cases	Graceful failure handling
P3	Low - can be omitted	Unit tests, Infrastructure	Often redundant with P0/P1

P0 Tests - Black Box Guarantee

P0 tests are special:

•They validate the container fulfills its assigned responsibilities
•They test complete flows from input to output
•When P0 tests pass, external testers can TRUST the container works
•P0 failures are CRITICAL - container may not be fulfilling its purpose

Unit Tests at Container Level

Important: Unit tests (P3) can be OMITTED at container level if the same logic is already covered by any of the following:

•P0 responsibility tests
•P0 flow tests
•P1 integration tests

Mark the omitted unit test as [SKIP] and reference the covering test (for example, `Yes - RT-001`).
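
One way to apply this rule consistently is to derive the status from the "Covered By Integration?" column of the container test plan. The following is a minimal sketch, assuming rows follow the Unit Tests table of the container test plan template; `unit_test_status` is a hypothetical helper.

```python
import re

# Covering test IDs accepted by the rule: RT/FT (P0) or BL/IT (P1).
_COVERING_ID = re.compile(r"\b(?:RT|FT|BL|IT)-\d{3}\b")


def unit_test_status(plan_row):
    """Return '[SKIP]' if the row names a covering P0/P1 test, else '[ ]'.

    plan_row is one markdown table row, for example:
    '| UT-001 | OrderParser | parses items | Yes - RT-001 | [ ] |'
    """
    cells = [cell.strip() for cell in plan_row.strip().strip("|").split("|")]
    covered_by = cells[3]  # "Covered By Integration?" column
    if covered_by.lower().startswith("yes") and _COVERING_ID.search(covered_by):
        return "[SKIP]"
    return "[ ]"
```

A bare "Yes" without a test ID is deliberately not enough to skip, since the rule requires a reference to the covering test.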

Test Implementation Patterns

P0: Responsibility Test Pattern (HIGHEST PRIORITY)

code

// Generic pattern - adapt to language
// Tests that the container fulfills a specific responsibility

// Responsibility: "Process incoming orders and produce shipping labels"

// Setup - configure container with mocked external dependencies
container = TestContainer()
container.mock_database(InMemoryDatabase())
container.mock_external_api(MockShippingProvider())
container.wire_components()

// Input - data enters the container
input = create_valid_order(items=[...], address={...})

// Execute - exercise the COMPLETE flow through the container
result = container.process_order(input)

// Verify - the RESPONSIBILITY is fulfilled
assert result.shipping_label is not None
assert result.status == "ready_for_shipment"

// BLACK BOX: Internal implementation details don't matter
// Only the outcome (responsibility fulfilled) matters
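
The generic pattern above could look as follows in Python. This is a sketch under stated assumptions: `OrderContainer`, its constructor arguments, and the dictionary-shaped order are hypothetical stand-ins for a real container, and the external shipping provider is replaced with `unittest.mock.Mock`.

```python
from unittest.mock import Mock


class OrderContainer:
    """Hypothetical container: turns incoming orders into shipping labels."""

    def __init__(self, database, shipping_provider):
        self._database = database
        self._shipping = shipping_provider

    def process_order(self, order):
        # Internal steps (persist, request label) are invisible to the test.
        self._database[order["order_id"]] = order
        label = self._shipping.create_label(order["address"])
        return {"shipping_label": label, "status": "ready_for_shipment"}


def test_rt001_valid_order_produces_shipping_label():
    # Setup - external dependencies are replaced by in-memory stand-ins
    shipping = Mock()
    shipping.create_label.return_value = "LABEL-1"
    container = OrderContainer(database={}, shipping_provider=shipping)

    # Input - data enters the container
    order = {"order_id": "o-1", "items": ["widget"], "address": {"city": "Oslo"}}

    # Execute - the complete flow through the container
    result = container.process_order(order)

    # Verify - only the outcome of the responsibility is asserted
    assert result["shipping_label"] is not None
    assert result["status"] == "ready_for_shipment"
```

The test never inspects how the label was produced, so internal refactoring does not break it as long as the responsibility is still fulfilled.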

P0: Input-to-Output Flow Test Pattern

code

// Generic pattern - adapt to language
// Tests complete data path from entry point to output

// Setup - container with mocked external dependencies only
container = TestContainer()
container.mock_external_services()
container.wire_components()

// Input - data enters at the container's entry point
request = create_api_request(
    endpoint="/api/orders",
    method="POST",
    body={"items": [...], "customer_id": "123"}
)

// Execute - follow complete flow through container
response = container.handle_request(request)

// Verify OUTPUT - expected result from the flow
assert response.status_code == 201
assert response.body["order_id"] is not None

// Verify SIDE EFFECTS - expected changes
assert container.database.orders.count() == 1
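
A concrete Python version of the flow pattern might look like this. It is only a sketch: `OrderApi` and `InMemoryOrders` are hypothetical, and requests and responses are plain dictionaries rather than objects from any real web framework.

```python
class InMemoryOrders:
    """Hypothetical in-memory replacement for the orders table."""

    def __init__(self):
        self.rows = []

    def count(self):
        return len(self.rows)


class OrderApi:
    """Hypothetical container entry point for POST /api/orders."""

    def __init__(self, orders):
        self.orders = orders

    def handle_request(self, request):
        if request["method"] != "POST" or request["endpoint"] != "/api/orders":
            return {"status_code": 404, "body": {}}
        order_id = f"order-{self.orders.count() + 1}"
        self.orders.rows.append({"order_id": order_id, **request["body"]})
        return {"status_code": 201, "body": {"order_id": order_id}}


def test_ft001_post_order_creates_order():
    # Setup - only the storage behind the container is replaced
    orders = InMemoryOrders()
    api = OrderApi(orders)

    # Input - data enters at the container's entry point
    request = {
        "endpoint": "/api/orders",
        "method": "POST",
        "body": {"items": ["widget"], "customer_id": "123"},
    }

    # Execute - follow the complete flow through the container
    response = api.handle_request(request)

    # Verify OUTPUT
    assert response["status_code"] == 201
    assert response["body"]["order_id"] is not None
    # Verify SIDE EFFECTS
    assert orders.count() == 1
```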

P3: Arrange-Act-Assert Pattern (Unit Tests)

code

// Arrange - Set up test conditions
dependency_mock = create_mock()
subject = ClassUnderTest(dependency_mock)
input_data = valid_input()

// Act - Execute the behavior
result = subject.method(input_data)

// Assert - Verify the outcome
assert result == expected
dependency_mock.verify_called()
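
In Python, the same three steps can be written with `unittest.mock`, where the generic `verify_called()` corresponds to an assertion such as `assert_called_once_with`. A minimal sketch, assuming a hypothetical `PriceCalculator` class with one injected dependency:

```python
from unittest.mock import Mock


class PriceCalculator:
    """Hypothetical class under test; looks up a tax rate and applies it."""

    def __init__(self, tax_service):
        self._tax_service = tax_service

    def total(self, net_amount, country):
        rate = self._tax_service.rate_for(country)
        return round(net_amount * (1 + rate), 2)


def test_ut001_total_applies_tax_rate():
    # Arrange - mock the internal dependency and build the subject
    tax_service = Mock()
    tax_service.rate_for.return_value = 0.25
    subject = PriceCalculator(tax_service)

    # Act - execute the behavior
    result = subject.total(100.0, "NO")

    # Assert - verify the outcome and the interaction with the dependency
    assert result == 125.0
    tax_service.rate_for.assert_called_once_with("NO")
```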

Fixture Patterns

python

# Python pytest fixtures
from unittest.mock import Mock

import pytest

# MyService is a placeholder for the class under test; import it from the
# module that defines it.

@pytest.fixture
def mock_repository():
    """Mock repository with test data."""
    mock = Mock()
    mock.find.return_value = {"id": "1", "name": "test"}
    return mock

@pytest.fixture
def service(mock_repository):
    """Service under test."""
    return MyService(repository=mock_repository)

Mock Patterns

python

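# Mock external dependencies
class MockDatabase:
    """In-memory stand-in for the real database."""

    def __init__(self, data=None):
        # Copy the seed data so tests cannot share state through it
        self._data = dict(data) if data else {}

    def find(self, entity_id):
        # Returns None when no entity is stored under entity_id
        return self._data.get(entity_id)

    def save(self, entity):
        self._data[entity["id"]] = entity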

Mocking Requirements

Always Mock

Dependency	Mock Strategy
Database	In-memory dict or mock
HTTP clients	Predefined responses
Message queues	In-memory queue
External APIs	Stub implementations
File system	Temp directory
Time/dates	Fixed values
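
Two of the strategies in the table, fixed time values and a temporary directory, can be combined in a single test by injecting both into the component. The sketch below assumes a hypothetical `ReportWriter` that accepts an output directory and a `now` callable; only standard-library modules are used.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


class ReportWriter:
    """Hypothetical component that writes a timestamped report file."""

    def __init__(self, output_dir, now):
        self._output_dir = Path(output_dir)
        self._now = now  # callable returning the current datetime

    def write(self, text):
        stamp = self._now().strftime("%Y%m%dT%H%M%S")
        path = self._output_dir / f"report-{stamp}.txt"
        path.write_text(text, encoding="utf-8")
        return path


def test_report_uses_fixed_time_and_temp_directory():
    # Time/dates: a fixed value instead of the real clock
    fixed_now = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    # File system: a temp directory that is removed after the test
    with tempfile.TemporaryDirectory() as tmp:
        writer = ReportWriter(tmp, now=lambda: fixed_now)
        path = writer.write("ok")
        assert path.name == "report-20240102T030405.txt"
        assert path.read_text(encoding="utf-8") == "ok"
```

Passing the clock in as a parameter avoids patching `datetime` globally, which tends to make tests harder to reason about.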

Never Call Real

Resource	Why
Real database	External service
Real HTTP endpoints	Network dependency
Other containers	Cross-container
Cloud services	External service
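
The "never call real" rule can be backed by a guard that fails loudly when a test tries to open a connection. The following is one possible sketch, not an established part of this pipeline: `block_network` and `NetworkAccessError` are hypothetical names, and the guard works by patching `socket.socket` from the standard library.

```python
import socket
from contextlib import contextmanager
from unittest.mock import patch


class NetworkAccessError(RuntimeError):
    """Raised when a test tries to open a real network connection."""


@contextmanager
def block_network():
    """Make any attempt to create a socket fail while the block is active."""
    def _refuse(*args, **kwargs):
        raise NetworkAccessError("Real network access is not allowed in tests")

    # Replaces the socket class for the duration of the with-block only.
    with patch("socket.socket", side_effect=_refuse):
        yield


def opens_socket():
    # Stand-in for code that accidentally reaches for a real endpoint.
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)
```

Calling `opens_socket()` inside `with block_network():` raises `NetworkAccessError`. Code that imported the class directly with `from socket import socket` before the patch is applied is not covered by this guard.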

Error Classification for Fixing

Syntax-Level (Fixable)

Error	Example	Fix
Import	`ModuleNotFoundError`	Fix path
Syntax	`SyntaxError`	Fix code
Mock	`AttributeError: Mock has no...`	Add mock
Fixture	`fixture not found`	Create fixture
Type	`TypeError: wrong type`	Fix types

Logic-Level (Report Only)

Error	Example	Action	Priority Impact
Wrong value	`assert 3 == 5`	Report	P0 = CRITICAL
Missing impl	`no attribute 'method'`	Report	P0 = CRITICAL
Spec mismatch	Behavior differs	Report	P0 = CRITICAL

P0 Logic Errors are CRITICAL

If a P0 test (Responsibility or Flow test) has a logic error:

•Mark as CRITICAL - container may not be fulfilling its responsibilities
•External testers DEPEND on P0 tests passing to trust the container
•Fix P0 logic errors BEFORE proceeding with other tests
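
The classification above can be expressed as a small decision function. This is a sketch under assumptions: the failure is taken to be already parsed into a test ID, an exception type name, and a message; `classify_failure` is a hypothetical helper; and the `"normal"` severity label for non-P0 failures is an illustrative placeholder that the tables do not define.

```python
# Exception type names treated as syntax-level (fixable) failures.
FIXABLE_TYPES = {"ModuleNotFoundError", "ImportError", "SyntaxError", "TypeError"}


def classify_failure(test_id, error_type, message):
    """Return (level, action, severity) for one failing test."""
    # An AttributeError raised by a mock means the mock is incomplete.
    is_mock_attribute = error_type == "AttributeError" and "Mock" in message
    is_missing_fixture = "fixture" in message and "not found" in message
    if error_type in FIXABLE_TYPES or is_mock_attribute or is_missing_fixture:
        return ("syntax", "fix", "normal")
    # Everything else is logic-level: report it, never edit expected values.
    is_p0 = test_id.startswith(("RT-", "FT-"))
    return ("logic", "report", "CRITICAL" if is_p0 else "normal")
```

For example, `classify_failure("RT-001", "AssertionError", "assert 3 == 5")` yields a logic-level, report-only result with CRITICAL severity, because `RT-` marks a P0 responsibility test.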

Test Directory Structure

Component

code

{container}/{component}/
└── tests/
    ├── conftest.{ext}           # Shared fixtures
    ├── mocks/
    │   └── mock_{dep}.{ext}     # Mock implementations
    ├── integration/             # P1 - Business logic integration tests
    │   └── test_{scenario}.{ext}
    ├── errors/                  # P2 - Error handling tests
    │   └── test_{error_case}.{ext}
    └── unit/                    # P3 - Unit tests (if not covered by integration)
        └── test_{class}.{ext}

Container

code

{container}/
└── tests/
    ├── conftest.{ext}           # Shared fixtures
    ├── mocks/
    │   └── mock_{external}.{ext} # Mock external services
    ├── responsibility/          # P0 - Container responsibility tests
    │   └── test_{responsibility}.{ext}
    ├── flows/                   # P0 - Input-to-output flow tests
    │   └── test_{flow}.{ext}
    ├── integration/             # P1 - Business logic integration tests
    │   └── test_{scenario}.{ext}
    ├── errors/                  # P2 - Error handling tests
    │   └── test_{error_case}.{ext}
    ├── unit/                    # P3 - Unit tests (OPTIONAL)
    │   └── test_{class}.{ext}
    └── infrastructure/          # P3 - Infrastructure tests
        ├── test_startup.{ext}
        └── test_health.{ext}
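
The container layout above can be created with a few lines of `pathlib`. A minimal sketch, assuming Python's `py` extension as the default for `{ext}`; `scaffold_container_tests` is a hypothetical helper, not a pipeline command.

```python
from pathlib import Path

# Subdirectories from the container layout, in priority order.
CONTAINER_TEST_DIRS = [
    "mocks", "responsibility", "flows", "integration",
    "errors", "unit", "infrastructure",
]


def scaffold_container_tests(container_root, ext="py"):
    """Create the tests/ tree for one container and return the tests path.

    Assumption: ext="py" is only a default for illustration; use the
    extension of the container's own language.
    """
    tests_dir = Path(container_root) / "tests"
    for name in CONTAINER_TEST_DIRS:
        (tests_dir / name).mkdir(parents=True, exist_ok=True)
    # Shared fixtures live next to the subdirectories.
    (tests_dir / f"conftest.{ext}").touch(exist_ok=True)
    return tests_dir
```

Because `mkdir` and `touch` are called with `exist_ok=True`, running the helper on an existing tree leaves existing files in place.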

Test Naming Conventions

Files

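Type	Pattern
Responsibility tests	`test_{responsibility}.{ext}`
Flow tests	`test_{flow}.{ext}`
Integration	`test_{scenario}.{ext}`
Error handling	`test_{error_case}.{ext}`
Unit tests	`test_{class_name}.{ext}`
Infrastructure	`test_startup.{ext}`, `test_health.{ext}`
Fixtures	`conftest.{ext}`
Mocks	`mock_{dependency}.{ext}`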

Functions

code

# Pattern: test_{what}_{condition}_{expected}
test_process_valid_input_returns_success
test_process_invalid_input_raises_error

# Or with ID: test_{id}_{description}
test_ut001_valid_input_success
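
Both naming patterns can be checked with regular expressions, for instance in a review script. The sketch below makes two assumptions that the convention itself does not state: names are lowercase with underscores, and the descriptive form has at least three segments after `test_`. `follows_naming_convention` is a hypothetical helper.

```python
import re

# test_{what}_{condition}_{expected}: lowercase words joined by underscores,
# at least three segments after the "test" prefix.
_DESCRIPTIVE = re.compile(r"^test(?:_[a-z0-9]+){3,}$")
# test_{id}_{description}: ID prefix and three digits without the hyphen.
_WITH_ID = re.compile(r"^test_(?:rt|ft|bl|eh|ut|inf)\d{3}(?:_[a-z0-9]+)+$")


def follows_naming_convention(name):
    """Return True if a test function name matches either pattern."""
    return bool(_DESCRIPTIVE.match(name) or _WITH_ID.match(name))
```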

Integration with Main Pipeline

code

/architect              → Specs created
      ↓
/plan                   → Implementation plans
      ↓
/develop                → Code implemented
      ↓
/plan-test-setup        → Test plans created (THIS PIPELINE)
      ↓
/implement-test-setup   → Tests implemented
      ↓
/fix-test-setup         → Tests fixed
      ↓
/e2e-test               → Cross-container tests (separate)

Quality Checklist

Test Plan Quality

• All spec test cases have action items
• Existing tests marked [x]
• Missing tests marked [ ]
• Fixtures identified
• Out of scope clearly marked
• Implementation order defined

Test Implementation Quality

Test Fix Quality

Common Anti-Patterns

Anti-Pattern	Problem	Solution
Real DB calls	External dependency	Mock database
Real HTTP	Network dependency	Mock HTTP client
Cross-container	Out of scope	Mark out of scope
Changing expected values	Masks bugs	Report instead
Order-dependent tests	Flaky	Make independent
No assertions	Tests don't verify	Always assert
Missing responsibility tests	Container trust broken	P0 test for each responsibility
Unit tests without checking coverage	Redundant tests	Mark as [SKIP] if covered by integration
Treating all failures equally	Missing business impact	P0 failures are CRITICAL
Fixing P3 before P0	Wrong priorities	Always fix P0 first
Not reading .arch-registry first	Missing responsibilities	Always read container responsibilities first
