Test Engineer Skill
AI-powered testing guidance for designing, implementing, and maintaining comprehensive test suites with focus on reliability, coverage, and test-driven development principles.
What This Skill Does
This skill provides expert-level testing guidance including test strategy, test architecture, mock/stub design, coverage analysis, and debugging failing tests. It combines software testing best practices with practical, actionable recommendations.
Key Capabilities:
- Test Strategy: Test pyramid design, coverage goals, testing philosophies
- Unit Testing: Isolation patterns, mocking, assertions, fixtures
- Integration Testing: Component integration, API testing, database testing
- End-to-End Testing: User flows, browser automation, visual testing
- Test Automation: CI/CD integration, parallel execution, flaky test detection
- TDD/BDD: Test-first development, behavior specifications, acceptance criteria
Core Principles
The Testing Pyramid
┌───────────┐
│ E2E │ ← Few, slow, expensive
├───────────┤
│Integration│ ← Some, medium speed
├───────────┤
│ Unit │ ← Many, fast, cheap
└───────────┘
The FIRST Principles
- Fast: Tests should run quickly
- Isolated: Tests don't depend on each other
- Repeatable: Same result every time
- Self-Validating: Pass or fail, no interpretation needed
- Timely: Written at the right time (ideally before code)
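The Isolated and Repeatable principles usually come down to keeping hidden inputs (the clock, the random number generator) out of the code under test. The sketch below is one way to do that by passing them in as arguments. `make_coupon_code` and `is_expired` are hypothetical functions invented for illustration.

```python
import random
from datetime import datetime, timezone


def make_coupon_code(rng: random.Random, length: int = 6) -> str:
    """Hypothetical function under test: the RNG is passed in, not global."""
    alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
    return "".join(rng.choice(alphabet) for _ in range(length))


def is_expired(expires_at: datetime, now: datetime) -> bool:
    """Hypothetical function under test: the current time is an argument."""
    return now >= expires_at


def test_coupon_code_is_repeatable():
    # Two generators with the same seed produce the same sequence,
    # so the test gives the same result on every run.
    first = make_coupon_code(random.Random(42))
    second = make_coupon_code(random.Random(42))
    assert first == second
    assert len(first) == 6


def test_expiry_uses_injected_clock():
    # No dependency on the real wall clock or on other tests.
    expires = datetime(2024, 1, 1, tzinfo=timezone.utc)
    assert is_expired(expires, now=datetime(2024, 1, 2, tzinfo=timezone.utc))
    assert not is_expired(expires, now=datetime(2023, 12, 31, tzinfo=timezone.utc))
```

Because neither test reads global state, they can run in any order or in parallel.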
Test Quality Attributes
- Clarity: Easy to understand what's being tested
- Maintainability: Easy to update when code changes
- Reliability: No flaky tests, consistent results
- Speed: Fast feedback loops
- Coverage: Adequate coverage of critical paths
Test Assessment Workflow
1. Test Suite Analysis
Analyze the current test suite:
├── Structure (test organization, naming)
├── Coverage (line, branch, path coverage)
├── Speed (execution time distribution)
├── Reliability (flaky test detection)
└── Gaps (untested critical paths)
2. Test Health Metrics
- Coverage Score: What percentage of lines and branches does the suite execute?
- Test-to-Code Ratio: How many lines of test code exist per line of production code?
- Execution Time: How long does the full suite take?
- Flakiness Rate: How often do tests fail without any code change?
- Mutation Score: What percentage of injected mutants does the suite kill?
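Two of these metrics reduce to simple ratios. The sketch below shows one possible way to compute them. The `RunRecord` shape and the definition of "flaky" (the same test both passing and failing on the same commit) are assumptions made for illustration, not the output format of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical record of one test execution in CI."""
    test_name: str
    commit: str
    passed: bool


def flakiness_rate(runs: list[RunRecord]) -> float:
    """Share of (test, commit) pairs that both passed and failed."""
    outcomes: dict[tuple[str, str], set[bool]] = {}
    for run in runs:
        outcomes.setdefault((run.test_name, run.commit), set()).add(run.passed)
    if not outcomes:
        return 0.0
    flaky = sum(1 for seen in outcomes.values() if len(seen) == 2)
    return flaky / len(outcomes)


def mutation_score(killed: int, total: int) -> float:
    """Fraction of generated mutants that caused at least one test to fail."""
    return killed / total if total else 0.0
```

A test that gives mixed results on an unchanged commit is nondeterministic by construction, which is why the commit is part of the grouping key.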
3. Recommendations
Generate prioritized recommendations based on:
- Risk (what's most important to test)
- Coverage Gaps (what's not tested)
- Reliability (what's flaky or slow)
- Maintainability (what's hard to maintain)
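One way to turn these factors into an ordered list is a simple score. The weights below (risk multiplied by the uncovered fraction, plus a flat bonus for flaky areas) are arbitrary assumptions chosen for illustration, and `Finding` is a hypothetical record type.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical summary of one area of the codebase."""
    area: str
    risk: int        # 1 (low) to 5 (business critical)
    coverage: float  # fraction of the area covered, 0.0 to 1.0
    flaky: bool      # True if the area has known flaky tests


def priority(finding: Finding) -> float:
    # High risk combined with low coverage scores highest.
    score = finding.risk * (1.0 - finding.coverage)
    if finding.flaky:
        score += 1.0  # assumed flat bonus for unreliable tests
    return score


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority, reverse=True)
```

For example, a checkout area with risk 5 and 40% coverage ranks ahead of a well-covered login area even if the latter has a flaky test.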
Testing Patterns by Type
Unit Testing
```python
# Arrange-Act-Assert (AAA) Pattern
# Cart, Item, and PercentageDiscount are illustrative application classes.
def test_calculate_discount():
    # Arrange
    cart = Cart(items=[Item(price=100)])
    discount = PercentageDiscount(10)

    # Act
    total = cart.apply_discount(discount)

    # Assert
    assert total == 90
```
Use When: Testing individual functions, classes, or methods in isolation
Integration Testing
```python
# Test component interactions
# db, email_svc, and validator are assumed to be pytest fixtures
# defined elsewhere (for example in conftest.py).
def test_user_registration_flow(db, email_svc, validator):
    # Test that UserService correctly interacts with
    # Database, EmailService, and Validator
    user_service = UserService(db, email_svc, validator)

    result = user_service.register("test@example.com", "password")

    assert result.success
    assert db.find_user("test@example.com") is not None
    assert email_svc.sent_emails[-1].to == "test@example.com"
```
Use When: Testing how components work together
End-to-End Testing
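```python
# Test full user flows
# `browser` is an illustrative fixture; adapt the calls to your
# browser automation library.
def test_checkout_flow(browser):
    browser.goto("/products")
    browser.click("#add-to-cart-btn")
    browser.goto("/cart")
    browser.click("#checkout-btn")
    browser.fill("#email", "user@test.com")
    browser.click("#submit-order")

    # Compare the path only: the full URL normally includes scheme and host.
    assert browser.url.endswith("/order-confirmation")
    assert "Order placed" in browser.text
```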
Use When: Validating complete user journeys
API Testing
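```python
# Test REST/GraphQL endpoints
# `client` is a test-client fixture. `response.json` is an attribute in
# some clients (e.g. Flask) and a method, `response.json()`, in others.
def test_create_user_endpoint(client):
    response = client.post("/api/users", json={
        "email": "new@test.com",
        "name": "Test User"
    })

    assert response.status_code == 201
    assert response.json["id"] is not None
    assert response.json["email"] == "new@test.com"
```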
Use When: Testing HTTP APIs and web services
Mocking Strategies
When to Mock
| Mock | Don't Mock |
|---|---|
| External APIs | Core business logic |
| Databases (in unit tests) | Simple value objects |
| File systems | Pure functions |
| Time/randomness | In-memory implementations |
| Email/SMS services | The system under test |
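As a minimal sketch of the "Time/randomness" row, the test below replaces `time.time` with a fixed value using the standard library's `unittest.mock.patch`. `token_is_fresh` is a hypothetical function invented for this example.

```python
import time
from unittest.mock import patch


def token_is_fresh(issued_at: float, max_age_seconds: float = 60.0) -> bool:
    """Hypothetical function under test: depends on the wall clock."""
    return time.time() - issued_at <= max_age_seconds


def test_token_expires_after_max_age():
    # Freeze the clock at t=1000 so the result never depends on when
    # the test happens to run.
    with patch("time.time", return_value=1_000.0):
        assert token_is_fresh(issued_at=950.0)       # 50 seconds old
        assert not token_is_fresh(issued_at=900.0)   # 100 seconds old
```

The patch is active only inside the `with` block, so other tests still see the real clock.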
Mocking Patterns
```python
# UserService, DataService, UserRepository, and User are illustrative
# application classes.
from unittest.mock import MagicMock


# Spy - Track calls without changing behavior
def test_notification_sent():
    notifier = MagicMock()
    service = UserService(notifier=notifier)

    service.create_user("test@example.com")

    notifier.send.assert_called_once_with(
        to="test@example.com",
        subject="Welcome!"
    )


# Stub - Provide canned responses
def test_fetch_user_data():
    api_client = MagicMock()
    api_client.get.return_value = {"name": "John", "age": 30}
    service = DataService(api=api_client)

    result = service.get_user_profile(123)

    assert result.name == "John"


# Fake - In-memory implementation
class FakeDatabase:
    def __init__(self):
        self.data = {}

    def save(self, key, value):
        self.data[key] = value

    def find(self, key):
        return self.data.get(key)


def test_user_persistence():
    fake_db = FakeDatabase()
    repo = UserRepository(db=fake_db)

    repo.save(User(id=1, name="Test"))

    assert repo.find(1).name == "Test"
```
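A plain `MagicMock` accepts any attribute or call, so a misspelled method name can pass silently. One possible refinement of the spy pattern is `unittest.mock.create_autospec`, which restricts the mock to the real interface. The `Notifier` and `UserService` classes below are minimal hypothetical stand-ins so the sketch runs on its own.

```python
from unittest.mock import create_autospec


class Notifier:
    """Hypothetical collaborator with a real signature to copy."""
    def send(self, to: str, subject: str) -> None:
        raise NotImplementedError


class UserService:
    """Hypothetical system under test."""
    def __init__(self, notifier: Notifier):
        self.notifier = notifier

    def create_user(self, email: str) -> dict:
        self.notifier.send(to=email, subject="Welcome!")
        return {"email": email}


def test_welcome_email_sent():
    notifier = create_autospec(Notifier, instance=True)
    service = UserService(notifier=notifier)

    service.create_user("test@example.com")

    notifier.send.assert_called_once_with(
        to="test@example.com", subject="Welcome!"
    )


def test_autospec_rejects_unknown_methods():
    notifier = create_autospec(Notifier, instance=True)
    try:
        notifier.send_sms("555-0100")  # not part of Notifier
    except AttributeError:
        pass
    else:
        raise AssertionError("expected AttributeError for unknown method")
```

The trade-off is that the real class must be importable in the test, which is usually acceptable for internal interfaces.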
Test Data Management
Fixture Patterns
```python
# Pytest fixtures for reusable test data
# User and TestClient are illustrative application classes.
import pytest


@pytest.fixture
def sample_user():
    return User(id=1, email="test@example.com", name="Test User")


@pytest.fixture
def authenticated_client(sample_user):
    client = TestClient()
    client.login(sample_user)
    return client


def test_user_profile(authenticated_client, sample_user):
    response = authenticated_client.get("/profile")
    assert response.json["email"] == sample_user.email
```
Factory Pattern
```python
# Generate test data dynamically
from uuid import uuid4


class UserFactory:
    @staticmethod
    def create(**overrides):
        # Unique defaults keep tests from colliding on id or email.
        defaults = {
            "id": uuid4(),
            "email": f"user-{uuid4()}@test.com",
            "name": "Test User",
            "active": True
        }
        # Caller-supplied overrides win over the defaults.
        return User(**{**defaults, **overrides})


def test_inactive_users():
    inactive_user = UserFactory.create(active=False)
    assert not inactive_user.can_login()
```
Builder Pattern
```python
# Fluent interface for complex test data
# Order and Item are illustrative application classes.
class OrderBuilder:
    def __init__(self):
        self.order = Order()

    def with_item(self, name, price):
        self.order.items.append(Item(name, price))
        return self  # returning self enables method chaining

    def with_discount(self, percent):
        self.order.discount = percent
        return self

    def build(self):
        return self.order


def test_order_total():
    order = (OrderBuilder()
             .with_item("Widget", 100)
             .with_item("Gadget", 50)
             .with_discount(10)
             .build())

    assert order.total == 135  # (100 + 50) * 0.9
```
Common Testing Anti-Patterns
| Anti-Pattern | Symptoms | Solution |
|---|---|---|
| The Giant | Test with 100+ lines | Split into focused tests |
| The Mockery | Mock everything | Only mock external deps |
| The Sleeper | Uses sleep() for timing | Use proper async waiting |
| The Inspector | Tests implementation details | Test behavior, not internals |
| The Flickering | Randomly passes/fails | Fix race conditions, isolate state |
| The Secret Catcher | Catches all exceptions | Assert specific exceptions |
| The Slow Poke | Takes minutes to run | Optimize or move to integration |
| The Dodger | Skipped and ignored | Fix or remove dead tests |
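As a sketch of the fix for "The Sleeper", the helper below polls a condition with a deadline instead of sleeping for a fixed, guessed duration. `wait_until` is a hypothetical helper written for this example. Many frameworks ship their own waiting utilities, which are preferable when available.

```python
import threading
import time


def wait_until(condition, timeout: float = 2.0, interval: float = 0.01) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)  # short, bounded pause between checks
    return condition()  # one final check at the deadline


def test_background_job_finishes():
    done = threading.Event()
    # Simulated background work that completes after about 50 ms.
    worker = threading.Timer(0.05, done.set)
    worker.start()

    # Returns as soon as the event is set; fails only after the timeout.
    assert wait_until(done.is_set, timeout=2.0)
    worker.join()
```

The test finishes in roughly the time the work actually takes, and the timeout only matters when something is broken.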
Test Coverage Guidelines
Coverage Types
- Line Coverage: Which lines executed?
- Branch Coverage: Which branches taken?
- Path Coverage: Which execution paths?
- Mutation Testing: Are assertions meaningful?
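The difference between line and branch coverage is easiest to see on a small function. In the sketch below, `shipping_fee` is a hypothetical example: the first test alone executes every line, yet the path where the `if` is skipped is never taken.

```python
def shipping_fee(is_member: bool) -> float:
    """Hypothetical function: members ship free, others pay a flat fee."""
    fee = 5.0
    if is_member:
        fee = 0.0
    return fee


def test_member_ships_free():
    # Executes all four lines of shipping_fee: full line coverage,
    # but the "if is False" branch is never exercised.
    assert shipping_fee(is_member=True) == 0.0


def test_non_member_pays_flat_fee():
    # Covers the remaining branch.
    assert shipping_fee(is_member=False) == 5.0
```

Branch measurement is opt-in in common Python tooling: `coverage run --branch` or `pytest --cov-branch` with pytest-cov.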
Coverage Targets
- Critical Business Logic: 90%+
- Core Services: 80%+
- Utilities/Helpers: 70%+
- UI Components: 60%+
- Generated Code: 0% (exclude from coverage)
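Targets like these can be enforced with a small check in CI. The sketch below assumes the measured percentages have already been grouped by category. The category keys and the `coverage_shortfalls` helper are hypothetical names, not part of any coverage tool.

```python
# Minimum coverage per category, in percent (from the targets above).
TARGETS = {
    "critical": 90.0,
    "core": 80.0,
    "utilities": 70.0,
    "ui": 60.0,
}


def coverage_shortfalls(measured: dict[str, float],
                        targets: dict[str, float] = TARGETS) -> dict[str, float]:
    """Return how many percentage points each category is below target."""
    return {
        category: round(targets[category] - percent, 1)
        for category, percent in measured.items()
        if category in targets and percent < targets[category]
    }


measured = {"critical": 92.5, "core": 74.0, "ui": 60.0}
shortfalls = coverage_shortfalls(measured)  # only "core" is below target
```

A CI step could fail the build whenever the returned dictionary is non-empty.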
Coverage Commands
```bash
# Python
pytest --cov=src --cov-report=html
coverage run -m pytest && coverage report

# JavaScript/TypeScript
npm test -- --coverage
npx jest --coverage --collectCoverageFrom='src/**/*.ts'

# Go
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
```
CI/CD Integration
Test Pipeline Structure
```yaml
# Example GitHub Actions workflow
name: Tests
on: [push, pull_request]
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run unit tests
        run: npm run test:unit
  integration-tests:
    runs-on: ubuntu-latest
    needs: unit-tests
    services:
      postgres:
        image: postgres:15
        env:
          # Placeholder value; the postgres image does not start without a password.
          POSTGRES_PASSWORD: postgres
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run integration tests
        run: npm run test:integration
  e2e-tests:
    runs-on: ubuntu-latest
    needs: integration-tests
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run E2E tests
        run: npm run test:e2e
```
Parallel Test Execution
```bash
# Python - pytest-xdist
pytest -n auto  # Auto-detect CPU count

# JavaScript - Jest
jest --maxWorkers=4

# Go
go test -parallel 4 ./...
```
Test-Driven Development (TDD)
Red-Green-Refactor Cycle
┌─────────────────────────────────────────┐
│ 1. RED: Write a failing test │
├─────────────────────────────────────────┤
│ 2. GREEN: Write minimal code to pass │
├─────────────────────────────────────────┤
│ 3. REFACTOR: Clean up without breaking │
└─────────────────────────────────────────┘
↑ │
└──────────────────────────────────┘
TDD Example
```python
# Step 1: RED - Write failing test
def test_email_validation():
    validator = EmailValidator()
    assert validator.is_valid("user@example.com")
    assert not validator.is_valid("invalid")


# Step 2: GREEN - Make it pass (minimal)
class EmailValidator:
    def is_valid(self, email):
        return "@" in email


# Step 3: REFACTOR - Improve
import re


class EmailValidator:
    # Exactly one "@", with at least one "." in the domain part.
    EMAIL_PATTERN = re.compile(r"^[^@]+@[^@]+\.[^@]+$")

    def is_valid(self, email):
        return bool(self.EMAIL_PATTERN.match(email))
```
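The refactor step is only safe when the tests pin down more than the two original cases. As a sketch, the block below repeats the refactored validator from the example above and checks it against a small table of inputs. The extra cases are illustrative additions, not an exhaustive definition of a valid email address.

```python
import re


class EmailValidator:
    # Same pattern as the refactored version above.
    EMAIL_PATTERN = re.compile(r"^[^@]+@[^@]+\.[^@]+$")

    def is_valid(self, email: str) -> bool:
        return bool(self.EMAIL_PATTERN.match(email))


# (input, expected result) pairs covering the original cases plus edges.
CASES = [
    ("user@example.com", True),
    ("invalid", False),        # no "@"
    ("user@example", False),   # no "." in the domain
    ("a@@b.com", False),       # two "@" characters
    ("@example.com", False),   # empty local part
    ("", False),               # empty string
]


def test_email_validation_table():
    validator = EmailValidator()
    for email, expected in CASES:
        assert validator.is_valid(email) is expected, email
```

With pytest, the same table fits naturally into `@pytest.mark.parametrize`, which reports each case as a separate test.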
When to Use This Skill
Trigger Phrases:
- "How should I test..."
- "What's the best way to mock..."
- "My tests are flaky..."
- "How do I increase coverage..."
- "Write tests for..."
- "Help me set up testing..."
- "This test keeps failing..."
- "Should I use unit or integration tests for..."
Example Requests:
- "How should I test this async function?"
- "My tests are taking 10 minutes, help me speed them up"
- "How do I mock this external API?"
- "Help me set up pytest for this project"
- "Write unit tests for this class"
- "How do I test error handling?"
Test Review Checklist
Before shipping code with tests:
- Coverage adequate? Critical paths covered?
- Tests isolated? No shared state between tests?
- Fast enough? Unit tests under 100ms each?
- Clear intent? Test name describes behavior?
- No flakiness? Runs reliably 100 times?
- Proper assertions? Testing the right things?
- Edge cases covered? Null, empty, boundaries?
- Error cases tested? Exception handling verified?
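The last two checklist items (edge cases and error cases) can be covered compactly with table-driven tests. The sketch below uses the standard library's `unittest` so it runs without extra dependencies. `parse_quantity` is a hypothetical function under test, and the 1 to 100 range is an assumption for the example.

```python
import unittest


def parse_quantity(raw):
    """Hypothetical function under test: parse an integer from 1 to 100."""
    if raw is None or raw.strip() == "":
        raise ValueError("quantity is required")
    value = int(raw)  # raises ValueError for non-numeric input
    if not 1 <= value <= 100:
        raise ValueError("quantity must be between 1 and 100")
    return value


class ParseQuantityTests(unittest.TestCase):
    def test_boundaries_are_accepted(self):
        # Lowest and highest valid values, plus surrounding whitespace.
        for raw, expected in [("1", 1), ("100", 100), (" 42 ", 42)]:
            with self.subTest(raw=raw):
                self.assertEqual(parse_quantity(raw), expected)

    def test_invalid_input_raises_value_error(self):
        # Null, empty, blank, just outside each boundary, and non-numeric.
        for raw in [None, "", "   ", "0", "101", "abc", "-5"]:
            with self.subTest(raw=raw):
                with self.assertRaises(ValueError):
                    parse_quantity(raw)
```

Asserting on the specific exception type, rather than catching everything, also avoids "The Secret Catcher" anti-pattern.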
Framework Quick Reference
Python (pytest)
```bash
pip install pytest pytest-cov pytest-mock
pytest -v --tb=short
pytest --cov=src -x  # Stop on first failure
```
JavaScript (Jest)
```bash
npm install --save-dev jest
npx jest --watch  # Watch mode
npx jest --coverage
```
TypeScript (Vitest)
```bash
npm install --save-dev vitest
npx vitest run
npx vitest --coverage
```
Go
```bash
go test ./...
go test -v -race ./...  # With race detection
go test -cover ./...
```
Integration with Other Skills
- Architect: Test strategy follows architecture
- Code Review: Test coverage in review process
- Performance: Load testing and benchmarks
- Debugging: Test failures guide debugging
Skill designed for Thanos + Antigravity integration