Error Handling Patterns
Build resilient applications with error handling strategies that recover gracefully from failures and make problems easy to debug.
When to Use This Skill
- •Implementing error handling in new features
- •Designing error-resilient APIs
- •Debugging production issues
- •Improving application reliability
- •Creating better error messages for users and developers
- •Implementing retry and circuit breaker patterns
- •Handling async/concurrent errors
- •Building fault-tolerant distributed systems
Core Concepts
1. Error Handling Philosophies
Exceptions vs Result Types:
- •Exceptions: Traditional try-catch, disrupts control flow
- •Result Types: Explicit success/failure, functional approach
- •Error Codes: C-style, requires discipline
- •Option/Maybe Types: For nullable values
When to Use Each:
- •Exceptions: Unexpected errors, exceptional conditions
- •Result Types: Expected errors, validation failures
- •Panics/Crashes: Unrecoverable errors, programming bugs
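The difference between returning a Result and raising an exception can be sketched as below. This is a minimal illustration under stated assumptions, not a library API: `Ok`, `Err`, `Result`, `parse_age`, and `parse_age_or_raise` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    """Hypothetical success wrapper."""
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    """Hypothetical failure wrapper."""
    error: E


# A Result is either Ok or Err; the caller must inspect which one it got.
Result = Union[Ok[T], Err[E]]


def parse_age(raw: str) -> Result[int, str]:
    """Result style: an expected failure (bad input) is returned as a value."""
    text = raw.strip()
    if not text.isdecimal():
        return Err(f"not a non-negative integer: {raw!r}")
    return Ok(int(text))


def parse_age_or_raise(raw: str) -> int:
    """Exception style: the same failure interrupts control flow."""
    result = parse_age(raw)
    if isinstance(result, Err):
        raise ValueError(result.error)
    return result.value
```

With the Result version the failure path is visible in the return type, which suits validation; the raising version keeps the happy path short, which suits conditions the caller does not expect.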
2. Error Categories
Recoverable Errors:
- •Network timeouts
- •Missing files
- •Invalid user input
- •API rate limits
Unrecoverable Errors:
- •Out of memory
- •Stack overflow
- •Programming bugs (null pointer, etc.)
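One way to act on this split is to retry only the recoverable category and let everything else propagate. The sketch below assumes a hypothetical `RecoverableError` marker class and a hypothetical `retry` helper with exponential backoff; neither comes from this document or from a specific library.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RecoverableError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[Exception], ...] = (RecoverableError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying only the listed error types with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error unchanged.
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Errors outside `retry_on` (for example a `KeyError` caused by a programming bug) are not caught, so they fail immediately instead of being retried. The `sleep` parameter is injectable so the helper can be exercised without real delays.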
Language-Specific Patterns
For detailed code examples in Python, TypeScript, Rust, and Go, see: 👉 examples/language-patterns.md
Universal Patterns
Pattern 1: Circuit Breaker
Prevent cascading failures in distributed systems.
python
from datetime import datetime, timedelta
from enum import Enum
from typing import Callable, Optional, TypeVar

T = TypeVar('T')


class CircuitState(Enum):
    CLOSED = "closed"        # Normal operation
    OPEN = "open"            # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing if recovered


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        timeout: timedelta = timedelta(seconds=60),
        success_threshold: int = 2
    ):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.success_threshold = success_threshold
        self.failure_count = 0
        self.success_count = 0
        self.state = CircuitState.CLOSED
        self.last_failure_time: Optional[datetime] = None

    def call(self, func: Callable[[], T]) -> T:
        if self.state == CircuitState.OPEN:
            # Once the timeout has elapsed, allow trial calls to probe recovery
            if datetime.now() - self.last_failure_time > self.timeout:
                self.state = CircuitState.HALF_OPEN
                self.success_count = 0
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func()
            self.on_success()
            return result
        except Exception:
            self.on_failure()
            raise

    def on_success(self):
        self.failure_count = 0
        if self.state == CircuitState.HALF_OPEN:
            self.success_count += 1
            if self.success_count >= self.success_threshold:
                self.state = CircuitState.CLOSED
                self.success_count = 0

    def on_failure(self):
        self.failure_count += 1
        self.last_failure_time = datetime.now()
        # Any failure while HALF_OPEN reopens the circuit immediately
        if (self.state == CircuitState.HALF_OPEN
                or self.failure_count >= self.failure_threshold):
            self.state = CircuitState.OPEN


# Usage (external_api stands in for a real client)
circuit_breaker = CircuitBreaker()


def fetch_data():
    return circuit_breaker.call(lambda: external_api.get_data())
Pattern 2: Error Aggregation
Collect multiple errors instead of failing on first error.
typescript
class ErrorCollector {
  private errors: Error[] = [];

  add(error: Error): void {
    this.errors.push(error);
  }

  hasErrors(): boolean {
    return this.errors.length > 0;
  }

  getErrors(): Error[] {
    return [...this.errors];
  }

  throw(): never {
    if (this.errors.length === 1) {
      throw this.errors[0];
    }
    throw new AggregateError(
      this.errors,
      `${this.errors.length} errors occurred`,
    );
  }
}

// Usage: Validate multiple fields
// (ValidationError, isValidEmail, and User are assumed to be defined elsewhere)
function validateUser(data: any): User {
  const errors = new ErrorCollector();

  if (!data.email) {
    errors.add(new ValidationError("Email is required"));
  } else if (!isValidEmail(data.email)) {
    errors.add(new ValidationError("Email is invalid"));
  }

  if (!data.name || data.name.length < 2) {
    errors.add(new ValidationError("Name must be at least 2 characters"));
  }

  if (!data.age || data.age < 18) {
    errors.add(new ValidationError("Age must be 18 or older"));
  }

  if (errors.hasErrors()) {
    errors.throw();
  }

  return data as User;
}
Pattern 3: Graceful Degradation
Provide fallback functionality when errors occur.
python
import logging
from typing import Callable, Optional, TypeVar

logger = logging.getLogger(__name__)

T = TypeVar('T')


def with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    log_error: bool = True
) -> T:
    """Try primary function, fall back to fallback on error."""
    try:
        return primary()
    except Exception as e:
        if log_error:
            logger.error(f"Primary function failed: {e}")
        return fallback()


# Usage (UserProfile and the fetch_* helpers are assumed to exist elsewhere)
def get_user_profile(user_id: str) -> UserProfile:
    return with_fallback(
        primary=lambda: fetch_from_cache(user_id),
        fallback=lambda: fetch_from_database(user_id)
    )


def try_function(func: Callable[[], Optional[T]]) -> Optional[T]:
    """Return the function's result, or None if it raises."""
    try:
        return func()
    except Exception:
        return None


# Multiple fallbacks
# Note: `or` also skips falsy results such as None or 0.0, not only failures.
def get_exchange_rate(currency: str) -> float:
    return (
        try_function(lambda: api_provider_1.get_rate(currency))
        or try_function(lambda: api_provider_2.get_rate(currency))
        or try_function(lambda: cache.get_rate(currency))
        or DEFAULT_RATE
    )
Best Practices
- •Fail Fast: Validate input early, fail quickly
- •Preserve Context: Include stack traces, metadata, timestamps
- •Meaningful Messages: Explain what happened and how to fix it
- •Log Appropriately: Log unexpected errors; don't spam logs with expected failures
- •Handle at Right Level: Catch where you can meaningfully handle
- •Clean Up Resources: Use try-finally, context managers, defer
- •Don't Swallow Errors: Log or re-throw, don't silently ignore
- •Type-Safe Errors: Use typed errors when possible
python
# Good error handling example
# (logger, db, payment_service, and the exception classes are assumed to be
# defined elsewhere in the application)
def process_order(order_id: str) -> Order:
    """Process order with comprehensive error handling."""
    try:
        # Validate input
        if not order_id:
            raise ValidationError("Order ID is required")

        # Fetch order
        order = db.get_order(order_id)
        if not order:
            raise NotFoundError("Order", order_id)

        # Process payment
        try:
            payment_result = payment_service.charge(order.total)
        except PaymentServiceError as e:
            # Log and wrap external service error
            logger.error(f"Payment failed for order {order_id}: {e}")
            raise ExternalServiceError(
                "Payment processing failed",
                service="payment_service",
                details={"order_id": order_id, "amount": order.total}
            ) from e

        # Update order
        order.status = "completed"
        order.payment_id = payment_result.id
        db.save(order)
        return order

    except ApplicationError:
        # Re-raise known application errors
        raise
    except Exception as e:
        # Log unexpected errors
        logger.exception(f"Unexpected error processing order {order_id}")
        raise ApplicationError(
            "Order processing failed",
            code="INTERNAL_ERROR"
        ) from e
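The example above raises `ValidationError`, `NotFoundError`, `ExternalServiceError`, and `ApplicationError` without defining them. A minimal sketch of a hierarchy that would accept those calls is shown below; the constructor signatures are inferred from how the example uses them, and the default `code` strings and the `details` layout are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class ApplicationError(Exception):
    """Base class for errors the application raises deliberately."""

    def __init__(
        self,
        message: str,
        code: str = "APPLICATION_ERROR",  # assumed default code
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.code = code
        self.details = details or {}


class ValidationError(ApplicationError):
    """Input failed validation."""

    def __init__(self, message: str, details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message, code="VALIDATION_ERROR", details=details)


class NotFoundError(ApplicationError):
    """A requested resource does not exist."""

    def __init__(self, resource: str, resource_id: str) -> None:
        super().__init__(
            f"{resource} not found: {resource_id}",
            code="NOT_FOUND",
            details={"resource": resource, "id": resource_id},
        )


class ExternalServiceError(ApplicationError):
    """A dependency outside the application failed."""

    def __init__(
        self,
        message: str,
        service: str,
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message, code="EXTERNAL_SERVICE_ERROR", details=details)
        self.service = service
```

Because every class derives from `ApplicationError`, a single `except ApplicationError` clause can re-raise known errors while a broader handler wraps anything unexpected.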
Common Pitfalls
- •Catching Too Broadly: except Exception hides bugs
- •Empty Catch Blocks: Silently swallowing errors
- •Logging and Re-throwing at Every Level: Creates duplicate log entries; log once, where the error is handled or wrapped
- •Not Cleaning Up: Forgetting to close files, connections
- •Poor Error Messages: "Error occurred" is not helpful
- •Returning Bare Error Codes: Easy to ignore; prefer exceptions or Result types
- •Ignoring Async Errors: Unhandled promise rejections
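For the async pitfall, the Python counterpart of an unhandled promise rejection is a task whose exception is never retrieved. A minimal sketch of collecting failures from concurrent work with `asyncio.gather(..., return_exceptions=True)` follows; `gather_with_errors` and `fetch` are hypothetical names used only for this illustration.

```python
import asyncio
from typing import Awaitable, List, Tuple, TypeVar

T = TypeVar("T")


async def gather_with_errors(
    tasks: List[Awaitable[T]],
) -> Tuple[List[T], List[BaseException]]:
    """Run awaitables concurrently and return successes and failures separately."""
    # return_exceptions=True puts raised exceptions into the result list
    # instead of propagating the first one and leaving the rest unobserved.
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    results = [o for o in outcomes if not isinstance(o, BaseException)]
    errors = [o for o in outcomes if isinstance(o, BaseException)]
    return results, errors


async def fetch(n: int) -> int:
    """Hypothetical operation that fails for one input."""
    await asyncio.sleep(0)
    if n == 2:
        raise ConnectionError(f"fetch {n} failed")
    return n * 10
```

A caller could run `results, errors = asyncio.run(gather_with_errors([fetch(1), fetch(2), fetch(3)]))` and then decide whether the collected errors should be logged, retried, or raised.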
Resources
- •references/exception-hierarchy-design.md: Designing error class hierarchies
- •references/error-recovery-strategies.md: Recovery patterns for different scenarios
- •references/async-error-handling.md: Handling errors in concurrent code
- •assets/error-handling-checklist.md: Review checklist for error handling
- •assets/error-message-guide.md: Writing helpful error messages
- •scripts/error-analyzer.py: Analyze error patterns in logs