Load Balancing Patterns
Distribute traffic across infrastructure using the appropriate load balancing approach, from simple round-robin to global multi-region failover.
When to Use This Skill
Use load-balancing-patterns when:
- •Distributing traffic across multiple application servers
- •Implementing high availability and failover
- •Routing traffic based on URLs, headers, or geographic location
- •Managing session persistence across multiple backend servers
- •Deploying applications to Kubernetes clusters
- •Configuring global traffic management across regions
- •Implementing zero-downtime deployments (blue-green, canary)
- •Selecting between cloud-managed and self-managed load balancers
Core Load Balancing Concepts
Layer 4 vs Layer 7
Layer 4 (L4) - Transport Layer:
- •Routes based on IP address and port (TCP/UDP packets)
- •No application data inspection, lower latency, higher throughput
- •Protocol agnostic; can preserve client IP addresses (for example, in pass-through mode)
- •Use for: Database connections, video streaming, gaming, financial transactions, non-HTTP protocols
Layer 7 (L7) - Application Layer:
- •Routes based on HTTP URLs, headers, cookies, request body
- •Full application data visibility, SSL/TLS termination, caching, WAF integration
- •Content-based routing capabilities
- •Use for: Web applications, REST APIs, microservices, GraphQL endpoints, complex routing logic
For detailed comparison including performance benchmarks and hybrid approaches, see references/l4-vs-l7-comparison.md.
Load Balancing Algorithms
| Algorithm | Distribution Method | Use Case |
|---|---|---|
| Round Robin | Sequential | Stateless, similar servers |
| Weighted Round Robin | Capacity-based | Different server specs |
| Least Connections | Fewest active connections | Long-lived connections |
| Least Response Time | Fastest server | Performance-sensitive |
| IP Hash | Client IP-based | Session persistence |
| Resource-Based | CPU/memory metrics | Varying workloads |
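As a rough illustration of the first three rows of the table, the sketch below implements round robin, a naive weighted round robin, and least connections. The `Backend` class and its fields are hypothetical names introduced for this example; production load balancers typically use smoother weighting schemes than the simple expansion shown here.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int = 1              # relative capacity (hypothetical field)
    active_connections: int = 0  # would be maintained by the balancer


def round_robin(backends):
    """Return an iterator that cycles through backends in a fixed order."""
    return itertools.cycle(backends)


def weighted_round_robin(backends):
    """Naive weighting: each backend appears `weight` times per cycle."""
    expanded = [b for b in backends for _ in range(b.weight)]
    return itertools.cycle(expanded)


def least_connections(backends):
    """Pick the backend with the fewest active connections right now."""
    return min(backends, key=lambda b: b.active_connections)


servers = [Backend("a", weight=3), Backend("b", weight=1)]

rr = round_robin(servers)
first_four_rr = [next(rr).name for _ in range(4)]

wrr = weighted_round_robin(servers)
first_four_wrr = [next(wrr).name for _ in range(4)]

servers[0].active_connections = 5
least_loaded = least_connections(servers).name
```

Round robin ignores both capacity and current load, which is why it suits stateless traffic on similar servers; least connections reacts to load, which matters once connections are long-lived.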
Health Check Types
Shallow (Liveness): Is the process alive?
- •Endpoint: /health/live or /live
- •Returns: 200 if process running
- •Use for: Process monitoring, container health
Deep (Readiness): Can the service handle requests?
- •Endpoint: /health/ready or /ready
- •Validates: Database, cache, external API connectivity
- •Use for: Load balancer routing decisions
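The difference between the two check types can be sketched as plain functions, independent of any web framework. The function names and the shape of the `checks` mapping are assumptions made for this example; a real service would wire them to its liveness and readiness endpoints.

```python
from typing import Callable, Dict, Tuple


def liveness_status() -> int:
    """Liveness only proves the process can answer; it checks no dependencies."""
    return 200


def readiness_status(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, str]]:
    """Run every dependency check and return (HTTP status, per-dependency detail)."""
    details = {}
    for name, check in checks.items():
        try:
            details[name] = "ok" if check() else "failed"
        except Exception as exc:  # a check that raises counts as a failure
            details[name] = f"error: {exc}"
    status = 200 if all(value == "ok" for value in details.values()) else 503
    return status, details


def failing_cache() -> bool:
    """Hypothetical dependency probe that simulates an unreachable cache."""
    raise ConnectionError("cache unreachable")


ready_code, ready_details = readiness_status(
    {"database": lambda: True, "cache": failing_cache}
)
```

Returning 503 from readiness while liveness still returns 200 lets the load balancer stop routing to the instance without the orchestrator restarting it.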
Health Check Hysteresis: Different thresholds for marking up vs down to prevent flapping
- •Example: 3 failures to mark down, 2 successes to mark up
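A minimal sketch of that hysteresis rule, assuming consecutive-result counting. The class name is hypothetical; the parameter names `fall` and `rise` are borrowed from HAProxy's server options of the same name.

```python
class HysteresisTracker:
    """Track backend health with separate thresholds for going down and coming up."""

    def __init__(self, fall: int = 3, rise: int = 2):
        self.fall = fall      # consecutive failures needed to mark down
        self.rise = rise      # consecutive successes needed to mark up
        self.healthy = True
        self._streak = 0      # consecutive results that contradict the current state

    def record(self, success: bool) -> bool:
        """Feed one health check result and return the resulting health state."""
        if success == self.healthy:
            self._streak = 0  # result agrees with current state, so reset the streak
            return self.healthy
        self._streak += 1
        threshold = self.rise if success else self.fall
        if self._streak >= threshold:
            self.healthy = success
            self._streak = 0
        return self.healthy


tracker = HysteresisTracker(fall=3, rise=2)
results = [False, False, True, False, False, False, True, True]
states = [tracker.record(ok) for ok in results]
```

In this run the two early failures are forgiven by the success that follows, the backend is marked down only after three failures in a row, and it returns to service after two consecutive successes.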
For complete health check implementation patterns, see references/health-check-strategies.md.
Cloud Load Balancers
AWS Load Balancing
Application Load Balancer (ALB) - Layer 7:
- •Use for: HTTP/HTTPS applications, microservices, WebSocket
- •Features: Path/host/header routing, AWS WAF integration, Lambda targets
- •Choose when: Content-based routing needed
Network Load Balancer (NLB) - Layer 4:
- •Use for: Ultra-low latency (<1ms), TCP/UDP, static IPs, millions of requests per second
- •Features: Preserves source IP, TLS termination
- •Choose when: Non-HTTP protocols, performance critical
Global Accelerator - Layer 4 Global:
- •Use for: Multi-region applications, global users, DDoS protection
- •Features: Anycast IPs, automatic regional failover
GCP Load Balancing
- •Application LB (L7): Global HTTPS LB, Cloud CDN integration, Cloud Armor (WAF/DDoS)
- •Network LB (L4): Regional TCP/UDP, pass-through balancing, session affinity
- •Cloud Load Balancing: Single anycast IP, global distribution, backend buckets
Azure Load Balancing
- •Application Gateway (L7): WAF integration, URL-based routing, SSL termination, autoscaling
- •Load Balancer (L4): Basic and Standard SKUs, health probes, HA ports
- •Traffic Manager (Global): DNS-based routing (priority, weighted, performance, geographic)
For complete cloud provider configurations and Terraform examples, see references/cloud-load-balancers.md.
Self-Managed Load Balancers
NGINX
Best for: General-purpose HTTP/HTTPS load balancing, web application stacks
Capabilities:
- •HTTP reverse proxy with multiple algorithms
- •TCP/UDP stream load balancing
- •SSL/TLS termination
- •Passive health checks (open source), active health checks (NGINX Plus)
- •Cookie-based sticky sessions (NGINX Plus)
Basic configuration:
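upstream backend {
    least_conn;                                 # prefer the server with the fewest active connections
    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    keepalive 32;                               # idle upstream connections kept open per worker
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;                 # upstream keepalive needs HTTP/1.1
        proxy_set_header Connection "";         # and a cleared Connection header
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}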
For complete NGINX patterns and advanced configurations, see references/nginx-patterns.md.
HAProxy
Best for: Maximum performance, database load balancing, resource efficiency
Capabilities:
- •Highest raw throughput, lowest memory footprint
- •10+ load balancing algorithms
- •Sophisticated health checks (HTTP, TCP, Redis, MySQL, etc.)
- •Cookie or IP-based persistence
Basic configuration:
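defaults
    mode http                  # HAProxy defaults to tcp mode if unset
    timeout connect 5s         # illustrative timeout values; tune per workload
    timeout client  30s
    timeout server  30s

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check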
For complete HAProxy patterns, see references/haproxy-patterns.md.
Envoy
Best for: Microservices, Kubernetes, service mesh integration
Capabilities:
- •Cloud-native design with dynamic configuration (xDS APIs)
- •Circuit breakers, retries, timeouts
- •Advanced health checks (TCP, HTTP, gRPC)
- •Excellent observability
For complete Envoy patterns, see references/envoy-patterns.md.
Traefik
Best for: Docker/Kubernetes environments, dynamic configuration, ease of use
Capabilities:
- •Automatic service discovery
- •Native Kubernetes integration
- •Built-in Let's Encrypt support
- •Middleware system (auth, rate limiting)
For complete Traefik patterns, see references/traefik-patterns.md.
Kubernetes Ingress Controllers
Selection Guide
| Controller | Best For | Strengths |
|---|---|---|
| NGINX Ingress (F5) | General purpose | Stability, wide adoption, mature features |
| Traefik | Dynamic environments | Easy configuration, service discovery |
| HAProxy Ingress | High performance | Advanced L7 routing, reliability |
| Envoy (Contour/Gateway) | Service mesh | Rich L7 features, extensibility |
| Kong | API-heavy apps | JWT auth, rate limiting, plugins |
| Cloud Provider | Single-cloud | Native cloud integration |
Basic Ingress Example
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    # These annotation keys are read by the community ingress-nginx controller
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/affinity: "cookie"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
For complete Kubernetes ingress examples and Gateway API patterns, see references/kubernetes-ingress.md.
Session Persistence
Sticky Sessions (Use Sparingly)
Cookie-Based: Load balancer sets cookie to track server affinity
- •Accurate routing, works with NAT/proxies
- •HTTP only, adds cookie overhead
IP Hash: Hash client IP to select backend server
- •No cookie required, works for non-HTTP
- •Poor distribution with NAT/proxies
Drawbacks: Uneven load distribution, session lost on server failure, complicates scaling
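The IP hash behaviour, including the NAT problem, can be seen in a small sketch. The function name and the simple modulo scheme are assumptions for illustration; real load balancers often use consistent hashing so that fewer clients are remapped when the pool changes.

```python
import hashlib


def ip_hash_backend(client_ip: str, backends: list) -> str:
    """Map a client IP to a backend using a stable hash and a modulo over the pool."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["web1", "web2", "web3"]

# The same IP always lands on the same backend while the pool is unchanged.
first = ip_hash_backend("203.0.113.7", backends)
second = ip_hash_backend("203.0.113.7", backends)

# Clients behind one NAT gateway share a source IP, so all of them
# are sent to a single backend regardless of how many users there are.
nat_clients = {ip_hash_backend("198.51.100.1", backends) for _ in range(100)}
```

With plain modulo hashing, adding or removing a backend changes `len(backends)` and therefore remaps most clients, which is one reason sticky sessions complicate scaling.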
Shared Session Store (Recommended)
Architecture: Stateless application servers + centralized session storage (Redis, Memcached)
Benefits:
- •No sticky sessions needed
- •True load balancing
- •Server failures don't lose sessions
- •Horizontal scaling trivial
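The following sketch shows why a shared store removes the need for affinity. `SessionStore` is an in-memory stand-in for Redis or Memcached, and both class names are hypothetical; a real deployment would use a network client and set expiry on each session.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a shared store such as Redis (illustrative only)."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        self._data[session_id] = dict(session)

    def load(self, session_id):
        return self._data.get(session_id)


class AppServer:
    """Stateless application server: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        session_id = secrets.token_hex(16)
        self.store.save(session_id, {"user": user})
        return session_id

    def whoami(self, session_id):
        session = self.store.load(session_id)
        return session["user"] if session else None


store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

sid = server_a.login("alice")
del server_a                       # simulate the first server failing
user_after_failover = server_b.whoami(sid)
```

Because any server can load any session, the load balancer is free to use whichever algorithm distributes traffic best.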
Client-Side Tokens (Best for APIs)
JWT (JSON Web Tokens): Server generates signed token, client stores and sends with requests
Benefits:
- •Fully stateless servers
- •Perfect load balancing
- •No session storage needed
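To make the stateless property concrete, here is a sketch of HS256-style token issuing and verification using only the Python standard library. The function names and the demo secret are hypothetical, and the sketch always verifies with HMAC-SHA256 rather than trusting the header; production systems should generally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    """Build a compact signed token of the form header.payload.signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: float = None):
    """Return the claims if the signature is valid and the token is unexpired, else None."""
    try:
        header, payload, signature = token.split(".")
        signing_input = f"{header}.{payload}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return None  # malformed token, bad base64, or bad JSON
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) < current:
        return None
    return claims


SECRET = b"demo-secret-change-me"  # hypothetical key shared by every server
token = issue_token({"sub": "alice", "exp": 2_000_000_000}, SECRET)

claims = verify_token(token, SECRET, now=1_700_000_000)
wrong_key = verify_token(token, b"another-secret", now=1_700_000_000)
expired = verify_token(token, SECRET, now=2_000_000_001)
malformed = verify_token("not-a-token", SECRET)
```

Any server holding the secret can validate the token without a lookup, so requests can be routed to any backend.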
For complete session management patterns and code examples, see references/session-persistence.md.
Global Load Balancing
GeoDNS Routing
Route users to the nearest server based on geographic location:
- •DNS returns different IPs based on client location
- •Reduces latency, supports compliance and regional content
- •Implementation: AWS Route 53, GCP Cloud DNS, Azure Traffic Manager
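A simplified model of the routing decision: given a client's approximate coordinates, return the IP of the closest region by great-circle distance. The region names, coordinates, and documentation-range IPs are all made up for this sketch, and real GeoDNS services derive location from the resolver or client subnet rather than receiving coordinates directly.

```python
import math

# Hypothetical regions with approximate coordinates and documentation-range IPs.
REGIONS = {
    "us-east": {"lat": 39.0, "lon": -77.5, "ip": "192.0.2.10"},
    "eu-west": {"lat": 53.3, "lon": -6.3, "ip": "198.51.100.10"},
    "ap-southeast": {"lat": 1.35, "lon": 103.8, "ip": "203.0.113.10"},
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def resolve(client_lat, client_lon):
    """Return (region name, IP) for the region closest to the client."""
    name = min(
        REGIONS,
        key=lambda n: haversine_km(client_lat, client_lon, REGIONS[n]["lat"], REGIONS[n]["lon"]),
    )
    return name, REGIONS[name]["ip"]


paris = resolve(48.86, 2.35)
tokyo = resolve(35.68, 139.69)
new_york = resolve(40.71, -74.01)
```

Geographic distance is only a proxy for latency, which is why some providers also offer latency-based routing policies.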
Multi-Region Failover
Primary/secondary region configuration:
- •Health checks determine primary region health
- •Automatic DNS failover to secondary
- •Largely transparent to clients, although cached DNS answers delay the switch until their TTL expires
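The sketch below models DNS failover together with a client-side resolver cache, to show how the record's TTL bounds the time a client keeps using a failed primary. Both classes, the IPs, and the 60-second TTL are hypothetical values chosen for illustration.

```python
class FailoverRecord:
    """DNS record that answers with the primary IP unless it is unhealthy."""

    def __init__(self, primary_ip, secondary_ip, ttl_seconds=60):
        self.primary_ip = primary_ip
        self.secondary_ip = secondary_ip
        self.ttl_seconds = ttl_seconds

    def answer(self, primary_healthy):
        ip = self.primary_ip if primary_healthy else self.secondary_ip
        return ip, self.ttl_seconds


class CachingResolver:
    """Client-side resolver that reuses an answer until its TTL expires."""

    def __init__(self):
        self._cached_ip = None
        self._expires_at = 0

    def resolve(self, record, primary_healthy, now):
        if self._cached_ip is None or now >= self._expires_at:
            ip, ttl = record.answer(primary_healthy)
            self._cached_ip = ip
            self._expires_at = now + ttl
        return self._cached_ip


record = FailoverRecord("192.0.2.10", "198.51.100.10", ttl_seconds=60)
resolver = CachingResolver()

at_t0 = resolver.resolve(record, primary_healthy=True, now=0)
at_t10 = resolver.resolve(record, primary_healthy=False, now=10)   # primary just failed
at_t60 = resolver.resolve(record, primary_healthy=False, now=60)   # cache has expired
```

A shorter TTL speeds up failover at the cost of more DNS queries, so the value is a trade-off rather than a fixed recommendation.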
CDN Integration
Combine load balancing with CDN:
- •GeoDNS routes to closest CDN PoP
- •CDN caches content globally
- •Origin load balancing for cache misses
For complete global load balancing examples with Terraform, see references/global-load-balancing.md.
Decision Frameworks
L4 vs L7 Selection
Choose L4 when:
- •Protocol is TCP/UDP (not HTTP)
- •Ultra-low latency critical (<1ms)
- •High throughput required (millions of requests per second)
- •Client source IP preservation needed
Choose L7 when:
- •Protocol is HTTP/HTTPS
- •Content-based routing needed (URL, headers)
- •SSL termination required
- •WAF integration needed
- •Microservices architecture
Cloud vs Self-Managed
Choose Cloud-Managed when:
- •Single cloud deployment
- •Auto-scaling required
- •Team lacks load balancer expertise
- •Managed service preferred
Choose Self-Managed when:
- •Multi-cloud or hybrid deployment
- •Advanced routing requirements
- •Cost optimization important
- •Full control needed
- •Vendor lock-in avoidance
Self-Managed Selection
- •NGINX: General-purpose, web stacks, HTTP/3 support
- •HAProxy: Maximum performance, database LB, lowest resource usage
- •Envoy: Microservices, service mesh, dynamic configuration
- •Traefik: Docker/Kubernetes, automatic discovery, easy configuration
Configuration Examples
Complete working examples are available in the examples/ directory:
Cloud Providers:
- •examples/aws/alb-terraform.tf - AWS ALB with path-based routing
- •examples/aws/nlb-terraform.tf - AWS NLB for TCP load balancing
Self-Managed:
- •examples/nginx/http-load-balancing.conf - NGINX HTTP reverse proxy
- •examples/haproxy/http-lb.cfg - HAProxy configuration
- •examples/envoy/basic-lb.yaml - Envoy cluster configuration
- •examples/traefik/kubernetes-ingress.yaml - Traefik IngressRoute
Kubernetes:
- •examples/kubernetes/nginx-ingress.yaml - NGINX Ingress with TLS
- •examples/kubernetes/traefik-ingress.yaml - Traefik IngressRoute
- •examples/kubernetes/gateway-api.yaml - Gateway API configuration
Monitoring and Observability
Key Metrics
- •Throughput: Requests per second, bytes transferred, connection rate
- •Latency: Request duration (p50, p95, p99), backend response time, SSL handshake time
- •Errors: HTTP error rates (4xx, 5xx), backend connection failures, health check failures
- •Resource Utilization: CPU, memory, active connections, connection queue depth
- •Health: Healthy/unhealthy backend count, health check success rate
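As an example of how the latency and error metrics might be derived from a window of access log entries, the sketch below computes nearest-rank percentiles and a 5xx error rate. The function names and the `(latency_ms, status_code)` tuple layout are assumptions; monitoring systems usually compute percentiles from histograms rather than raw samples.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests):
    """Summarize a list of (latency_ms, status_code) tuples."""
    latencies = [latency for latency, _ in requests]
    total = len(requests)
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "error_rate_5xx": sum(1 for _, status in requests if status >= 500) / total,
    }


# 98 fast successful requests plus two slow backend errors.
window = [(float(i), 200) for i in range(1, 99)] + [(900.0, 502), (1200.0, 503)]
summary = summarize(window)
```

In this window the median and p95 look healthy while p99 jumps to 900 ms, which is why tail percentiles are tracked alongside the median.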
Load Balancer Logs
Enable access logs to capture request/response details, client IPs, response times, and errors:
- •AWS ALB: Store in S3, analyze with Athena
- •NGINX: Custom log format, ship to centralized logging
- •HAProxy: Syslog integration, structured logging
Troubleshooting
Uneven Load Distribution
- •Symptoms: One server receives disproportionate traffic
- •Causes: Sticky sessions with few clients, IP hash with NAT concentration, long-lived connections
- •Solutions: Switch to least connections, disable sticky sessions, implement connection draining
Health Check Flapping
- •Symptoms: Servers rapidly transition between healthy/unhealthy
- •Causes: Health check timeout too short, threshold too low, network instability
- •Solutions: Increase interval and timeout, implement hysteresis, use deep health checks
Session Loss After Failover
- •Symptoms: Users logged out when server fails
- •Causes: Sticky sessions without replication, in-memory sessions
- •Solutions: Implement shared session store (Redis), use client-side tokens (JWT)
Integration Points
Related Skills:
- •infrastructure-as-code - Deploy load balancers via Terraform/Pulumi
- •kubernetes-operations - Ingress controllers for K8s traffic management
- •network-architecture - Network design and topology for load balancing
- •deploying-applications - Blue-green and canary deployments via load balancers
- •observability - Load balancer metrics, access logs, distributed tracing
- •security-hardening - WAF integration, rate limiting, DDoS protection
- •service-mesh - Envoy as both ingress and service mesh proxy
- •implementing-tls - TLS termination and certificate management
Quick Reference
Selection Matrix
| Use Case | Recommended Solution |
|---|---|
| HTTP web app (AWS) | ALB |
| Non-HTTP protocol (AWS) | NLB |
| Kubernetes HTTP ingress | NGINX Ingress or Traefik |
| Maximum performance | HAProxy |
| Service mesh | Envoy |
| Docker Swarm | Traefik |
| Multi-cloud portable | NGINX or HAProxy |
| Global distribution | Cloudflare, AWS Global Accelerator |
Algorithm Selection
| Traffic Pattern | Algorithm |
|---|---|
| Stateless, similar servers | Round Robin |
| Stateless, different capacity | Weighted Round Robin |
| Long-lived connections | Least Connections |
| Performance-sensitive | Least Response Time |
| Session persistence needed | IP Hash or Cookie |
| Varying server load | Resource-Based |
Health Check Configuration
| Service Type | Check Type | Interval | Timeout |
|---|---|---|---|
| Web app | HTTP /health | 10s | 3s |
| API | HTTP /health/ready | 10s | 5s |
| Database | TCP connect | 5s | 2s |
| Critical service | HTTP deep check | 5s | 3s |
| Background worker | HTTP /live | 30s | 5s |
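The interval and timeout values above interact with the failure threshold to determine how long a dead backend keeps receiving traffic. The estimate below is a rough upper bound under stated assumptions: checks start on a fixed interval, a failed backend hangs until the timeout, and the backend dies just after passing a check. The function name is hypothetical, and the threshold of 3 is taken from the hysteresis example earlier in this document.

```python
def worst_case_detection_seconds(interval_s, timeout_s, fall_threshold):
    """Upper bound on time from backend failure to being marked down.

    The Nth failing check starts N intervals after the last passing check
    and only counts as failed once its timeout elapses.
    """
    return fall_threshold * interval_s + timeout_s


web_app = worst_case_detection_seconds(interval_s=10, timeout_s=3, fall_threshold=3)
database = worst_case_detection_seconds(interval_s=5, timeout_s=2, fall_threshold=3)
background_worker = worst_case_detection_seconds(interval_s=30, timeout_s=5, fall_threshold=3)
```

Under these assumptions the web app row gives roughly 33 seconds and the database row roughly 17 seconds, which helps explain why latency-critical services use shorter intervals.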
Summary
Load balancing is essential for distributing traffic, ensuring high availability, and enabling horizontal scaling. Choose L4 for raw performance and non-HTTP protocols, L7 for intelligent content-based routing. Prefer cloud-managed load balancers for simplicity and auto-scaling, self-managed for multi-cloud portability and advanced features. Implement proper health checks with hysteresis, avoid sticky sessions when possible, and monitor key metrics continuously.
For deployment patterns, see examples in examples/aws/, examples/nginx/, examples/kubernetes/, and other provider directories.