The Performance Tuning Codex
"Premature optimization is the root of all evil. But mature optimization is the root of all delight."
The Performance Philosophy
The Golden Rules
code
RULE 1: MEASURE FIRST Don't guess where the bottleneck is. Profile. Measure. Prove. RULE 2: OPTIMIZE THE RIGHT THING 80% of time is spent in 20% of code. Find that 20%. RULE 3: SET TARGETS "Faster" is not a goal. "Under 200ms" is a goal. RULE 4: REGRESSION PREVENTION Performance is easy to lose. Benchmark continuously.
The Optimization Hierarchy
code
╔═══════════════════════════════════════════════════════════════════╗ ║ OPTIMIZATION HIERARCHY ║ ║ (Optimize in this order) ║ ╠═══════════════════════════════════════════════════════════════════╣ ║ ║ ║ 1. ALGORITHM │ O(n²) → O(n log n) = massive wins ║ ║ 2. DATA STRUCTURE │ Right structure for access pattern ║ ║ 3. I/O │ Network, disk, database calls ║ ║ 4. MEMORY │ Allocation, garbage collection ║ ║ 5. CPU │ Hot loops, cache efficiency ║ ║ ║ ║ (Don't optimize #5 if #1-4 are the problem) ║ ║ ║ ╚═══════════════════════════════════════════════════════════════════╝
Profiling
Types of Profiling
code
CPU PROFILING: • What functions take the most time? • Where are the hot paths? • What's the call graph? MEMORY PROFILING: • Where is memory allocated? • What's causing garbage collection? • Are there memory leaks? I/O PROFILING: • What queries are slow? • What network calls are made? • What files are accessed? TRACE PROFILING: • What's the full request lifecycle? • Where do requests spend time? • What's the concurrency pattern?
The Profiling Process
code
┌─────────────────────────────────────────────────────────────────┐ │ THE PROFILING CYCLE │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ 1. ESTABLISH BASELINE │ │ Measure current performance │ │ Record metrics: latency, throughput, resource usage │ │ │ │ 2. SET TARGET │ │ Define acceptable performance │ │ "P95 latency < 200ms" │ │ │ │ 3. PROFILE │ │ Identify bottlenecks │ │ Focus on top 3 issues │ │ │ │ 4. HYPOTHESIZE │ │ Why is this slow? │ │ What would make it faster? │ │ │ │ 5. OPTIMIZE │ │ Make ONE change │ │ Keep it isolated │ │ │ │ 6. MEASURE │ │ Did it help? │ │ Did it hurt anything else? │ │ │ │ 7. REPEAT │ │ Until target reached │ │ │ └─────────────────────────────────────────────────────────────────┘
Common Performance Patterns
The N+1 Query Problem
code
BAD: N+1 queries
┌──────────────────────────────────────────────────────────────┐
│ // Get all users (1 query) │
│ users = db.query("SELECT * FROM users") │
│ │
│ // For each user, get their orders (N queries) │
│ for user in users: │
│ orders = db.query("SELECT * FROM orders WHERE user_id=?")│
└──────────────────────────────────────────────────────────────┘
GOOD: Eager loading
┌──────────────────────────────────────────────────────────────┐
│ // Single query with JOIN │
│ SELECT users.*, orders.* │
│ FROM users │
│ LEFT JOIN orders ON orders.user_id = users.id │
│ │
│ // Or batch loading │
│ SELECT * FROM orders WHERE user_id IN (1, 2, 3, 4, 5) │
└──────────────────────────────────────────────────────────────┘
Caching Strategies
code
╔═══════════════════════════════════════════════════════════════════╗ ║ CACHING STRATEGIES ║ ╠═══════════════════════════════════════════════════════════════════╣ ║ ║ ║ CACHE-ASIDE (Lazy Loading) ║ ║ ┌─────────┐ ║ ║ │ Request │──┬──▶ Cache Hit ──▶ Return ║ ║ └─────────┘ │ ║ ║ └──▶ Cache Miss ──▶ DB ──▶ Cache ──▶ Return ║ ║ ║ ║ WRITE-THROUGH ║ ║ ┌─────────┐ ║ ║ │ Write │──▶ Cache ──▶ DB ──▶ Confirm ║ ║ └─────────┘ ║ ║ ║ ║ WRITE-BEHIND (Async) ║ ║ ┌─────────┐ ║ ║ │ Write │──▶ Cache ──▶ Confirm ║ ║ └─────────┘ │ ║ ║ └──▶ [Later] ──▶ DB ║ ║ ║ ╚═══════════════════════════════════════════════════════════════════╝ CACHE INVALIDATION: • TTL (Time To Live) - Simple but may serve stale data • Event-based - Invalidate on writes • Tag-based - Group related items
Connection Pooling
code
WITHOUT POOLING: ┌──────────┐ ┌──────────┐ │ Request │──▶──│ Connect │──▶ 50-100ms overhead └──────────┘ └──────────┘ WITH POOLING: ┌──────────┐ ┌──────────────┐ ┌──────────┐ │ Request │──▶──│ Pool Manager │──▶──│ Reuse │──▶ ~0ms └──────────┘ └──────────────┘ └──────────┘ POOL CONFIGURATION: • Min connections: Keep warm for base load • Max connections: Limit to prevent exhaustion • Idle timeout: Release unused connections • Connection lifetime: Prevent stale connections
Lazy Loading
code
EAGER (Load everything): ┌────────────────────────────────────────────────────────┐ │ class User: │ │ def __init__(self, id): │ │ self.profile = load_profile(id) # Always │ │ self.orders = load_orders(id) # Always │ │ self.preferences = load_prefs(id) # Always │ └────────────────────────────────────────────────────────┘ LAZY (Load on demand): ┌────────────────────────────────────────────────────────┐ │ class User: │ │ def __init__(self, id): │ │ self._id = id │ │ self._orders = None │ │ │ │ @property │ │ def orders(self): │ │ if self._orders is None: │ │ self._orders = load_orders(self._id) │ │ return self._orders │ └────────────────────────────────────────────────────────┘
Database Optimization
Index Optimization
code
WHEN TO INDEX: ✓ Columns in WHERE clauses ✓ Columns in JOIN conditions ✓ Columns in ORDER BY ✓ Columns with high selectivity WHEN NOT TO INDEX: ✗ Small tables (full scan is faster) ✗ Columns with low selectivity (gender, boolean) ✗ Tables with heavy writes (index maintenance cost) ✗ Columns rarely queried COMPOSITE INDEX ORDER: • Equality conditions first • Range conditions last • Most selective first INDEX (status, created_at) -- status = 'active' AND created_at > ?
Query Optimization
code
EXPLAIN ANALYZE: Always explain before optimizing. ┌────────────────────────────────────────────────────────────────┐ │ EXPLAIN ANALYZE │ │ SELECT * FROM orders │ │ WHERE user_id = 123 AND status = 'pending' │ │ ORDER BY created_at DESC │ │ LIMIT 10; │ │ │ │ Look for: │ │ • Seq Scan (bad on large tables) │ │ • Index Scan (good) │ │ • Sort (expensive if not indexed) │ │ • Rows vs estimated rows (accuracy of stats) │ └────────────────────────────────────────────────────────────────┘ COMMON FIXES: • Add missing indexes • Rewrite subqueries as JOINs • Use LIMIT for pagination • Avoid SELECT * in production • Partition large tables
Frontend Performance
Critical Rendering Path
code
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ HTML │──▶──│ CSS │──▶──│ JS │──▶──│ Render │
│ Parse │ │ Parse │ │ Execute │ │ Paint │
└─────────┘ └─────────┘ └─────────┘ └─────────┘
│ │ │
▼ ▼ ▼
DOM CSSOM Execute
Build Build & Modify
OPTIMIZATION:
1. Minimize critical resources
2. Minimize critical bytes
3. Minimize critical path length
Core Web Vitals
code
LCP (Largest Contentful Paint): Target: < 2.5s • Optimize images • Preload critical resources • Use CDN FID (First Input Delay): Target: < 100ms • Break up long tasks • Defer non-critical JS • Use web workers CLS (Cumulative Layout Shift): Target: < 0.1 • Set image dimensions • Reserve space for ads • Avoid inserting content above fold
Bundle Optimization
code
CODE SPLITTING:
// Instead of one large bundle
import { everything } from 'huge-library';
// Load on demand
const HeavyComponent = lazy(() => import('./HeavyComponent'));
TREE SHAKING:
// Bad: imports everything
import _ from 'lodash';
// Good: imports only what's used
import { debounce } from 'lodash-es';
COMPRESSION:
• Gzip: 70-90% reduction
• Brotli: 15-20% better than Gzip
• Enable on server and CDN
Concurrency & Parallelism
Async Patterns
code
SEQUENTIAL (Slow): ┌────────────────────────────────────────────────────────────────┐ │ result1 = await fetchUser() // 100ms │ │ result2 = await fetchOrders() // 150ms │ │ result3 = await fetchProducts() // 120ms │ │ // Total: 370ms │ └────────────────────────────────────────────────────────────────┘ PARALLEL (Fast): ┌────────────────────────────────────────────────────────────────┐ │ [user, orders, products] = await Promise.all([ │ │ fetchUser(), │ │ fetchOrders(), │ │ fetchProducts() │ │ ]) │ │ // Total: 150ms (slowest call) │ └────────────────────────────────────────────────────────────────┘
Rate Limiting & Backpressure
code
RATE LIMITING: ┌────────────────────────────────────────────────────────────────┐ │ Token Bucket Algorithm: │ │ │ │ • Bucket has capacity (e.g., 100 tokens) │ │ • Tokens added at fixed rate (e.g., 10/second) │ │ • Each request consumes a token │ │ • No tokens = request rejected │ └────────────────────────────────────────────────────────────────┘ BACKPRESSURE: ┌────────────────────────────────────────────────────────────────┐ │ When producer is faster than consumer: │ │ │ │ Options: │ │ • Drop: Discard excess (lossy) │ │ • Buffer: Queue until processed (memory risk) │ │ • Sample: Process every Nth item │ │ • Slow down: Signal producer to wait │ └────────────────────────────────────────────────────────────────┘
Monitoring & Metrics
Key Metrics
code
THE FOUR GOLDEN SIGNALS:
┌─────────────────────────────────────────────────────────────┐
│ 1. LATENCY │ Time to serve a request │
│ 2. TRAFFIC │ Requests per second │
│ 3. ERRORS │ Rate of failed requests │
│ 4. SATURATION │ How "full" the service is │
└─────────────────────────────────────────────────────────────┘
PERCENTILES:
• P50 (median): Typical experience
• P95: Most users' worst experience
• P99: Tail latency (important!)
• Max: Absolute worst case
Note: Average is misleading.
A few slow requests hide in the average.
Benchmarking
code
MICRO-BENCHMARKS: • Test specific functions • Isolate from I/O • Run many iterations • Beware of JIT warmup LOAD TESTING: • Simulate realistic traffic • Measure at various loads • Find the breaking point • Test failure scenarios TOOLS: • k6, Artillery, Locust (load testing) • wrk, hey (HTTP benchmarking) • hyperfine (CLI benchmarking)
Quick Reference
Performance Checklist
code
□ Profiled to find actual bottlenecks □ Set measurable performance targets □ Optimized hot paths first □ Added appropriate caching □ Minimized I/O operations □ Used connection pooling □ Indexed frequently queried columns □ Implemented lazy loading where appropriate □ Set up performance monitoring □ Established performance regression tests
Common Performance Wins
code
| Problem | Solution | |----------------------|-----------------------------| | N+1 queries | Eager loading, batch | | Slow queries | Add indexes, optimize SQL | | Large payloads | Pagination, compression | | Repeated computation | Caching, memoization | | Synchronous waits | Async, parallel execution | | Cold starts | Warmup, connection pools | | Large bundles | Code splitting, tree shake | | Slow images | Lazy load, WebP, CDN |
The Performance Mantras
code
"Measure first, optimize second" "The fastest code is code that doesn't run" "Cache invalidation is hard; TTL is your friend" "Profile in production, not just development" "Optimize for the common case"
"Performance is not about making things fast. It's about removing what makes things slow."