Performance and Scalability Fitness Review

Analyze the codebase (or specified files/modules) for performance and scalability fitness. Identify hot paths, inefficient patterns, and scaling bottlenecks using evidence from the code.

Workflow

•Identify hot paths -- Use Grep/Glob to find request handlers, API endpoints, background jobs, and data processing pipelines. These are the entry points where performance matters most.
•Trace data flow -- For each hot path, trace how data moves: what gets queried, what gets cached, what gets computed. Map the critical path from request to response.
•Analyze algorithmic complexity -- Look for nested loops, repeated scans, unbounded collections, and O(n^2) patterns on hot paths. Check that data structures match access patterns (hash maps for lookups, sorted structures for range queries, bounded collections for caches).
•Audit database interactions -- Find N+1 query patterns (queries inside loops), missing indexes on filtered/joined columns, full table scans, missing connection pooling, and unbounded result sets.
•Evaluate caching -- Check for missing caches on repeated expensive work, improper invalidation, missing TTLs, cache stampede risk (hot key expiry without coalescing), unbounded cache growth, and cache penetration from high-cardinality keys.
•Assess scalability readiness -- Look for in-process state that prevents horizontal scaling, shared bottlenecks (single database, single queue), synchronous calls that could be async, and missing backpressure or rate limiting.
•Check resource utilization -- Identify unbounded memory growth (collections without limits, missing eviction), connection/thread pool sizing, large payload serialization on hot paths, and missing timeouts on external calls.
•Review data pipeline efficiency -- For batch or streaming pipelines, check processing patterns (full reload vs incremental/CDC), schema validation placement, error handling that stops entire pipelines, and monitoring gaps.
•Score each dimension with specific file:line evidence.
•Produce the report with scores, evidence, and prioritized action items.

Scoring Dimensions (1-10 each)

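Evaluate and score each dimension with evidence from the code. The 8-10 and 1-3 descriptions below anchor the two ends of the scale; use 4-7 for code that shows a mix of both.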

1. Algorithmic Efficiency

What to check:

•Big-O complexity of hot paths (request handlers, loops over data)
•Nested loops that create O(n^2) or worse behavior
•Linear scans where hash lookups or indexed searches would work
•Unbounded collections that grow with input size
•Data structures that mismatch their access patterns (e.g., lists used for lookups, hash maps used for ordered iteration)

What good looks like (8-10):

•Hot paths are O(n) or better
•Data structures match access patterns (hash maps for lookups, trees for ordered data)
•Collections have explicit size limits or eviction policies
•No nested loops over growing datasets on critical paths

What bad looks like (1-3):

•Nested loops over unbounded data on hot paths (O(n^2) or worse)
•Linear scans for lookups that should be hash-based
•Collections that grow without bounds in long-running processes
•Sorting or searching repeated inside loops instead of precomputing
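
As an illustration of the gap between the two ends of this scale, the sketch below joins orders to customers twice: once with a nested scan and once with a hash index built up front. The record shapes (`id`, `customer_id`, `name`) and function names are hypothetical, and customer ids are assumed to be unique.

```python
from typing import Dict, List, Tuple


def join_nested(orders: List[dict], customers: List[dict]) -> List[Tuple[int, str]]:
    """O(n * m): rescans the customer list for every order."""
    result = []
    for order in orders:
        for customer in customers:
            if customer["id"] == order["customer_id"]:
                result.append((order["id"], customer["name"]))
                break
    return result


def join_indexed(orders: List[dict], customers: List[dict]) -> List[Tuple[int, str]]:
    """O(n + m): builds a hash index once, then does O(1) lookups per order."""
    by_id: Dict[int, dict] = {c["id"]: c for c in customers}
    result = []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is not None:
            result.append((order["id"], customer["name"]))
    return result
```

Both functions return the same pairs; only the second stays cheap as the customer list grows. A finding of the first shape on a request handler would typically pull the score toward the 1-3 band.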

2. Database Design

What to check:

•N+1 query patterns (database queries inside loops)
•Missing indexes on columns used in WHERE, JOIN, and ORDER BY
•SELECT * instead of specific columns on large tables
•Missing connection pooling
•Unbounded query results (no LIMIT)
•Full table scans on large tables
•Missing EXPLAIN plan analysis for complex queries
•Denormalization decisions and their trade-offs

What good looks like (8-10):

•Queries use indexes; EXPLAIN plans show index scans
•Batch fetching replaces per-item queries
•Connection pools are sized and monitored
•Result sets are bounded with pagination or LIMIT
•Query patterns match the data model

What bad looks like (1-3):

•Queries inside for-loops (N+1 pattern)
•No indexes on frequently filtered columns
•Unbounded SELECT * on tables that grow
•No connection pooling; connections created per request
•Cross-shard or cross-partition queries on hot paths

3. Caching Strategy

What to check:

•Whether expensive repeated work is cached
•TTL settings and whether they match data freshness requirements
•Cache invalidation strategy (TTL, explicit, versioned keys)
•Stampede protection for hot keys (request coalescing, jittered TTLs)
•Unbounded cache growth (missing eviction policies, no size limits)
•Cache penetration risk (high-cardinality or nonexistent keys bypassing cache)
•Negative caching for "not found" results
•Cache observability (hit rate, miss latency, eviction counts)

What good looks like (8-10):

•High-frequency reads are cached with appropriate TTLs
•Cache has bounded size with eviction policy (LRU or similar)
•Hot keys have stampede protection (single-flight or jittered expiry)
•Cache hit/miss rates are instrumented
•Invalidation strategy matches write patterns

What bad looks like (1-3):

•No caching on repeated expensive operations
•Unbounded in-memory caches that grow until OOM
•All keys share one fixed TTL and expire together, causing synchronized misses
•No observability on cache behavior
•Cache treated as the source of truth, with no fallback to the backing store
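
Several of the "good" properties above (bounded size, LRU eviction, jittered expiry, hit/miss counters) fit in a small in-process cache. The following is a minimal single-threaded sketch; the class name, parameters, and defaults are hypothetical, and it deliberately leaves out request coalescing (single-flight) and thread safety.

```python
import random
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Tuple


class BoundedTTLCache:
    """LRU cache with a size cap, jittered TTLs, and hit/miss counters."""

    def __init__(
        self,
        max_entries: int = 1024,
        ttl_seconds: float = 60.0,
        jitter: float = 0.1,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._jitter = jitter  # fraction of the TTL, e.g. 0.1 means +/-10%
        self._clock = clock
        self._entries: "OrderedDict[Hashable, Tuple[float, Any]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def __len__(self) -> int:
        return len(self._entries)

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[1]

        self.misses += 1
        value = loader()
        # Jitter spreads expiry times so keys loaded together do not all
        # expire in the same instant.
        ttl = self._ttl * (1 + random.uniform(-self._jitter, self._jitter))
        self._entries[key] = (now + ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```

The `hits` and `misses` counters are the minimum needed to compute a hit rate; a production cache would export them to whatever metrics system the service already uses.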

4. Scalability Readiness

What to check:

•In-memory session state or local file storage that prevents horizontal scaling
•Stateless vs stateful service design
•Database as a shared bottleneck (single writer, no read replicas, no sharding plan)
•Synchronous blocking calls to downstream services
•Missing backpressure, rate limiting, or circuit breakers
•Single points of contention (global locks, single queues, hot partitions)
•Ability to add instances behind a load balancer

What good looks like (8-10):

•Services are stateless; state is externalized to databases or caches
•Downstream calls have timeouts, retries with backoff, and circuit breakers
•Database can scale reads (replicas) and writes (partitioning) independently
•Work can be distributed across instances without coordination overhead
•Async processing for non-critical-path work

What bad looks like (1-3):

•Session state stored in server memory
•No timeouts on external service calls
•Single database instance with no replication or scaling plan
•Synchronous fan-out to many services on the critical path
•Global locks or single-threaded bottlenecks
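
The "retries with backoff, and circuit breakers" item can be sketched as two small wrappers around a downstream call. Everything named here (`CircuitBreaker`, `CircuitOpenError`, `call_with_retries`, and their parameters) is hypothetical; the wrapped function is assumed to enforce its own timeout, and the breaker is not thread-safe.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let a trial call through (half-open state).
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None
        return result


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry with capped exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # retrying an open circuit only adds load
        except Exception:
            if attempt == attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    raise ValueError("attempts must be at least 1")
```

When reviewing, the presence of this shape matters more than the exact thresholds: unbounded retries without backoff tend to amplify an outage rather than absorb it.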

5. Resource Utilization

What to check:

•Memory: unbounded collections, large object retention, missing eviction
•CPU: tight loops, busy-waiting, unthrottled background work
•Connections: pool sizing, leak detection, proper cleanup
•Serialization: large payloads on hot paths, unnecessary marshalling
•Timeouts: missing or overly generous timeouts on I/O operations
•Thread/goroutine leaks in long-running services

What good looks like (8-10):

•Collections have capacity limits and eviction
•Connection pools are sized based on downstream capacity
•All external calls have bounded timeouts
•Large payloads are paginated or streamed
•Background work is throttled and observable

What bad looks like (1-3):

•Hash maps or lists that grow without limit
•No timeouts on database or HTTP calls
•Connection pools with no max size
•Large responses serialized fully into memory
•No monitoring of memory, connection, or thread usage
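
Two of the patterns above, bounded collections and streamed payloads, are easy to recognize once seen side by side. The sketch below uses only the standard library; the function names and the newline-delimited JSON output format are assumptions chosen for illustration.

```python
import json
from collections import deque
from typing import Deque, Iterable, Iterator, List


def make_recent_buffer(limit: int) -> Deque[dict]:
    """Bounded buffer: once full, appending drops the oldest item."""
    return deque(maxlen=limit)


def iter_pages(rows: Iterable[dict], page_size: int) -> Iterator[List[dict]]:
    """Group an iterable into pages without materializing all rows."""
    page: List[dict] = []
    for row in rows:
        page.append(row)
        if len(page) == page_size:
            yield page
            page = []
    if page:
        yield page


def stream_json_lines(rows: Iterable[dict], page_size: int = 500) -> Iterator[str]:
    """Yield newline-delimited JSON one page at a time.

    Peak memory is bounded by page_size rather than by the total row count.
    """
    for page in iter_pages(rows, page_size):
        yield "".join(json.dumps(row) + "\n" for row in page)
```

The contrasting anti-pattern is a handler that builds the full list of rows and calls `json.dumps` on it once, which holds both the rows and the serialized string in memory at the same time.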

6. Data Pipeline Efficiency

What to check:

•Full reload vs incremental processing (CDC, timestamps)
•Validation placement (early in pipeline vs late)
•Error handling (fail entire pipeline vs dead-letter problematic records)
•Monitoring and alerting on pipeline health
•Idempotency of pipeline stages
•Schema drift handling
•Data quality checks at pipeline boundaries

What good looks like (8-10):

•Incremental processing where possible (CDC, watermarks)
•Validation at extraction and transformation boundaries
•Failed records handled individually without stopping the pipeline
•Pipeline stages are idempotent and retryable
•Schema changes detected and handled gracefully

What bad looks like (1-3):

•Full table reloads on every run for large datasets
•No validation; bad data propagates to downstream consumers
•Single failure stops entire pipeline
•No monitoring; failures discovered when users complain
•No schema evolution strategy
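
A compact way to picture the "good" end of this dimension is a run that processes only records past a watermark, validates each one, routes failures to a dead-letter list, and writes through keyed upserts so reruns are idempotent. The record fields (`id`, `amount`, `updated_at`), the in-memory `target`, and all names below are hypothetical stand-ins for a real source and sink.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class PipelineState:
    watermark: int = 0  # highest updated_at already handled
    target: Dict[int, dict] = field(default_factory=dict)  # keyed sink: upserts are idempotent
    dead_letters: List[Tuple[dict, str]] = field(default_factory=list)


def validate(record: dict) -> dict:
    """Validate early, so bad data never reaches downstream consumers."""
    if not isinstance(record.get("id"), int):
        raise ValueError("missing or non-integer id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        raise ValueError("amount must be a non-negative number")
    return {"id": record["id"], "amount": float(amount), "updated_at": record["updated_at"]}


def run_incremental(source: Iterable[dict], state: PipelineState) -> int:
    """Process records newer than the watermark; return how many were loaded."""
    # In a real pipeline this filter would be pushed down into the source query.
    batch = [r for r in source if r["updated_at"] > state.watermark]
    loaded = 0
    for record in sorted(batch, key=lambda r: r["updated_at"]):
        try:
            clean = validate(record)
        except ValueError as exc:
            # One bad record is set aside; the rest of the batch continues.
            state.dead_letters.append((record, str(exc)))
        else:
            state.target[clean["id"]] = clean
            loaded += 1
        state.watermark = max(state.watermark, record["updated_at"])
    return loaded
```

A strict `>` watermark can skip late records that share a timestamp with the last processed one, so real pipelines often add an overlap window or a tie-breaking sequence column.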

Output Format

Write the report to docs/performance-review.md with this structure:

# Performance and Scalability Review

## Summary

Overall fitness score: X.X / 10 (average of dimensions)

| Dimension | Score | Key Finding |
|-----------|-------|-------------|
| Algorithmic Efficiency | X/10 | ... |
| Database Design | X/10 | ... |
| Caching Strategy | X/10 | ... |
| Scalability Readiness | X/10 | ... |
| Resource Utilization | X/10 | ... |
| Data Pipeline Efficiency | X/10 | ... |

## Detailed Findings

### Algorithmic Efficiency (X/10)
- Evidence: file:line references
- Issues found
- Recommendations

(repeat for each dimension)

## Top 5 Action Items (by impact)

1. [CRITICAL/HIGH/MEDIUM] Description -- file:line
2. ...

## Checklist Reference

See references/checklist.md for the full performance checklist.

Refer to the performance checklist at review-performance/references/checklist.md for detailed checks within each dimension.
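
The Summary block of the report is mechanical once the six scores exist: the overall score is the plain average of the dimension scores. The helper below is a hypothetical sketch of that arithmetic and of the table layout from the template; the function name and input shape are assumptions.

```python
from typing import Dict, Tuple

DIMENSIONS = [
    "Algorithmic Efficiency",
    "Database Design",
    "Caching Strategy",
    "Scalability Readiness",
    "Resource Utilization",
    "Data Pipeline Efficiency",
]


def render_summary(scores: Dict[str, Tuple[int, str]]) -> str:
    """Render the Summary section from {dimension: (score, key finding)}."""
    for name in DIMENSIONS:
        if name not in scores:
            raise ValueError(f"missing score for {name}")
        if not 1 <= scores[name][0] <= 10:
            raise ValueError(f"score for {name} must be between 1 and 10")

    overall = sum(scores[name][0] for name in DIMENSIONS) / len(DIMENSIONS)
    lines = [
        "## Summary",
        "",
        f"Overall fitness score: {overall:.1f} / 10 (average of dimensions)",
        "",
        "| Dimension | Score | Key Finding |",
        "|-----------|-------|-------------|",
    ]
    for name in DIMENSIONS:
        score, finding = scores[name]
        lines.append(f"| {name} | {score}/10 | {finding} |")
    return "\n".join(lines)
```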