java-jdbc-performance
Intent
Improve database interaction performance by reducing:
- •network round-trips,
- •per-statement overhead,
- •connection churn,
- •inefficient result fetching, while ensuring changes are measurable and regression-safe.
When to use
- •Slow queries at the application layer (even when DB looks okay)
- •High DB CPU due to too many small queries
- •Thread pool saturation caused by DB waits
- •Connection pool exhaustion or timeouts
- •Large result sets causing memory spikes or long tail latency
Scope
In scope
- •Connection pool configuration sanity (HikariCP)
- •PreparedStatement usage, batching, fetch size
- •Transaction scope and autocommit implications
- •Read streaming patterns for large result sets
- •Basic measurement and before/after benchmarking guidance
Out of scope
- •Schema/index/query-plan design (belongs to DB optimization + migrations skill)
- •ORM-level fetch shaping (see java-jpa-transaction-patterns)
Step 0 — Baseline measurement (no tuning without data)
Collect:
- •p50/p95/p99 latency per endpoint
- •DB time vs CPU time (if you have tracing/metrics)
- •Connection pool metrics: active, idle, pending threads
- •Query counts and average statement duration
- •Error counters: timeouts, transient failures, deadlocks
Minimal reproducible benchmark:
- •A single representative operation loop (N requests) with stable input data
- •Warm-up phase + measurement phase
- •Record:
- •throughput (ops/s)
- •latency distribution
- •CPU usage
- •GC metrics
Step 1 — Connection pool sanity (HikariCP baseline)
Goals:
- •avoid too-small pools causing queueing,
- •avoid too-large pools causing DB overload,
- •ensure lifecycle settings prevent stale connections.
Baseline rules:
- •maxPoolSize should align with DB capacity and app concurrency (not “CPU cores * 2” blindly)
- •set maxLifetime < DB idle timeout to avoid broken connections
- •prefer explicit connectionTimeout to fail fast under saturation
- •keep idleTimeout reasonable to avoid churn
- •use leakDetectionThreshold only for debugging (it adds overhead)
Step 2 — Reduce per-query overhead
Use PreparedStatement for repeated queries
- •ensures server-side plan caching (varies by DB)
- •avoids SQL string concatenation and injection risks
Avoid SELECT *
- •fetch only needed columns
- •reduces IO and object mapping overhead
Use server-side pagination for lists
- •LIMIT/OFFSET (or keyset pagination for deep paging)
- •never pull massive lists into memory
Step 3 — Batching writes (massive throughput gains)
Use batching for:
- •insert/update of many rows
- •bulk status updates
- •event/outbox inserts
Guidelines:
- •choose batch size (e.g., 100–1000) based on row width and DB
- •wrap batch in explicit transaction (autocommit off)
- •use addBatch/executeBatch
- •handle partial failures and retry carefully (idempotency keys where needed)
Example skeleton (JDBC):
- •conn.setAutoCommit(false)
- •PreparedStatement ps = conn.prepareStatement(...)
- •loop: ps.addBatch() if (i % batchSize == 0) ps.executeBatch()
- •ps.executeBatch()
- •conn.commit()
Step 4 — Fetch size and streaming reads
Large result sets:
- •default behavior often fetches too much into memory
- •configure fetch size and use forward-only ResultSet if supported
Guidelines:
- •setFetchSize(N) on Statement/PreparedStatement
- •for streaming:
- •use forward-only, read-only ResultSet
- •keep transaction open only as needed
- •avoid mapping entire result set into a list; process row-by-row
Important:
- •some drivers require autocommit=false for cursor-based fetching
- •verify driver-specific behavior before assuming streaming
Step 5 — Transaction and autocommit semantics
- •Autocommit=true means each statement is its own transaction
- •adds commit overhead
- •may defeat cursor streaming in some drivers
- •For batch writes and streaming reads:
- •set autocommit=false
- •commit explicitly
Step 6 — Network and timeouts
- •set query timeout where appropriate
- •ensure socket/read timeouts are configured at driver or datasource level
- •treat timeouts as signals: don’t just increase them; reduce load and improve queries
Verification and safety checks
- •Validate correctness:
- •row counts
- •idempotency for retries
- •Validate resource usage:
- •memory remains stable in streaming operations
- •connection pool does not get pinned by long transactions
- •Validate observability:
- •metrics show reduced query counts or reduced DB time
Output artifacts
- •A short “Performance Change Log”:
- •baseline metrics
- •changes applied (pool/batch/fetch)
- •after metrics
- •Code diffs: datasource config + JDBC call sites
- •A benchmark snippet or reproducible load script
Definition of Done (DoD)
- • Baseline and after numbers recorded (latency + throughput + pool metrics)
- • Connection pool settings reviewed against DB timeouts
- • Batching enabled where appropriate and tested
- • Fetch size/streaming approach validated with driver behavior
- • Regression tests added for correctness (especially around batching)
- • Rollback path defined (config toggles / feature flags)
Guardrails (What NOT to do)
- •Do not increase pool size blindly; you can DDoS your own DB
- •Do not batch with autocommit=true and expect real gains
- •Do not stream huge results while also doing slow per-row processing in the same transaction
- •Do not “optimize” without recording baseline and after metrics
Cursor usage (recommended)
Attach:
- •DataSource config (Hikari)
- •JDBC call sites (DAO/repository)
- •Production symptoms: pool metrics, timeouts, p95/p99 graphs Prompt snippet: “Use java-jdbc-performance. Propose measurable improvements (pool + batching + fetch size). Provide a micro-benchmark plan and regression tests.”