low-latency-systems

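Design, diagnose, and optimize low-latency request paths in backend and realtime systems. Use this skill when analyzing p50/p95/p99 latency regressions, reducing queueing and lock contention, tuning network/serialization overhead, validating tail-latency improvements, or preparing latency sign-off evidence with strict percentile thresholds.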

SKILL.md

--- frontmatter

name: low-latency-systems
description: Design, diagnose, and optimize low-latency request paths in backend and realtime systems. Use when profiling p50/p95/p99 latency regressions, reducing queueing and lock contention, tuning network/serialization overhead, validating tail-latency improvements, or preparing latency sign-off evidence with strict percentile gates.

Low Latency Systems

Use this skill to turn latency incidents and regressions into measurable, reproducible fixes.

Workflow

•Lock the measurement context first.
  •Capture workload, concurrency, payload sizes, warmup policy, and hardware/runtime settings.
  •Run baseline and current in comparable environments, so that a delta reflects the change and not the setup.

•Decompose the latency path.
  •Split end-to-end latency into ingress, queue, compute, storage/network, and egress components.
  •Prioritize tail-latency contributors over improvements that only move the average.

•Apply targeted latency fixes.
  •Reduce blocking, contention, and unbounded queues.
  •Reduce allocation and serialization overhead in hot paths.
  •Use batching, caching, and async boundaries only when measurements show a benefit.

•Validate percentile regressions.
  •Compare baseline vs current percentiles (p50, p95, p99, optional p999).
  •Gate the release on the configured regression thresholds.

•Produce sign-off output.
  •Provide measured deltas, affected components/files, and residual risks.
  •Include exact rerun commands for verification.
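The percentile comparison in the validation step can be sketched roughly as follows. This is a minimal illustration, not the logic of the skill's own script: the function names `percentile` and `regressions`, the nearest-rank percentile definition, and the in-memory list inputs are all assumptions made for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latency samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def regressions(baseline, current, threshold_pct=5.0, pcts=(50, 95, 99)):
    """Return {label: delta_pct} for percentiles slower by more than threshold_pct.

    Hypothetical helper: an empty dict means no percentile breached the gate.
    """
    breaches = {}
    for pct in pcts:
        base = percentile(baseline, pct)
        cur = percentile(current, pct)
        if base <= 0:
            raise ValueError("baseline latency must be positive")
        delta_pct = (cur - base) / base * 100
        if delta_pct > threshold_pct:
            breaches[f"p{pct}"] = round(delta_pct, 2)
    return breaches
```

In this sketch a run whose median is unchanged but whose slowest requests doubled would still be flagged at p95 and p99, which is the reason the workflow gates on tail percentiles instead of the mean.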

Commands

bash

python3 scripts/compare_latency_runs.py \
  --baseline <baseline.json> \
  --current <current.json> \
  --threshold-pct 5

Treat a non-zero exit code as a blocking regression.
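One way to enforce the exit-code rule from an automation script is sketched below. The wrapper `run_gate` is a hypothetical helper, not part of the skill; it relies only on the standard `subprocess.run` call and makes no assumption about what the comparison script prints.

```python
import subprocess
import sys


def run_gate(cmd):
    """Run a gate command (list of arguments); return True only if it exits 0."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Any non-zero exit is treated as a blocker, whatever the cause.
        print(f"BLOCKER: gate exited with code {result.returncode}", file=sys.stderr)
        if result.stderr:
            print(result.stderr, file=sys.stderr)
        return False
    return True
```

A caller would pass the command shown above as an argument list, for example `["python3", "scripts/compare_latency_runs.py", "--baseline", baseline_path, "--current", current_path, "--threshold-pct", "5"]`, where `baseline_path` and `current_path` stand in for the real result files, and stop the release when `run_gate` returns `False`.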

Output Contract

Return:

•Latency Baseline: environment/workload assumptions.
•Findings: percentile deltas and hotspot classes.
•Optimization Plan: exact changes with expected impact.
•Verification: rerun commands and regression gates.
•Residual Risks: variance or unresolved tail spikes.
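The five sections above could be assembled with a small container like the one below. This is only a sketch: the `SignOff` class, its field names, and the plain-text layout are assumptions for illustration, and references/signoff-template.md remains the authority on the actual format.

```python
from dataclasses import dataclass, field


@dataclass
class SignOff:
    """Hypothetical holder for the five Output Contract sections."""
    latency_baseline: str
    findings: list = field(default_factory=list)
    optimization_plan: list = field(default_factory=list)
    verification: list = field(default_factory=list)
    residual_risks: list = field(default_factory=list)

    def render(self):
        """Render the sections in contract order as plain text."""
        sections = [
            ("Latency Baseline", [self.latency_baseline]),
            ("Findings", self.findings),
            ("Optimization Plan", self.optimization_plan),
            ("Verification", self.verification),
            ("Residual Risks", self.residual_risks),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"{title}:")
            # Empty sections are shown explicitly so that nothing is silently omitted.
            lines.extend(f"  - {item}" for item in (items or ["(none)"]))
        return "\n".join(lines)
```

Rendering empty sections as `(none)` is a deliberate choice in this sketch: a missing Residual Risks entry then reads as an explicit statement, not as an oversight.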

References

•references/workflow.md: detailed low-latency process.
•references/latency-playbook.md: bottleneck-to-fix mapping.
•references/signoff-template.md: concise sign-off format.

Execution Rules

•Prioritize tail latency (p95/p99) when evaluating user impact.
•Keep measurement setup stable across comparisons.
•Require before/after evidence for each claimed improvement.
•Escalate threshold breaches as blockers.