Low-Latency Audit

A deep compliance audit in which the AI reads perf report text and source code directly.

Shell scripts collect data (perf record/report). AI does the analysis. This audit:

•Reads perf report text to understand what code is hot (sample percentages, call chains)
•Reads source code to understand function boundaries, templates, lambdas
•Identifies banned patterns in hot code with function-level precision
•Produces two outputs: identification report (what is hot, what patterns exist) and fix suggestions (how to address findings, for human review)

When to Use

•After implementing hot/warm path code — final compliance gate
•Before marking implementation tasks that touch HOT/WARM code as complete
•When spec latency claims seem ungrounded — cross-check with data
•Periodic review — catch drift as code evolves

When NOT to Use

•Spec quality review — use /spec-review instead
•Running benchmarks — run them first, then use this to interpret

Inputs

•No argument: audit all hot/warm path code in the project
•File path: audit a specific file
•--spec: also audit spec files for latency claim compliance

Workflow

Phase 1: Profiling Data Check

•Check profile-results/perf-reports/ exists and has .txt files
•If missing: warn "No profiling data — run tools/profile-hot-path.sh". Cannot identify hot paths without data. Do NOT fall back to directory-based guessing.
•If present: read the perf report text files
•Check freshness: compare the perf report file timestamps against the most recent .h/.cpp modification time
•If stale: warn "Profiling data may be outdated — consider re-running"
•If profile-results/flamegraph.svg exists, reference it for visual analysis

Phase 2: Hot-Path Identification (AI-Driven)

Read perf report text files in profile-results/perf-reports/. Each file contains:

•Default view: symbol-level profiling (comm, dso, symbol with sample percentages)
•Source file attribution: which source files contribute to hot code
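
To make the default view concrete, here is a rough sketch of pulling the sample percentage and symbol out of one row. It assumes rows shaped like `12.34%  comm  dso  [.] symbol` (the usual `perf report --stdio` layout without a children column) and a command name with no spaces; `PerfEntry` and `parse_perf_line` are hypothetical names. It uses `double` because it is offline tooling, not hot-path code.

```cpp
#include <cstdlib>
#include <optional>
#include <sstream>
#include <string>

struct PerfEntry {        // hypothetical record for one report row
    double percent;       // share of samples, e.g. 12.34
    std::string comm;     // command name
    std::string dso;      // shared object / binary
    std::string symbol;   // demangled symbol, may contain spaces
};

// Parses one data row; returns nullopt for headers ("# ..."), blank lines,
// and anything that does not match the assumed column layout.
inline std::optional<PerfEntry> parse_perf_line(const std::string& line) {
    std::istringstream in(line);
    std::string pct, comm, dso, marker;
    if (!(in >> pct >> comm >> dso >> marker)) return std::nullopt;
    if (pct[0] == '#' || pct.back() != '%') return std::nullopt;
    // The marker is "[.]" for user space, "[k]" for kernel, and so on.
    if (marker.size() != 3 || marker.front() != '[' || marker.back() != ']')
        return std::nullopt;
    std::string symbol;
    std::getline(in >> std::ws, symbol);  // rest of the line is the symbol
    if (symbol.empty()) return std::nullopt;
    pct.pop_back();                       // drop the trailing '%'
    char* end = nullptr;
    const double value = std::strtod(pct.c_str(), &end);
    if (end == pct.c_str() || *end != '\0') return std::nullopt;
    return PerfEntry{value, comm, dso, symbol};
}
```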

For each perf report:

•Identify project source files in the "Source file attribution" section
•Read those source files using Read tool
•Cross-reference symbols from the default view with source code to identify specific hot functions
•For each hot function, check for banned patterns
•Classify findings:
  - Banned pattern found: report file:line, pattern type, sample context
  - Design constraint: pattern exists due to external requirements (e.g., exchange JSON protocol requires float parsing) — report as finding, do NOT filter out
•Coverage gap detection: identify modules/elements that have no corresponding benchmark. These represent blind spots in hot-path coverage. Recommend writing benchmarks.

Key principle: report ALL findings. Do not judge whether a finding is "acceptable" or "fixable" during identification. That is a separate concern for the fix suggestions section.

Do NOT call check-hot-path.sh — that is a freshness reminder, not an analysis tool.

Phase 3: Fix Suggestions

For each finding from Phase 2, provide a fix suggestion:

•Actionable fix: concrete code change (e.g., "convert float price to int64_t tick units")
•Design constraint: explain why the pattern exists, suggest isolation strategies (e.g., "move float→int conversion to a dedicated function, minimize hot-path exposure")
•No fix needed: explain why (e.g., "compiler intrinsic, not actual floating point")

Fix suggestions are for human review — they are recommendations, not actions to take.
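
As an illustration of the "convert float price to int64_t tick units" suggestion, the sketch below parses a decimal string straight into integer ticks, so no `float` or `double` is ever formed. The function name, the truncation of extra fractional digits, and the rejection of signs and exponents are all assumptions made for the example; a real feed handler would follow its exchange's rules.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Parses a non-negative decimal such as "123.45" into ticks, where one tick
// is 10^-decimals of a unit. No floating point, heap allocation, or
// exceptions. Extra fractional digits are truncated (an assumed policy).
inline std::optional<std::int64_t> parse_price_ticks(std::string_view text,
                                                     int decimals) {
    constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    if (text.empty() || decimals < 0 || decimals > 18) return std::nullopt;
    std::int64_t value = 0;
    int frac_digits = 0;
    bool seen_dot = false;
    bool seen_digit = false;
    for (const char c : text) {
        if (c == '.') {
            if (seen_dot) return std::nullopt;   // second '.' is malformed
            seen_dot = true;
            continue;
        }
        if (c < '0' || c > '9') return std::nullopt;
        seen_digit = true;
        if (seen_dot) {
            if (frac_digits == decimals) continue;  // truncate extra precision
            ++frac_digits;
        }
        const int digit = c - '0';
        if (value > (kMax - digit) / 10) return std::nullopt;  // overflow
        value = value * 10 + digit;
    }
    if (!seen_digit) return std::nullopt;
    // Pad missing fractional digits: "123.4" with 2 decimals becomes 12340.
    for (; frac_digits < decimals; ++frac_digits) {
        if (value > kMax / 10) return std::nullopt;
        value *= 10;
    }
    return value;
}
```

With two decimals, `parse_price_ticks("123.45", 2)` yields 12345 ticks. Keeping the conversion in one small function like this is also one way to isolate a design-constraint finding at the parse boundary.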

Phase 4: Spec Compliance (with `--spec`)

For each HOT/WARM element described in specs:

•Latency budget is explicit and grounded
  - Every stage has a ns/us budget
  - Budget is labeled: [M]easured, [D]esign estimate, or [T]heoretical
  - No unqualified claims ("fast", "low latency", "approximately")
•Zone classification is consistent
  - Element classified as HOT/WARM/COLD in spec matches profiling data
  - If spec says HOT but source file doesn't appear in perf reports — investigate
•Data structures fit target cache level
  - Hot-path structs (<=64 bytes) should fit L1
  - Per-exchange book data should fit L2 or use prefetch
  - Total SHM footprint should fit L3 or use huge pages
•I/O model matches zone
  - HOT: no blocking syscalls (verify tier-specific behavior)
  - WARM: non-blocking allowed, no mutexes
  - COLD/CONSTRAINED: anything goes

Phase 5: Design Compliance

•CRTP verification: hot-path polymorphism uses CRTP, not virtual
•Integer arithmetic: prices/quantities use int64_t, band walk uses __int128
•Cache layout audit: for every struct on HOT/WARM path, check:
  - sizeof and alignas — documented and appropriate for use case
  - Cross-thread atomic fields on separate cache lines (alignas(64)) — no false sharing
  - static_assert(sizeof(...)) present — guards against silent struct growth
  - Struct fits target cache level: <=64B for L1-hot, <=1KB for L2, prefetch for larger
  - Producer/consumer fields isolated (e.g., write_pos and read_pos on different cache lines)
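
For orientation, code that passes these checks might look roughly like the sketch below. All type names (`ElementBase`, `SpreadElement`, `TopOfBook`, `SpscCursors`) are hypothetical, the 64-byte line size is an assumption that holds on typical x86-64 parts, and `__int128` is a GCC/Clang extension rather than standard C++.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// CRTP: the base reaches the derived class through a static_cast, so the
// call is resolved at compile time and needs no vtable lookup.
template <typename Derived>
struct ElementBase {
    std::int64_t on_tick(std::int64_t price_ticks) {
        return static_cast<Derived*>(this)->handle(price_ticks);
    }
};

struct SpreadElement : ElementBase<SpreadElement> {  // hypothetical element
    std::int64_t offset_ticks = 5;
    std::int64_t handle(std::int64_t price_ticks) {
        return price_ticks + offset_ticks;
    }
};

// Integer arithmetic: widen to 128 bits before multiplying so that
// price * quantity cannot overflow int64_t.
inline __int128 notional(std::int64_t price_ticks, std::int64_t qty) {
    return static_cast<__int128>(price_ticks) * qty;
}

inline constexpr std::size_t kCacheLine = 64;  // assumed cache line size

// L1-hot struct: padded and aligned to exactly one cache line.
struct alignas(kCacheLine) TopOfBook {
    std::int64_t bid_ticks;
    std::int64_t ask_ticks;
    std::int64_t bid_qty;
    std::int64_t ask_qty;
    std::uint64_t seq;
};
static_assert(sizeof(TopOfBook) == kCacheLine, "TopOfBook grew past one line");

// Producer and consumer cursors live on separate lines to avoid false sharing.
struct SpscCursors {
    alignas(kCacheLine) std::atomic<std::uint64_t> write_pos{0};  // producer
    alignas(kCacheLine) std::atomic<std::uint64_t> read_pos{0};   // consumer
};
static_assert(alignof(SpscCursors) == kCacheLine, "cursors must be line-aligned");
static_assert(sizeof(SpscCursors) == 2 * kCacheLine, "one line per cursor");
```

The `static_assert` lines are what turn silent struct growth into a build failure; an audit finding would be a HOT/WARM struct that lacks them.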

Phase 6: Measurement Verification

•Cross-reference spec performance claims with benchmark results and profiling data
•For each claim: find corresponding benchmark in test-reports/
•Classify: MEETS | EXCEEDS | MISSES | NO_DATA
•Flag any claim without measured backing

Zone Model (Reference)

```
HOT         <10us     0 alloc, 0 syscall, 0 indirect call, 0 float, 0 exception
WARM        <500us    same as HOT + larger sequential scans with prefetch
COLD        ms-level  STL, exceptions, heap alloc, virtual all OK
CONSTRAINED imposed   accept cost, isolate from HOT/WARM (gRPC, HTTP/2, TLS)
```

Banned Patterns (HOT/WARM)

| Category | Patterns |
|----------|----------|
| Indirect calls | `virtual`, `std::function`, `std::any`, `dynamic_cast`, `typeid` |
| Heap allocation | `new`, `delete`, `malloc`, `make_shared`, `make_unique`, `std::string` ctor, `vector::push_back` |
| System calls | `clock_gettime`, `gettimeofday`, `std::cout`, `printf` |
| Floating point | `double`, `float`, `stod`, `atof` |
| Exceptions | `throw`, `try`, `catch`, `.at()` |
| Blocking | `std::mutex`, `std::lock_guard`, `pthread_mutex_lock` |
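
To show how these patterns map onto source text, here is a naive lexical scan that flags a subset of the tokens above on identifier boundaries. It is only a sketch: it does not understand comments, string literals, or `= delete`, and it cannot detect `std::string` construction, which is why the audit relies on reading the code rather than on matching like this. `BannedPattern`, `Finding`, and `scan_source` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

struct BannedPattern { std::string_view token; std::string_view category; };

// Subset of the table above; extend as needed.
inline constexpr BannedPattern kBanned[] = {
    {"virtual", "Indirect calls"},  {"std::function", "Indirect calls"},
    {"dynamic_cast", "Indirect calls"},
    {"new", "Heap allocation"},     {"malloc", "Heap allocation"},
    {"push_back", "Heap allocation"},
    {"printf", "System calls"},     {"clock_gettime", "System calls"},
    {"double", "Floating point"},   {"float", "Floating point"},
    {"throw", "Exceptions"},        {"try", "Exceptions"},
    {".at(", "Exceptions"},
    {"std::mutex", "Blocking"},     {"std::lock_guard", "Blocking"},
};

inline bool is_ident(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// True if `tok` occurs in `line` as a whole token, so "new" does not match
// inside "renew". The boundary check is skipped on a side where the token
// itself starts or ends with punctuation (as ".at(" does).
inline bool contains_token(std::string_view line, std::string_view tok) {
    for (std::size_t pos = line.find(tok); pos != std::string_view::npos;
         pos = line.find(tok, pos + 1)) {
        const std::size_t end = pos + tok.size();
        const bool left_ok =
            !is_ident(tok.front()) || pos == 0 || !is_ident(line[pos - 1]);
        const bool right_ok =
            !is_ident(tok.back()) || end == line.size() || !is_ident(line[end]);
        if (left_ok && right_ok) return true;
    }
    return false;
}

struct Finding { std::size_t line_no; std::string_view token; std::string_view category; };

// Returns one finding per (line, pattern) hit; line numbers are 1-based.
inline std::vector<Finding> scan_source(std::string_view source) {
    std::vector<Finding> out;
    std::size_t line_no = 1;
    while (!source.empty()) {
        const std::size_t nl = source.find('\n');
        const std::string_view line = source.substr(0, nl);
        for (const auto& p : kBanned)
            if (contains_token(line, p.token))
                out.push_back({line_no, p.token, p.category});
        if (nl == std::string_view::npos) break;
        source.remove_prefix(nl + 1);
        ++line_no;
    }
    return out;
}
```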

Output Format

```markdown
# Low-Latency Audit Report

**Scope**: [all code | specific file]
**Date**: YYYY-MM-DD HH:MM
**Profiling data**: [present, dated YYYY-MM-DD | missing | stale]

## Summary

| Category | Findings | Coverage gaps |
|----------|----------|---------------|
| Banned patterns | N | - |
| Spec compliance | N | - |
| Design compliance | N | - |
| Measurement gaps | - | N |

## Hot-Path Identification

### File: path/to/file.h (X.X% samples in benchmark_name)
1. **line:N** — `float` in `FeedElement::parse_price()`
   Context: exchange sends JSON with float prices, conversion required
2. **line:M** — `std::string` ctor in `FeedElement::on_message()`
   Context: WebSocket frame handling

## Fix Suggestions

1. **file.h:N** — `float` → convert to int64_t tick units at parse boundary
   Type: design constraint — float→int conversion unavoidable, minimize scope
2. **file.h:M** — `std::string` → use std::string_view or pre-allocated buffer
   Type: actionable fix

## Coverage Gaps
- AggregatorElement: no benchmark exists
- GrpcBridgeElement: COLD zone, benchmark not required
```

Principles

•Identification ≠ optimization: report all findings without judging fixability. Fix suggestions are separate.
•AI reads perf report text, shell runs perf: clear separation of concerns
•Measured > estimated > ungrounded: prefer benchmark data over theoretical claims
•Re-profile after changes: new modules or optimizations invalidate old profiling data