AgentSkillsCN

design-arbitrage

Design of latency-sensitive trading and arbitrage systems -- execution engines, risk management, and cross-venue trading patterns.

SKILL.md
---
name: design-arbitrage
description: Latency-sensitive trading and arbitrage system design -- execution engines, risk management, and cross-venue patterns
---

What I do

  • Provide architecture frameworks for latency-sensitive trading systems
  • Document tick-to-trade pipeline design and execution engine patterns
  • Define risk management as a first-class design constraint
  • Cover cross-venue and cross-chain arbitrage patterns

When to use me

Use this skill when designing trading systems, arbitrage bots, or latency-sensitive execution infrastructure. Pair with design-core for foundational design principles and decision frameworks.

Latency Budget Framework

Allocate your total latency budget across pipeline stages. Every microsecond matters -- measure, don't guess.

| Pipeline Stage | Target (CEX) | Target (DEX) | Optimization Lever |
|---|---|---|---|
| Market data ingestion | <100 us | <10 ms | Binary protocols, kernel bypass, co-location |
| Signal generation | <50 us | <5 ms | Pre-computed tables, SIMD, branch-free logic |
| Risk check | <10 us | <1 ms | Lock-free data structures, pre-validated limits |
| Order construction | <20 us | <5 ms | Pre-built templates, connection pooling |
| Network transit | <1 ms | <100 ms | Co-location, direct market access, private mempools |
| Execution confirmation | <5 ms | 1-12 s | Optimistic execution, parallel confirmation |

Rule: Measure end-to-end latency at p99, not p50. The tail kills profits.
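The p99 rule can be enforced mechanically: collect per-stage latency samples and flag any stage whose p99 exceeds its budget. A minimal sketch -- the stage names and microsecond budgets are illustrative, taken from the CEX column:

```python
# Check per-stage latency samples against a budget at p99, not p50.
# Stage names and microsecond budgets are illustrative (CEX column above).

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

BUDGET_US = {
    "market_data": 100,
    "signal": 50,
    "risk_check": 10,
    "order_build": 20,
}

def over_budget(samples_by_stage, budgets=BUDGET_US, p=99.0):
    """Return {stage: p99} for every stage whose p99 exceeds its budget."""
    return {
        stage: percentile(samples, p)
        for stage, samples in samples_by_stage.items()
        if percentile(samples, p) > budgets[stage]
    }
```

Note that a stage can look healthy at p50 and still fail here: five bad ticks in a hundred are invisible to the median but dominate the p99.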

Tick-to-Trade Pipeline Architecture

```
Market Data Feed(s)
  |
  v
Feed Handler (normalize, deduplicate, sequence)
  |
  v
Order Book Reconstruction (L2/L3 book maintenance)
  |
  v
Signal Engine (strategy logic, opportunity detection)
  |
  v
Risk Gate (pre-trade checks, position limits, exposure)
  |
  v
Execution Engine (smart order routing, venue connectivity)
  |
  v
Confirmation Handler (fill tracking, position update)
  |
  v
Post-Trade (reconciliation, PnL, reporting)
```

Critical path: Feed Handler -> Signal -> Risk -> Execution. Everything else is off the hot path. Never add latency to the critical path for logging, metrics, or persistence.
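The critical path reads as a straight function chain, with everything else handed off to a queue for later draining. A minimal single-threaded sketch -- all handler objects (`book`, `strategy`, `risk`, `executor`) are hypothetical stand-ins for the stages above:

```python
from collections import deque

# Off-hot-path work (logging, metrics, persistence) is queued, never done
# inline; a background thread drains this queue.
off_path = deque()

def on_tick(tick, book, strategy, risk, executor):
    """Critical path: feed -> book -> signal -> risk -> execution."""
    book.apply(tick)                   # feed handler already normalized the tick
    order = strategy.evaluate(book)    # signal engine
    if order is None:
        return None
    if not risk.allows(order):         # risk gate: pre-trade checks
        off_path.append(("rejected", order))
        return None
    ack = executor.send(order)         # execution engine
    off_path.append(("sent", order))   # deferred reporting, off the hot path
    return ack
```

The only work done inline is the four stage calls; everything observable about the decision is appended to `off_path` and processed elsewhere.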

Market Data Normalization & Distribution

  • Feed handlers -- One per venue. Normalize to internal format at the edge. Binary protocols (FIX/FAST, WebSocket binary) over JSON.
  • Order book reconstruction -- Maintain local order book from incremental updates. Detect gaps and request snapshots. Never trust stale books.
  • Multi-venue aggregation -- Merge books across venues for best bid/offer (BBO). Account for fees, latency, and fill probability.
  • Distribution -- Shared memory or lock-free ring buffers for intra-process. Kernel bypass (DPDK, io_uring) for inter-process.
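Gap detection on incremental updates might look like the following sketch; the update fields (`seq`, `side`, `price`, `size`) are assumptions about a normalized internal format:

```python
class L2Book:
    """Maintain an L2 book from sequenced incremental updates.

    A sequence gap invalidates the book until a snapshot is re-applied;
    a stale book must never be traded on.
    """

    def __init__(self):
        self.bids, self.asks = {}, {}   # price -> size
        self.last_seq = None
        self.valid = False

    def apply_snapshot(self, seq, bids, asks):
        self.bids, self.asks = dict(bids), dict(asks)
        self.last_seq = seq
        self.valid = True

    def apply_update(self, seq, side, price, size):
        if self.last_seq is not None and seq != self.last_seq + 1:
            self.valid = False          # gap detected: request a snapshot
            return False
        self.last_seq = seq
        levels = self.bids if side == "bid" else self.asks
        if size == 0:
            levels.pop(price, None)     # size 0 deletes the level
        else:
            levels[price] = size
        return True

    def best_bid(self):
        return max(self.bids) if self.valid and self.bids else None
```

`best_bid` returning `None` while invalid is the point: downstream signal code cannot accidentally quote off a gapped book.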

Execution Engine Patterns

| Pattern | Latency | Complexity | Best For |
|---|---|---|---|
| Event-driven (single-threaded) | Lowest | Low | Simple strategies, single venue |
| Actor model | Low | Medium | Multi-strategy, multi-venue |
| Lock-free pipeline | Very low | High | Ultra-low-latency, dedicated hardware |
| Thread-per-venue | Medium | Low | Moderate latency requirements |

Default choice: Event-driven single-threaded for simplicity. Move to lock-free pipeline only when measured latency demands it.
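The lock-free pipeline row above usually means stages connected by single-producer/single-consumer ring buffers. A sketch of the index arithmetic only -- Python is used purely for illustration; a real implementation would be C or Rust with atomic head/tail counters and cache-line padding:

```python
class SPSCRing:
    """Single-producer single-consumer ring buffer.

    With one producer thread and one consumer thread, `head` is advanced
    only by the consumer and `tail` only by the producer, so no lock is
    needed. Capacity must be a power of two so masking replaces modulo.
    """

    def __init__(self, capacity=1024):
        assert capacity & (capacity - 1) == 0, "capacity must be a power of two"
        self.buf = [None] * capacity
        self.mask = capacity - 1
        self.head = 0   # next slot to read (consumer-owned)
        self.tail = 0   # next slot to write (producer-owned)

    def push(self, item):
        if self.tail - self.head > self.mask:    # full
            return False
        self.buf[self.tail & self.mask] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:               # empty
            return None
        item = self.buf[self.head & self.mask]
        self.head += 1
        return item
```

A full buffer returns `False` rather than blocking: on the hot path the producer decides whether to drop or spin, but it never takes a lock.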

Smart Order Routing

  • Venue selection -- Route to venue with best price after fees. Factor in historical fill rates and latency.
  • Order splitting -- Split large orders across venues to minimize market impact. Use TWAP/VWAP for size.
  • Retry logic -- Rejected orders retry on alternate venues. Never retry without checking current position and risk limits.
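Venue selection ("best price after fees, factoring fill rates") can be sketched as a score per venue. The quote fields and the fill-probability penalty below are illustrative heuristics, not a true expected-cost model:

```python
def best_venue(quotes, side="buy"):
    """Pick the venue with the best fee- and fill-adjusted score.

    quotes: {venue: {"price": float, "fee_bps": float, "fill_prob": float}}
    For a buy, score = price * (1 + fee) / fill_prob, so a venue with a
    slightly worse quote but near-certain fills can still win. This is a
    ranking heuristic, not an exact expected cost.
    """
    def score(q):
        fee = q["fee_bps"] / 10_000
        if side == "buy":
            return q["price"] * (1 + fee) / q["fill_prob"]
        return q["price"] * (1 - fee) * q["fill_prob"]

    pick = min if side == "buy" else max
    return pick(quotes, key=lambda v: score(quotes[v]))
```

In the example below, venue B quotes a zero fee but fills only half the time, so the fee-paying, reliable venue A wins on score.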

Risk Management as Design Constraint

Risk checks are on the critical path. They must be fast AND correct.

| Risk Check | Enforcement | If Bypassed |
|---|---|---|
| Position limits | Per-instrument and portfolio-wide | Unbounded loss exposure |
| Notional limits | Maximum value per order and per time window | Single trade blows up the account |
| Loss limits | Daily, hourly, per-strategy drawdown limits | Bleeding capital on a broken strategy |
| Rate limits | Maximum orders per second per venue | Exchange ban, API revocation |
| Kill switch | Hardware or software emergency stop | No way to stop a runaway system |

Kill switch is non-negotiable. It must work independently of the trading system. Hardware kill switch preferred. Test it weekly.
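The table's checks compose into one pre-trade gate on the hot path. A sketch, with illustrative limits and order fields; limits are validated once at startup so the per-order check is a handful of comparisons:

```python
import time

class RiskGate:
    """Pre-trade gate: an order passes every check or is rejected.

    A tripped kill switch rejects everything, regardless of other limits.
    Field names and limits are illustrative.
    """

    def __init__(self, max_position, max_notional, max_daily_loss,
                 max_orders_per_sec):
        self.max_position = max_position
        self.max_notional = max_notional
        self.max_daily_loss = max_daily_loss
        self.max_orders_per_sec = max_orders_per_sec
        self.position = 0.0      # updated by fills, elsewhere
        self.daily_pnl = 0.0     # updated by post-trade, elsewhere
        self.killed = False
        self._window = []        # timestamps of recently accepted orders

    def allows(self, qty, price, now=None):
        now = time.monotonic() if now is None else now
        self._window = [t for t in self._window if now - t < 1.0]
        if self.killed:
            return False
        if abs(self.position + qty) > self.max_position:
            return False                      # position limit
        if abs(qty * price) > self.max_notional:
            return False                      # notional limit
        if self.daily_pnl <= -self.max_daily_loss:
            return False                      # loss limit
        if len(self._window) >= self.max_orders_per_sec:
            return False                      # rate limit
        self._window.append(now)
        return True
```

The kill switch here is only the software half; as stated above, an independent hardware path must exist as well.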

Co-location & Infrastructure Decisions

  • Co-locate when latency is the primary competitive advantage
  • Central hub with low-latency links for cross-venue strategies
  • Own node for DEX strategies; connect to block builders / private mempools
  • Cloud in nearest region for moderate latency requirements
  • Hardware -- FPGA for nanosecond-critical feed handling; GPU for parallel signal computation; commodity hardware for everything else
  • Network -- Dedicated NICs, kernel bypass (DPDK), jumbo frames. Measure and minimize jitter, not just average latency.

Cross-venue / Cross-chain Arbitrage Patterns

| Pattern | Execution | Risk | Latency |
|---|---|---|---|
| CEX-CEX | Simultaneous limit orders | Leg risk (partial fill) | Microseconds |
| CEX-DEX | CEX order + DEX swap | Leg risk + MEV extraction | Milliseconds to seconds |
| DEX-DEX (same chain) | Atomic via flash loan or multicall | No leg risk if atomic | Block time |
| DEX-DEX (cross-chain) | Bridge or intent-based | Bridge risk + timing risk | Minutes |
| Atomic (flash loan) | Borrow, swap, repay in one tx | Reverts if unprofitable | Block time |

Atomic execution eliminates leg risk but limits you to single-chain, single-block opportunities. Non-atomic execution accesses more opportunities but requires hedging and position management.
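Whether a cross-venue spread survives both fee legs is simple arithmetic; the quotes, fees, and fixed `gas_cost` term below are illustrative, and leg risk and slippage are deliberately ignored:

```python
def arb_profit(buy_px, sell_px, qty, buy_fee_bps, sell_fee_bps, gas_cost=0.0):
    """Net profit of buying qty on one venue and selling it on another.

    Leg risk, slippage, and latency are ignored here; for the non-atomic
    patterns above they dominate and must be modeled separately.
    """
    cost = buy_px * qty * (1 + buy_fee_bps / 10_000)
    proceeds = sell_px * qty * (1 - sell_fee_bps / 10_000)
    return proceeds - cost - gas_cost
```

A 50 bps raw spread with 10 bps fees on each leg still prints; a 5 bps spread does not. Fees and fixed costs set the minimum spread a strategy should even evaluate.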

Anti-Patterns

| Anti-Pattern | Why It Fails | What To Do Instead |
|---|---|---|
| GC in hot path | Stop-the-world pauses cause missed opportunities | Use GC-free languages (C, Rust) or pre-allocate in Java/Go |
| Unnecessary serialization | JSON/protobuf encoding adds microseconds per message | Use shared memory, zero-copy, or fixed-size binary formats |
| Blocking I/O in critical path | Thread blocks waiting on the network; latency spikes | Non-blocking I/O, io_uring, or dedicated I/O threads |
| No kill switch | Runaway system trades until the account is empty | Independent kill switch; test weekly |
| Untested failover | Primary fails; backup has never been tested | Regular failover drills; automated health checks |
| Logging on hot path | Disk I/O or lock contention in the critical path | Async logging with a ring buffer; sample in the hot path |
| Backtesting without slippage | Strategy looks profitable but fails with real market impact | Model slippage, fees, and latency in backtests |
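For the last anti-pattern, a backtest fill model that at least charges fees and a linear market impact is the minimum bar; the coefficients below are illustrative:

```python
def backtest_fill(mid, qty, side, fee_bps=10, impact_bps_per_unit=0.5):
    """Fill price with fees and a linear market-impact penalty.

    Real impact is venue- and depth-dependent; linear-in-size is simply
    the cheapest model that stops a backtest from assuming free liquidity.
    Buys fill above mid, sells below.
    """
    slip = mid * (impact_bps_per_unit * qty) / 10_000
    fee = mid * fee_bps / 10_000
    return mid + slip + fee if side == "buy" else mid - slip - fee
```

A strategy whose edge disappears under even this crude model was never profitable; one that survives still needs latency and queue-position modeling before going live.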