Benchmark Runner Workflow

Multi-step workflow for running performance benchmarks on ccswarm.

Overview

This skill guides you through running, analyzing, and comparing performance benchmarks for the ccswarm system.

Benchmark Types

1. Microbenchmarks

Fine-grained performance measurements for specific functions.

2. Integration Benchmarks

End-to-end performance for complete workflows.

3. Load Testing

System behavior under sustained load.

Running Benchmarks

1. Setup

bash

# Ensure criterion is available
cargo install cargo-criterion

# Build in release mode first
cargo build --release --workspace

2. Run All Benchmarks

bash

# Run with criterion
cargo criterion --workspace

# Or standard bench
cargo bench --workspace

3. Run Specific Benchmarks

bash

# By crate
cargo bench -p ccswarm

# By benchmark name
cargo bench --bench session_benchmark

# By filter
cargo bench -- "orchestrator"

Benchmark Configuration

Criterion Config

rust

// benches/bench.rs
use criterion::{criterion_group, criterion_main, Criterion};

fn benchmark_task_delegation(c: &mut Criterion) {
    c.bench_function("delegate_task", |b| {
        b.iter(|| {
            // Benchmark code
        })
    });
}

criterion_group!(benches, benchmark_task_delegation);
criterion_main!(benches);

Cargo.toml

toml

[[bench]]
name = "orchestrator_bench"
harness = false

Analysis

1. View Results

bash

# Results are in target/criterion/
open target/criterion/report/index.html

2. Compare with Baseline

bash

# Save baseline
cargo bench -- --save-baseline before

# Make changes, then compare
cargo bench -- --baseline before

3. Profiling

bash

# With flamegraph
cargo flamegraph --bench orchestrator_bench

# With perf (Linux)
perf record cargo bench --bench orchestrator_bench
perf report

Key Metrics

Session Management

•Session creation time
•Context compression ratio
•Message throughput

Orchestration

•Task delegation latency
•Agent assignment time
•Consensus round time

Memory

•Peak memory usage
•Memory per agent
•Context size

Benchmark Checklist

CI Integration

GitHub Actions

yaml

benchmark:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: dtolnay/rust-toolchain@stable
    - run: cargo bench -- --save-baseline ci
    - uses: actions/upload-artifact@v4
      with:
        name: benchmark-results
        path: target/criterion/

Interpreting Results

Good Performance

•< 1ms for task delegation
•< 100ms for full workflow
•< 10MB memory per agent

Warning Signs

•High variance (> 20%)
•Memory growth over time
•Degradation vs baseline

Optimization Workflow

•Run baseline benchmark
•Profile to identify hotspots
•Implement optimization
•Run comparison benchmark
•Document improvements