AgentSkillsCN

ruvector-burst-scaling

RuVector 的自适应突发扩容功能,可通过自动调配与背压机制应对 10–50 倍的流量激增。适用于构建需灵活应对突发负载增长的系统、为向量搜索实现自动伸缩,或在服务中加入熔断机制与背压控制。

SKILL.md
--- frontmatter
name: ruvector-burst-scaling
description: "Adaptive burst scaling for RuVector that handles 10-50x traffic spikes with auto-provisioning and backpressure. Use when building systems that need to absorb sudden load increases, implementing auto-scaling for vector search, or adding circuit breakers and backpressure to services."

@ruvector/burst-scaling

Adaptive burst scaling system for RuVector deployments. Automatically handles 10-50x traffic spikes through request queuing, worker pool scaling, circuit breakers, and backpressure mechanisms.

Quick Reference

TaskCode
Installnpx @ruvector/burst-scaling@latest
Create scalernew BurstScaler(config)
Wrap handlerscaler.wrap(handler)
Get metricsscaler.metrics()
Set limitsscaler.setLimits(config)
Circuit breakernew CircuitBreaker(config)

Installation

bash
npx @ruvector/burst-scaling@latest

Quick Start

typescript
import {
  BurstScaler,
  CircuitBreaker,
  BackpressureQueue,
} from '@ruvector/burst-scaling';

// Wrap any async handler with burst scaling
const scaler = new BurstScaler({
  baseWorkers: 4,
  maxWorkers: 64,
  scaleUpThreshold: 0.8,    // Scale at 80% utilization
  scaleDownThreshold: 0.2,  // Scale down at 20%
  maxQueueSize: 10_000,
  targetLatencyMs: 50,
});

// Wrap a search handler
const scaledSearch = scaler.wrap(async (query: Float32Array) => {
  return index.search(query, 10);
});

// Handle requests - automatically scales under load
const results = await scaledSearch(queryVector);

// Monitor scaling behavior
const metrics = scaler.metrics();
console.log(`Workers: ${metrics.activeWorkers}/${metrics.maxWorkers}`);
console.log(`Queue depth: ${metrics.queueDepth}`);
console.log(`P99 latency: ${metrics.p99LatencyMs}ms`);

Core API

BurstScaler

Main auto-scaling controller.

typescript
const scaler = new BurstScaler(config: ScalerConfig);

ScalerConfig:

ParameterTypeDefaultDescription
baseWorkersnumber4Minimum worker count
maxWorkersnumber64Maximum worker count
scaleUpThresholdnumber0.8Utilization trigger to scale up
scaleDownThresholdnumber0.2Utilization trigger to scale down
scaleUpStepnumber2Workers added per scale-up
scaleDownStepnumber1Workers removed per scale-down
cooldownMsnumber5000Min time between scaling events
maxQueueSizenumber10000Request queue capacity
targetLatencyMsnumber100Target P99 latency
drainTimeoutMsnumber30000Graceful shutdown timeout

scaler.wrap(handler)

Wrap an async function with automatic scaling.

typescript
scaler.wrap<T, R>(handler: (input: T) => Promise<R>): (input: T) => Promise<R>

scaler.wrapBatch(handler)

Wrap a batch handler that processes arrays.

typescript
scaler.wrapBatch<T, R>(handler: (inputs: T[]) => Promise<R[]>): (input: T) => Promise<R>

Automatically batches individual requests for efficiency.

scaler.metrics()

Get current scaling metrics.

typescript
scaler.metrics(): ScalerMetrics

ScalerMetrics:

FieldTypeDescription
activeWorkersnumberCurrent worker count
maxWorkersnumberConfigured max
queueDepthnumberPending requests
utilizationnumberWorker utilization (0-1)
p50LatencyMsnumberMedian latency
p99LatencyMsnumber99th percentile latency
requestsPerSecnumberCurrent throughput
totalRequestsnumberLifetime count
totalErrorsnumberLifetime errors
scaleEventsnumberTotal scale up/down events

scaler.setLimits(config)

Dynamically update scaling limits.

typescript
scaler.setLimits(config: Partial<ScalerConfig>): void

scaler.pause() / scaler.resume()

typescript
scaler.pause(): void   // Stop accepting new requests
scaler.resume(): void  // Resume accepting requests

scaler.shutdown()

Graceful shutdown: drain queue, stop workers.

typescript
await scaler.shutdown(): Promise<void>

CircuitBreaker

Prevent cascading failures under sustained load.

typescript
const breaker = new CircuitBreaker(config: BreakerConfig);

BreakerConfig:

ParameterTypeDefaultDescription
failureThresholdnumber5Failures before opening
resetTimeoutMsnumber30000Time before half-open
halfOpenRequestsnumber3Test requests in half-open
monitorWindowMsnumber60000Failure tracking window
typescript
const protected = breaker.wrap(riskyFn);
const result = await protected(input);

breaker.state: 'closed' | 'open' | 'half-open'
breaker.reset(): void
breaker.on('open', () => { /* alert */ })
breaker.on('close', () => { /* recovered */ })

BackpressureQueue

Request queue with configurable backpressure strategies.

typescript
const queue = new BackpressureQueue(config: QueueConfig);

QueueConfig:

ParameterTypeDefaultDescription
maxSizenumber10000Queue capacity
strategy'drop-newest' | 'drop-oldest' | 'reject''reject'Overflow strategy
priorityFn(req) => number-Priority function
timeoutMsnumber30000Request timeout
typescript
queue.enqueue(request): Promise<void>
queue.size(): number
queue.isFull(): boolean
queue.drain(): Promise<void>

Common Patterns

Auto-Scaling Vector Search

typescript
const scaler = new BurstScaler({ baseWorkers: 4, maxWorkers: 32 });
const search = scaler.wrap(async (q: Float32Array) => index.search(q, 10));

// Handle burst traffic
app.post('/search', async (req, res) => {
  const results = await search(req.body.query);
  res.json(results);
});

Circuit Breaker with Fallback

typescript
const breaker = new CircuitBreaker({ failureThreshold: 3 });
const protectedSearch = breaker.wrap(remoteSearch);

async function searchWithFallback(query) {
  try {
    return await protectedSearch(query);
  } catch (e) {
    return localCache.search(query); // Fallback
  }
}

Events

typescript
scaler.on('scale:up', (from, to) => console.log(`Scaled ${from} -> ${to} workers`));
scaler.on('scale:down', (from, to) => console.log(`Scaled ${from} -> ${to} workers`));
scaler.on('queue:full', () => console.log('Queue at capacity'));
scaler.on('request:timeout', (req) => console.log('Request timed out'));

References