
temporal-neural-solver

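An ultra-fast WASM neural inference engine with sub-microsecond latency for edge and browser deployments. Use it when running neural network inference in WebAssembly, deploying models to edge devices, making real-time predictions in the browser, or when latency-critical applications need minimal-overhead inference.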

SKILL.md
---
name: "temporal-neural-solver"
description: "Ultra-fast WASM neural inference engine with sub-microsecond latency for edge and browser deployments. Use when running neural network inference in WebAssembly, deploying models to edge devices, performing real-time prediction in browsers, or needing minimal-overhead inference for latency-critical applications."
---

temporal-neural-solver

Ultra-fast neural network inference engine compiled to WebAssembly, achieving sub-microsecond latency for edge, browser, and serverless deployments with minimal memory overhead.

Quick Reference

| Task | Code |
| --- | --- |
| Install | `npx temporal-neural-solver@latest` |
| Import | `import { TemporalSolver } from 'temporal-neural-solver';` |
| Create | `const solver = new TemporalSolver();` |
| Load model | `await solver.loadModel(modelPath);` |
| Infer | `const result = await solver.solve(problem);` |
| Benchmark | `const perf = await solver.benchmark();` |

Installation

Install: `npx temporal-neural-solver@latest`. See the Installation Guide for the full ecosystem.

Key API

TemporalSolver

The main WASM-accelerated neural inference engine.

```typescript
import { TemporalSolver } from 'temporal-neural-solver';

// Configure the WASM backend with four worker threads and int8 quantization.
const solver = new TemporalSolver({
  backend: 'wasm',
  threads: 4,
  quantization: 'int8',
});
```

Constructor Options:

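| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `backend` | `string` | `'wasm'` | Backend: `'wasm'`, `'napi'`, `'js'` |
| `threads` | `number` | `1` | Worker threads for WASM |
| `quantization` | `string` | `'none'` | Quantization: `'none'`, `'int8'`, `'int4'`, `'fp16'` |
| `cacheModels` | `boolean` | `true` | Cache loaded models |
| `maxMemoryMB` | `number` | `256` | Maximum memory usage in MB |
| `simd` | `boolean` | `true` | Enable WASM SIMD |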
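
To get a feel for how `quantization` interacts with `maxMemoryMB`, the sketch below estimates weight storage for a given parameter count. It is back-of-the-envelope arithmetic and not part of the package: `estimateWeightMB` and the bit widths are assumptions (in particular that `'none'` means 32-bit floats), and a real model also needs memory for activations, quantization scales, and metadata.

```typescript
// Hypothetical helper, not part of temporal-neural-solver.
type Quantization = 'none' | 'fp16' | 'int8' | 'int4';

// Assumed storage width per weight for each quantization mode.
const BITS_PER_WEIGHT: Record<Quantization, number> = {
  none: 32, // assumption: unquantized weights are 32-bit floats
  fp16: 16,
  int8: 8,
  int4: 4,
};

// Approximate weight storage in MiB for a model with `paramCount` weights.
function estimateWeightMB(paramCount: number, quantization: Quantization): number {
  const bytes = (paramCount * BITS_PER_WEIGHT[quantization]) / 8;
  return bytes / (1024 * 1024);
}

// Example: a 100M-parameter model checked against the default 256 MB budget.
const paramCount = 100_000_000;
for (const q of ['none', 'fp16', 'int8', 'int4'] as const) {
  const mb = estimateWeightMB(paramCount, q);
  console.log(`${q}: ~${mb.toFixed(1)} MB, fits in 256 MB: ${mb <= 256}`);
}
```

Under these assumptions the weights alone shrink by 2x, 4x, and 8x for `'fp16'`, `'int8'`, and `'int4'`, which is the main reason to quantize before raising `maxMemoryMB`.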

Methods:

| Method | Returns | Description |
| --- | --- | --- |
| `solve(problem)` | `Promise<SolveResult>` | Run neural inference |
| `solveBatch(problems)` | `Promise<SolveResult[]>` | Batch inference |
| `loadModel(path)` | `Promise<void>` | Load ONNX/GGUF model |
| `loadModelFromBuffer(buf)` | `Promise<void>` | Load model from buffer |
| `benchmark(opts?)` | `Promise<BenchmarkResult>` | Run performance benchmark |
| `getModelInfo()` | `ModelInfo` | Loaded model information |
| `warmup(iterations?)` | `Promise<void>` | Warm up inference pipeline |
| `dispose()` | `void` | Free WASM memory |
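
Because `dispose()` frees WASM memory explicitly, it helps to guarantee it runs even when inference throws. The following is a minimal sketch of that lifecycle. `SolverLike` and `withSolver` are hypothetical names introduced here; the interface lists only the methods from the table above and assumes the real `TemporalSolver` satisfies it structurally.

```typescript
// Minimal structural type covering only the methods used here.
// Assumption: the real TemporalSolver matches these signatures.
interface SolverLike<P, R> {
  loadModel(path: string): Promise<void>;
  warmup(iterations?: number): Promise<void>;
  solve(problem: P): Promise<R>;
  dispose(): void;
}

// Hypothetical helper: load, warm up, run `work`, and always dispose.
async function withSolver<P, R, T>(
  solver: SolverLike<P, R>,
  modelPath: string,
  work: (solver: SolverLike<P, R>) => Promise<T>,
  warmupIterations = 10,
): Promise<T> {
  try {
    await solver.loadModel(modelPath);
    await solver.warmup(warmupIterations);
    return await work(solver);
  } finally {
    // Runs on success and on error, so WASM memory is released either way.
    solver.dispose();
  }
}
```

A call might look like `await withSolver(new TemporalSolver(), './model.onnx', s => s.solve({ input }))`, keeping load, warmup, and cleanup in one place.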

InferenceSession

Low-level inference session for fine-grained control.

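```typescript
import { InferenceSession } from 'temporal-neural-solver';

// modelBuffer: model bytes loaded elsewhere (for example via fetch or fs).
const session = new InferenceSession({
  model: modelBuffer,
  executionProviders: ['wasm'],
});

// Feeds are keyed by the model's input names; inputTensor is prepared by the caller.
const output = await session.run({ input: inputTensor });
```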

Methods:

| Method | Returns | Description |
| --- | --- | --- |
| `run(feeds)` | `Promise<OutputMap>` | Run inference with named inputs |
| `getInputNames()` | `string[]` | Get model input names |
| `getOutputNames()` | `string[]` | Get model output names |
| `getMetadata()` | `ModelMetadata` | Get model metadata |
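
Since `run(feeds)` takes named inputs, a mismatch between the feed keys and `getInputNames()` is an easy mistake to make. The sketch below checks the names before running. `checkFeeds` is a hypothetical helper, not part of the package, and it compares names only, not tensor shapes or types.

```typescript
// Hypothetical helper, not part of temporal-neural-solver.
// Compares the keys of a feeds object with the model's declared input names.
function checkFeeds(
  inputNames: string[],
  feeds: Record<string, unknown>,
): { missing: string[]; unexpected: string[] } {
  const provided = Object.keys(feeds);
  return {
    // Declared by the model but absent from the feeds.
    missing: inputNames.filter((name) => provided.indexOf(name) === -1),
    // Present in the feeds but unknown to the model.
    unexpected: provided.filter((name) => inputNames.indexOf(name) === -1),
  };
}

// Usage sketch with an existing session:
//   const report = checkFeeds(session.getInputNames(), feeds);
//   if (report.missing.length || report.unexpected.length) {
//     throw new Error(`Bad feeds: ${JSON.stringify(report)}`);
//   }
//   const output = await session.run(feeds);
```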

Quantizer

Model quantization for size and speed optimization.

```typescript
import { Quantizer } from 'temporal-neural-solver';

// data: calibration samples (Float32Array[]); modelBuffer: the unquantized model bytes.
const quantizer = new Quantizer({ method: 'int8', calibrationData: data });
const quantized = await quantizer.quantize(modelBuffer);
```

Constructor Options:

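| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `method` | `string` | `'int8'` | Quantization: `'int8'`, `'int4'`, `'fp16'`, `'dynamic'` |
| `calibrationData` | `Float32Array[]` | `undefined` | Calibration data for static quantization |
| `perChannel` | `boolean` | `true` | Per-channel quantization |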
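
For readers new to quantization, here is what an `'int8'` scheme does to a single tensor. This is the textbook symmetric per-tensor variant, shown only as an illustration; it is not taken from the package, and `Quantizer` may well work differently (for instance per channel, as the `perChannel` default suggests). The function names are hypothetical.

```typescript
// Illustrative only: symmetric per-tensor int8 quantization.
// Not necessarily what Quantizer does internally.
function quantizeInt8(values: Float32Array): { q: Int8Array; scale: number } {
  let maxAbs = 0;
  for (let i = 0; i < values.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(values[i]));
  }
  // Map [-maxAbs, maxAbs] onto [-127, 127]; use scale 1 for all-zero input.
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const q = new Int8Array(values.length);
  for (let i = 0; i < values.length; i++) {
    q[i] = Math.max(-127, Math.min(127, Math.round(values[i] / scale)));
  }
  return { q, scale };
}

// Recover approximate float values from the quantized tensor.
function dequantizeInt8(q: Int8Array, scale: number): Float32Array {
  const out = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) {
    out[i] = q[i] * scale;
  }
  return out;
}
```

Each value is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step (`scale / 2`). Calibration data serves to pick a good range, and hence a good `scale`, for static quantization.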

Common Patterns

Edge Neural Inference

```typescript
import { TemporalSolver } from 'temporal-neural-solver';

const solver = new TemporalSolver({ backend: 'wasm', quantization: 'int8' });
await solver.loadModel('./model.onnx');
// Run a few untimed inferences so the first real call is not a cold start.
await solver.warmup(10);

// sensorData: the input reading prepared by the caller.
const result = await solver.solve({ input: sensorData });
console.log(`Prediction: ${result.output}, Latency: ${result.latencyUs}us`);
```

Browser Deployment

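```typescript
import { TemporalSolver } from 'temporal-neural-solver';

const solver = new TemporalSolver({ backend: 'wasm', simd: true });

// Fetch the model bytes and fail early on HTTP errors.
const response = await fetch('/models/classifier.onnx');
if (!response.ok) {
  throw new Error(`Model download failed: ${response.status}`);
}
const buffer = await response.arrayBuffer();
await solver.loadModelFromBuffer(new Uint8Array(buffer));

// imageData: input prepared elsewhere in the page.
const prediction = await solver.solve({ image: imageData });
```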

Batch Processing Pipeline

```typescript
import { TemporalSolver } from 'temporal-neural-solver';

const solver = new TemporalSolver({ threads: 4 });
await solver.loadModel('./model.onnx');

// Wrap each raw input in the problem shape expected by solve().
const results = await solver.solveBatch(
  inputs.map(input => ({ input }))
);

// Average the per-item latency, reported in microseconds.
const totalUs = results.reduce((sum, r) => sum + r.latencyUs, 0);
console.log(`Avg latency: ${totalUs / results.length}us`);
```
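
The average above hides tail latency and evaluates to `NaN` for an empty batch. As a possible refinement, the sketch below also reports median and 95th-percentile latency. `latencyStats` and the `Timed` interface are hypothetical; the only assumption about the results is the `latencyUs` field already used above.

```typescript
// Hypothetical helper; relies only on the `latencyUs` field of each result.
interface Timed {
  latencyUs: number;
}

function latencyStats(
  results: Timed[],
): { mean: number; p50: number; p95: number } | null {
  if (results.length === 0) return null; // avoid NaN from dividing by zero
  const sorted = results.map((r) => r.latencyUs).sort((a, b) => a - b);
  // Nearest-rank percentile: smallest sample with at least p% of samples at or below it.
  const percentile = (p: number): number =>
    sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  return { mean, p50: percentile(50), p95: percentile(95) };
}
```

With `const stats = latencyStats(results)`, a latency-critical pipeline can alert on `stats.p95` rather than the mean, since a few slow inferences barely move the average.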

RAN DDD Context

Bounded Context: RANO Optimization
