---
name: benchmark-docs
description: Manages benchmark documentation across multiple platforms. Use when updating benchmark results, adding performance data, or documenting jq comparison benchmarks. Triggers on terms like "benchmark", "performance", "jq comparison", "benchmark results", "update benchmarks".
---

Benchmark Documentation Skill

This skill ensures proper handling of benchmark documentation across multiple platforms (ARM/Apple Silicon and x86_64/Intel/AMD).

For comprehensive benchmarking instructions, see docs/guides/benchmarking.md.

This skill focuses on documentation-specific rules and multi-platform considerations.

Critical Rules

NEVER replace platform-specific benchmarks with each other.

When adding new benchmark data:

  1. Keep both ARM and x86_64 benchmarks - Never delete one platform's data when adding another
  2. Add new platform data alongside existing data - Create separate sections or files per platform
  3. Name files with platform suffix - e.g., jq-comparison-m1.jsonl, jq-comparison-zen4.jsonl
  4. Update docs to show both - README and docs should reference benchmarks from all platforms
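
One way to apply rules 1-3 when archiving a run is to copy the results into a platform-suffixed file and refuse to overwrite anything that already exists. The sketch below is hypothetical: the `save_platform_results` function, its arguments, and the destination directory are illustrative assumptions, not part of the succinctly tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of succinctly): archive a results file under a
# platform-suffixed name without touching any other platform's data.
#
# Usage: save_platform_results <source.jsonl> <base-name> <platform> <dest-dir>
save_platform_results() {
  local src="$1" base="$2" platform="$3" dest_dir="$4"

  if [ -z "$platform" ]; then
    echo "error: platform suffix is required (e.g. m1, zen4)" >&2
    return 1
  fi

  local dest="${dest_dir}/${base}-${platform}.jsonl"
  if [ -e "$dest" ]; then
    # Never silently replace existing benchmark data.
    echo "error: ${dest} already exists; remove or rename it explicitly" >&2
    return 1
  fi

  # Print the final path on success so callers can reference it in docs.
  mkdir -p "$dest_dir" && cp "$src" "$dest" && echo "$dest"
}
```

Calling it once per platform (for example with `m1` and then `zen4`) leaves both files side by side, and a repeated call for the same platform fails instead of replacing the earlier data.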

Benchmark File Locations

For complete inventory, see docs/benchmarks/inventory.md.

Key locations:

  • data/bench/generated/ - Input files (git-ignored)
  • data/bench/results/ - Output (git-ignored)
  • docs/benchmarks/*.md - Documentation (tracked)
  • README.md - Summary (tracked)

Updating Benchmark Documentation

For complete documentation update workflow, see docs/guides/benchmarking.md.

Quick Workflow

  1. Build the benchmark runner: cargo build --release --features bench-runner
  2. Run benchmarks: ./target/release/succinctly bench run jq_bench
  3. Review output: data/bench/results/<timestamp>/jq_bench.md
  4. Update documentation:
    • docs/benchmarks/jq.md - Full results with all patterns/sizes
    • README.md - Summary highlights only
  5. Follow platform parity rules (see below)

README Table Requirements

Platform Parity

Every benchmark table in README.md must have data for BOTH platforms:

  • x86_64 (AMD Ryzen 9 7950X or equivalent)
  • ARM (Apple M1 Max or equivalent)

If a table only shows one platform, add the missing platform's data from docs/benchmarks/jq.md.

Pattern Names

Never abbreviate pattern names in tables. Use full names:

  • pathological not patholog.
  • comprehensive not compreh.
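
A quick way to catch the two abbreviations listed above is a `grep` over the file being edited. This is a minimal sketch: the `find_abbreviated_patterns` name is made up for illustration, and it only knows about `patholog.` and `compreh.`, so any other abbreviation would need to be added to the pattern.

```bash
# Hypothetical check: list lines that use the abbreviated pattern names
# called out above. Prints matches as "line-number:text"; returns 0 if any
# abbreviation is found and 1 if the file is clean (grep's usual convention).
find_abbreviated_patterns() {
  # The leading group keeps the match from firing in the middle of a longer
  # word; the trailing "\." requires the literal dot of the abbreviation.
  grep -nE '(^|[^[:alnum:]])(patholog|compreh)\.' "$1"
}
```

For example, `find_abbreviated_patterns README.md` prints nothing when every table uses the full names.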

Bold Formatting in Tables

When using bold for values in tables, ensure spaces are OUTSIDE the ** markers:

```markdown
<!-- CORRECT -->
|  **59.6ms** |

<!-- WRONG (won't render as bold) -->
| ** 59.6ms** |
```

See the markdown-tables skill for complete table formatting rules.
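
The bold-spacing rule can be checked mechanically. The following is a rough sketch rather than a full markdown parser: `find_bad_bold` is a hypothetical name, and it only inspects lines that start with a pipe, flagging cells whose content begins with `**` followed by whitespace or ends with whitespace followed by `**`.

```bash
# Hypothetical check: report table cells where a space sits inside the
# ** markers. Output format is "line-number: cell-content".
find_bad_bold() {
  awk -F'[|]' '
    /^[ \t]*\|/ {
      for (i = 1; i <= NF; i++) {
        cell = $i
        gsub(/^[ \t]+|[ \t]+$/, "", cell)   # trim padding around the cell
        if (cell ~ /^\*\*[ \t]/ || cell ~ /[ \t]\*\*$/) {
          printf "%d: %s\n", FNR, cell
        }
      }
    }' "$1"
}
```

Run against a file containing the two rows shown above, it reports only the `** 59.6ms**` cell.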

Benchmark Patterns

| Pattern | Description |
|---------|-------------|
| comprehensive | Mixed content (realistic) |
| users | User records (nested objects) |
| nested | Deep nesting (tests BP) |
| arrays | Large arrays (tests iteration) |
| strings | String-heavy (tests escapes) |
| unicode | Unicode strings |
| pathological | Worst-case |
| numbers | Number-heavy documents |
| literals | Mix of null, true, false |
| mixed | Heterogeneous nested structures |

Before Running Benchmarks

CRITICAL: Always compile before benchmarking:

```bash
cargo build --release --features bench-runner
```

See docs/guides/benchmarking.md for environment setup and troubleshooting.

Quick checks:

  • Build is up to date: cargo build --release --features bench-runner
  • CPU load is low: uptime
  • Test data exists: ls data/bench/generated/
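
The quick checks can be bundled into one function. This is a sketch under assumptions: `preflight` is a hypothetical helper, its default paths are the ones named in this document, and it can only tell whether the binary exists, not whether it is stale, so running `cargo build` remains the reliable way to make sure the build is current.

```bash
# Hypothetical pre-flight check; the default paths come from this document,
# the function itself does not.
# Usage: preflight [binary] [data-dir]
preflight() {
  local bin="${1:-./target/release/succinctly}"
  local data_dir="${2:-data/bench/generated}"
  local status=0

  if [ ! -x "$bin" ]; then
    echo "missing binary: $bin (run: cargo build --release --features bench-runner)" >&2
    status=1
  fi

  # An empty or missing directory means test data still has to be generated.
  if [ -z "$(ls -A "$data_dir" 2>/dev/null)" ]; then
    echo "no test data in: $data_dir" >&2
    status=1
  fi

  # Load is printed for a human to judge; no threshold is enforced here.
  command -v uptime >/dev/null 2>&1 && uptime >&2

  return "$status"
}
```

A typical use would be `preflight && ./target/release/succinctly bench run jq_bench`, so the benchmark only starts when both checks pass.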

Benchmark Commands Reference

For complete command reference, see docs/guides/benchmarking.md.

IMPORTANT: Always use the unified benchmark runner (succinctly bench), not dev bench.

Running Benchmarks

Always build first, then run benchmarks:

```bash
# Step 1: Build the benchmark runner (required before running benchmarks)
cargo build --release --features bench-runner

# Step 2: Generate test data (if not already present)
./target/release/succinctly json generate-suite
./target/release/succinctly yaml generate-suite

# Step 3: Run benchmarks using the unified runner
./target/release/succinctly bench run jq_bench      # JSON vs jq
./target/release/succinctly bench run yq_bench      # YAML vs yq
./target/release/succinctly bench run jq_comparison # Criterion JSON benchmarks
./target/release/succinctly bench run yq_comparison # Criterion YAML benchmarks
```

Listing Available Benchmarks

```bash
./target/release/succinctly bench list
```

Running Multiple Benchmarks

```bash
# Run all JSON benchmarks
./target/release/succinctly bench run jq_bench jq_comparison

# Run all YAML benchmarks
./target/release/succinctly bench run yq_bench yq_comparison yaml_bench
```

Memory Collection

Memory is collected by default for CLI benchmarks. Use --no-memory to skip.

Benchmark Types and Memory Support

| Type | Memory Collected | How | Examples |
|------|------------------|-----|----------|
| CLI | Yes (default) | /usr/bin/time peak RSS | jq_bench, yq_bench, dsv_cli |
| Criterion | No | In-process timing only | jq_comparison, yaml_bench |
| CrossParser | No | In-process timing only | json_parsers, yaml_parsers |

Memory Flag Usage

```bash
# Memory collected by default
./target/release/succinctly bench run yq_bench

# Skip memory collection (faster)
./target/release/succinctly bench run yq_bench --no-memory

# All CLI benchmarks support --no-memory
./target/release/succinctly bench run jq_bench --no-memory
./target/release/succinctly bench run dsv_cli --no-memory
```

Unified Runner Output

When running via succinctly bench run, CLI benchmark results are saved to the output directory:

```
data/bench/results/<timestamp>/
  metadata.json      # System info
  summary.json       # Run summary
  jq_bench.jsonl     # Raw results with peak_memory_bytes
  jq_bench.md        # Markdown with memory columns
  yq_bench.jsonl
  yq_bench.md
  stdout/
    jq_bench.txt     # Console output
    yq_bench.txt
```
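
To review the most recent run without copying the timestamp by hand, a small helper can pick the newest results directory. This is a sketch with one loud assumption: the timestamp format is not specified here, so the helper assumes directory names sort lexicographically in chronological order. The `latest_results_dir` name is hypothetical.

```bash
# Hypothetical helper: print the last results directory in sorted order.
# ASSUMPTION: <timestamp> directory names sort chronologically as strings.
# Usage: latest_results_dir [results-root]
latest_results_dir() {
  local root="${1:-data/bench/results}"
  local dir latest=""
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    latest="${dir%/}"           # globs expand in sorted order; keep the last
  done
  # Return non-zero when there are no result directories at all.
  [ -n "$latest" ] && echo "$latest"
}
```

With that in place, `cat "$(latest_results_dir)/jq_bench.md"` shows the markdown report for the latest run.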

yq Benchmark Query Types

The yq benchmark supports multiple query types to exercise different execution paths:

| Query Type | Example | Execution Path | Description |
|------------|---------|----------------|-------------|
| identity | `.` | P9 streaming | Full document streaming output |
| first_element | `.[0]` | M2 streaming | Navigate to first array element |
| iteration | `.[]` | M2 streaming | Iterate over array elements |
| length | `length` | OwnedValue | Produces computed value (not cursor-streamable) |

Running yq Benchmarks

Always build first: cargo build --release --features bench-runner

```bash
# Run yq CLI benchmark (memory is collected by default)
./target/release/succinctly bench run yq_bench

# Run specific query types
./target/release/succinctly bench run yq_bench --queries identity
./target/release/succinctly bench run yq_bench --queries identity,first_element

# Focus on M2 streaming with the navigation pattern
./target/release/succinctly bench run yq_bench --patterns navigation --sizes 10mb,100mb

# Skip memory collection (faster, but no memory comparison)
./target/release/succinctly bench run yq_bench --no-memory
```

Query Type Aliases

Multiple aliases are accepted for each query type:

| Query Type | Aliases |
|------------|---------|
| identity | `identity`, `.` |
| first_element | `first_element`, `first`, `.[0]` |
| iteration | `iteration`, `iter`, `.[]` |
| length | `length` |
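
The alias table amounts to a many-to-one mapping onto four canonical names. The sketch below expresses that mapping as a shell `case` statement; it illustrates the table only and is not the runner's actual argument parser, and `normalize_query` is a hypothetical name.

```bash
# Hypothetical illustration of the alias table: map any accepted alias to
# its canonical query type, or fail for anything else.
normalize_query() {
  case "$1" in
    identity|.)                  echo identity ;;
    first_element|first|'.[0]')  echo first_element ;;
    iteration|iter|'.[]')        echo iteration ;;
    length)                      echo length ;;
    *) echo "unknown query type: $1" >&2; return 1 ;;
  esac
}
```

The bracketed aliases are quoted in the patterns so the shell treats `[0]` and `[]` literally rather than as glob character classes; the same quoting is advisable when passing them on the command line.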

Running bench-compare Benchmarks

For complete instructions, see docs/guides/benchmarking.md.

Quick steps:

  1. Generate data: cargo run --release --features cli -- json generate-suite
  2. Run: cd bench-compare && cargo bench --bench json_parsers
  3. Update documentation (see "Updating Benchmark Documentation" above)

Rust JSON Parser Comparison

The bench-compare/ subproject benchmarks succinctly against other Rust JSON parsers.

When to Use Each Parser

| Use Case | Best Choice | Why |
|----------|-------------|-----|
| Full document traversal | sonic-rs | Fastest parse + traverse (400+ MiB/s) |
| Selective field access (jq-style) | succinctly | Lazy evaluation skips unused data |
| Memory-constrained environments | succinctly | 17-45x less memory than DOM parsers |
| Standard DOM parsing | serde_json | Best ecosystem compatibility |
| SIMD-accelerated DOM | simd-json | Fast parsing, but highest memory use (about 20x input) |

Key Learnings

  1. sonic-rs is fastest for full document access - ~690 MiB/s parse, ~425 MiB/s parse+traverse on ARM
  2. succinctly trades speed for memory - 46% of input size vs 8-21x for DOM parsers
  3. succinctly wins on selective queries - jq-style queries are 1.2-6.3x faster than jq because unused data isn't parsed
  4. simd-json has highest memory overhead - 20x input size due to tape-based representation

Performance Characteristics (ARM, Apple M1 Max)

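| Parser | Parse Only | Parse+Traverse | Memory (100MB) |
|--------|------------|----------------|----------------|
| sonic-rs | 687 MiB/s | 425 MiB/s | 955 MB (11.9x) |
| succinctly | 534 MiB/s | 283 MiB/s | 37 MB (0.46x) |
| simd-json | 182 MiB/s | 227 MiB/s | 1654 MB (20.7x) |
| serde_json | 153 MiB/s | 139 MiB/s | 655 MB (8.2x) |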
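
The "17-45x less memory" figure quoted earlier can be reproduced from the memory column by dividing each DOM parser's footprint by succinctly's 37 MB. The helper below is a small arithmetic sketch; `memory_ratio` is a hypothetical name, and the inputs are the MB values from the table above.

```bash
# Print how many times larger the first value is than the second,
# rounded to one decimal place.
memory_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

memory_ratio 655 37    # serde_json vs succinctly: 17.7
memory_ratio 955 37    # sonic-rs   vs succinctly: 25.8
memory_ratio 1654 37   # simd-json  vs succinctly: 44.7
```

The smallest and largest ratios, 17.7 and 44.7, are the two ends of the 17-45x range.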