
Pause Detector

Detect pauses and silence in audio using local dynamic thresholds.

SKILL.md
--- frontmatter
category: Research
id: pause-detector
name: Pause Detector
description: Detect pauses and silence in audio using local dynamic thresholds.

Pause Detector

Detects pauses and low-energy segments by applying local dynamic thresholds to pre-computed energy data. Unlike global-threshold methods, it compares each second's energy with the average of its surrounding context, which avoids false positives when the speaker's volume varies.
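To illustrate the difference, here is a minimal sketch that contrasts a global threshold with a local one on toy data. The function names, the centred window, and the energy values are assumptions made for this illustration; they are not taken from detect_pauses.py.

```python
from statistics import mean

def low_energy_global(energies, ratio=0.5):
    """Indices whose energy is below ratio * mean of the whole recording."""
    threshold = ratio * mean(energies)
    return [i for i, e in enumerate(energies) if e < threshold]

def low_energy_local(energies, ratio=0.5, window=5):
    """Indices whose energy is below ratio * mean of a window centred on them.

    The centred window is an assumption of this sketch.
    """
    half = window // 2
    flagged = []
    for i, e in enumerate(energies):
        neighbours = energies[max(0, i - half): i + half + 1]
        if e < ratio * mean(neighbours):
            flagged.append(i)
    return flagged

# Toy per-second energies: a loud stretch, then a quieter one,
# each containing a single real pause (indices 4 and 12).
energies = [10, 10, 10, 10, 1, 10, 10, 10, 10, 3, 3, 3, 0.3, 3, 3, 3]

global_idx = low_energy_global(energies)
local_idx = low_energy_local(energies)
print("global:", global_idx)  # global: [4, 9, 10, 11, 12, 13, 14, 15]
print("local: ", local_idx)   # local:  [4, 12]
```

The global threshold flags the entire quiet stretch as a pause, while the local threshold flags only the two seconds that are quiet relative to their neighbours.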

Use Cases

  • Detecting natural pauses in lectures
  • Finding board-writing silences
  • Identifying breaks between sections

Usage

```bash
python3 /root/.claude/skills/pause-detector/scripts/detect_pauses.py \
    --energies /path/to/energies.json \
    --output /path/to/pauses.json
```

Parameters

  • --energies: Path to energy JSON file (from energy-calculator)
  • --output: Path to output JSON file
  • --start-time: Start analyzing from this second (default: 0)
  • --threshold-ratio: Fraction of the local average below which a second counts as low energy (default: 0.5)
  • --min-duration: Minimum pause duration in seconds (default: 2)
  • --window-size: Local average window size (default: 30)
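The flags and defaults above could be declared with argparse roughly as follows. This is a hypothetical sketch, not the actual parser in detect_pauses.py: the argument types and the choice to make --energies and --output required are assumptions.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        description="Detect pauses using local dynamic thresholds."
    )
    # Assumed to be required; the documentation lists no defaults for them.
    parser.add_argument("--energies", required=True, help="Path to energy JSON file")
    parser.add_argument("--output", required=True, help="Path to output JSON file")
    # Types below are assumptions; the defaults come from the parameter list.
    parser.add_argument("--start-time", type=int, default=0)
    parser.add_argument("--threshold-ratio", type=float, default=0.5)
    parser.add_argument("--min-duration", type=float, default=2)
    parser.add_argument("--window-size", type=int, default=30)
    return parser

# Parse the same flags used in the documented example invocation.
args = build_parser().parse_args(
    ["--energies", "energies.json", "--start-time", "221", "--output", "pauses.json"]
)
print(vars(args))
```

argparse converts dashes to underscores, so --threshold-ratio becomes args.threshold_ratio, which matches the key names shown in the output's parameters object.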

Output Format

```json
{
  "method": "local_dynamic_threshold",
  "segments": [
    {"start": 610, "end": 613, "duration": 3},
    {"start": 720, "end": 724, "duration": 4}
  ],
  "total_segments": 11,
  "total_duration_seconds": 28,
  "parameters": {
    "threshold_ratio": 0.5,
    "window_size": 30,
    "min_duration": 2
  }
}
```
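A downstream step might read this output as in the sketch below. The summarize helper is hypothetical, and the inline JSON contains only the two segments shown above. The check that duration equals end minus start is inferred from those two sample segments, not from a documented guarantee.

```python
import json

def summarize(result):
    """Return the count, total duration, and longest segment of a pause result."""
    segments = result["segments"]
    for seg in segments:
        # Inferred from the sample output: duration == end - start.
        if seg["duration"] != seg["end"] - seg["start"]:
            raise ValueError(f"inconsistent segment: {seg}")
    longest = max(segments, key=lambda s: s["duration"], default=None)
    return {
        "count": len(segments),
        "total_seconds": sum(s["duration"] for s in segments),
        "longest": longest,
    }

# Inline stand-in for the contents of pauses.json (two segments only).
raw = """
{
  "method": "local_dynamic_threshold",
  "segments": [
    {"start": 610, "end": 613, "duration": 3},
    {"start": 720, "end": 724, "duration": 4}
  ]
}
"""
summary = summarize(json.loads(raw))
print(summary)
```

To process a real file, replace the inline string with json.load on the path passed to --output.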

How It Works

  1. Load pre-computed energy data
  2. Calculate local average using sliding window
  3. Mark seconds where energy < local_avg × threshold_ratio
  4. Group consecutive low-energy seconds into segments
  5. Filter segments by minimum duration
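The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in detect_pauses.py: it assumes the energy data is a flat list with one value per second, that the window is centred on each second, that "end" is the first second after the pause, and that start_time only limits where flagging begins. The real script may differ on any of these points and may use numpy or scipy instead.

```python
def detect_pauses(energies, threshold_ratio=0.5, min_duration=2,
                  window_size=30, start_time=0):
    """Return pause segments as dicts with start, end, and duration (seconds)."""
    n = len(energies)
    half = window_size // 2

    # Prefix sums make each window average an O(1) lookup.
    prefix = [0.0]
    for e in energies:
        prefix.append(prefix[-1] + e)

    segments = []
    run_start = None
    # Iterate one step past the end so a trailing run is closed as well.
    for t in range(start_time, n + 1):
        low = False
        if t < n:
            lo, hi = max(0, t - half), min(n, t + half + 1)
            local_avg = (prefix[hi] - prefix[lo]) / (hi - lo)
            low = energies[t] < local_avg * threshold_ratio
        if low and run_start is None:
            run_start = t  # a low-energy run begins
        elif not low and run_start is not None:
            duration = t - run_start
            if duration >= min_duration:  # drop runs that are too short
                segments.append({"start": run_start, "end": t, "duration": duration})
            run_start = None
    return segments

# Toy data: a 3-second pause at 5-7 s and a 1-second dip at 12 s.
energies = [10] * 20
energies[5:8] = [1, 1, 1]
energies[12] = 1

pauses = detect_pauses(energies, window_size=10)
print(pauses)  # [{'start': 5, 'end': 8, 'duration': 3}]
```

The 1-second dip is flagged as low energy but removed by the minimum-duration filter, so only the 3-second pause is reported.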

Dependencies

  • Python 3.11+
  • numpy
  • scipy

Example

```bash
# Detect pauses after opening (start at 221s)
python3 /root/.claude/skills/pause-detector/scripts/detect_pauses.py \
    --energies energies.json \
    --start-time 221 \
    --output pauses.json

# Result: Found 11 pauses totaling 28 seconds
```

Parameter Tuning

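| Parameter | Lower Value | Higher Value |
|---|---|---|
| threshold_ratio | Stricter: fewer seconds fall below the threshold | Looser: more seconds are flagged as low energy |
| min_duration | Shorter pauses are kept | Only longer pauses are kept |
| window_size | Tighter local context | Broader context |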

Notes

  • Requires energy data from energy-calculator skill
  • Use --start-time to skip the opening once its end time has been detected
  • Local threshold adapts to varying speaker volume