SKILL.md — Processing OMR

Capability

  • Convert detector outputs into MusicXML
  • Heuristics for associating stems, beams, and accidentals with noteheads
  • Staff-line interpolation & page dewarp hints
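
As an illustration of staff-line interpolation, the sketch below fills in missing lines of a five-line staff from the ones that were detected. It assumes evenly spaced, roughly horizontal lines and uses an ordinary least-squares fit. The function name `interpolate_staff_lines` and its input shape are hypothetical, not part of this skill's tools.

```python
def interpolate_staff_lines(known):
    """Fill in missing staff lines.

    known: dict mapping line index (0 = top, 4 = bottom) to the detected
           y coordinate in pixels. At least two lines are required.
    Returns a list of five y values; detected lines are kept as-is and
    missing ones are predicted from a least-squares fit y = top + spacing * i.
    """
    if len(known) < 2:
        raise ValueError("need at least two detected staff lines")
    n = len(known)
    mean_i = sum(known) / n              # mean of the line indices
    mean_y = sum(known.values()) / n     # mean of the detected y values
    var_i = sum((i - mean_i) ** 2 for i in known)
    cov = sum((i - mean_i) * (y - mean_y) for i, y in known.items())
    spacing = cov / var_i                # estimated distance between lines
    top = mean_y - spacing * mean_i      # estimated y of line 0
    return [known.get(i, top + spacing * i) for i in range(5)]
```

A page with noticeable warp would likely need a per-column fit rather than a single spacing value; this sketch only covers the flat case.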

Inputs

  • detections.json: detector outputs, one record per detection, in a YOLO-like format (x, y, w, h, cls, score)
  • image_path: page image used for detections
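
A minimal sketch of reading such a file follows. The exact JSON layout is not specified here, so this assumes a top-level array of six-element arrays in the order (x, y, w, h, cls, score), with (x, y) taken as the box center. The `Detection` type, `parse_detections`, and the `min_score` parameter are hypothetical names for illustration.

```python
import json
from typing import List, NamedTuple


class Detection(NamedTuple):
    x: float      # box center x (assumed)
    y: float      # box center y (assumed)
    w: float      # box width
    h: float      # box height
    cls: int      # class id from the detector
    score: float  # confidence


def parse_detections(text: str, min_score: float = 0.0) -> List[Detection]:
    """Parse a JSON array of [x, y, w, h, cls, score] rows.

    Rows below min_score are dropped. The layout is an assumption;
    adapt the unpacking if the real file uses objects or another order.
    """
    detections = []
    for row in json.loads(text):
        x, y, w, h, cls, score = row
        if score >= min_score:
            detections.append(
                Detection(float(x), float(y), float(w), float(h), int(cls), float(score))
            )
    return detections
```

To read from disk, the file contents can be passed in directly, for example `parse_detections(open("detections.json").read(), min_score=0.5)`.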

Outputs

  • out.musicxml: minimal, valid MusicXML (single-part baseline)
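
For orientation, here is one way a single-part baseline could be written with Python's standard `xml.etree.ElementTree`. It is a sketch under simplifying assumptions: everything goes into one measure, the key is C major, the time signature is 4/4, and the clef is treble. The function `build_musicxml` and its tuple format are hypothetical, and the output has not been checked against the MusicXML schema.

```python
import xml.etree.ElementTree as ET


def build_musicxml(notes, divisions=1):
    """Build a one-part, one-measure MusicXML string.

    notes: iterable of (step, octave, duration, note_type) tuples,
           e.g. ("C", 4, 1, "quarter"); duration is in divisions.
    """
    root = ET.Element("score-partwise", version="3.1")
    part_list = ET.SubElement(root, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Music"

    part = ET.SubElement(root, "part", id="P1")
    measure = ET.SubElement(part, "measure", number="1")

    # Fixed attributes for the sketch: C major, 4/4, treble clef.
    attrs = ET.SubElement(measure, "attributes")
    ET.SubElement(attrs, "divisions").text = str(divisions)
    key = ET.SubElement(attrs, "key")
    ET.SubElement(key, "fifths").text = "0"
    time = ET.SubElement(attrs, "time")
    ET.SubElement(time, "beats").text = "4"
    ET.SubElement(time, "beat-type").text = "4"
    clef = ET.SubElement(attrs, "clef")
    ET.SubElement(clef, "sign").text = "G"
    ET.SubElement(clef, "line").text = "2"

    for step, octave, duration, note_type in notes:
        note = ET.SubElement(measure, "note")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(duration)
        ET.SubElement(note, "type").text = note_type

    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

A real emitter would also split notes into measures according to the inferred time signature and add rests, accidentals, and beams.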

Agent steps (high level)

  1. Load detections and group them by staff.
  2. Assemble notes from their component symbols (noteheads, stems, beams, flags).
  3. Resolve durations and pitches; infer the key signature and time signature when possible.
  4. Emit MusicXML with part/measure/note structure.
  5. Save intermediate debug overlays as PNGs if requested.
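
Two pieces of the steps above, grouping by staff and resolving pitch, can be sketched as follows. This assumes a treble clef (bottom line is E4), staff lines given as five y values from top to bottom with y increasing downward, and no accidentals or key signature applied. The helper names `nearest_staff` and `pitch_from_y` are hypothetical.

```python
STEP_NAMES = "CDEFGAB"
E4_DIATONIC = 4 * 7 + 2  # diatonic index of E4 (C0 = 0)


def nearest_staff(y, staves):
    """Return the index of the staff whose vertical center is closest to y.

    staves: list of staffs, each a list of five line y values (top to bottom).
    """
    def distance(i):
        center = (staves[i][0] + staves[i][4]) / 2
        return abs(y - center)
    return min(range(len(staves)), key=distance)


def pitch_from_y(y, staff_lines):
    """Map a notehead center y to (step, octave), assuming a treble clef.

    Each half line-spacing upward from the bottom line is one diatonic step.
    """
    spacing = (staff_lines[4] - staff_lines[0]) / 4
    steps_above_bottom = round((staff_lines[4] - y) / (spacing / 2))
    n = E4_DIATONIC + steps_above_bottom
    return STEP_NAMES[n % 7], n // 7
```

For example, with a staff at y = 100, 110, 120, 130, 140, a notehead centered on the bottom line maps to E4 and one a full line spacing below it (the first ledger line) maps to C4. Other clefs would only change the reference pitch of the bottom line.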

Note for Claude Code: this skill may call Python scripts in /tools via MCP Terminal.