MD Section Splitter
Overview
Split normalized Markdown into sections at H2 headings, preserving code fences, and write an index for downstream processing.
Quick Start
```bash
# list sources and their last normalized path
skills/md-section-splitter/scripts/md_section_splitter.rb list

# split all sources in state.json
skills/md-section-splitter/scripts/md_section_splitter.rb split --all

# split a single source by id
skills/md-section-splitter/scripts/md_section_splitter.rb split --id best-practices
```
Inputs
- `data/doc-fetcher/state.json` (last normalized path per source)
- Normalized Markdown under `data/doc-fetcher/normalized/<id>/`
- Subcommands: `list`, `split`
- Optional flags for `split`: `--all`, `--id`, `--force`, `--dry-run`
Outputs
- Sections: `data/doc-fetcher/sections/<id>/<snapshot_sha>/`
- Index: `data/doc-fetcher/sections/<id>/<snapshot_sha>/index.json`
- State updates: `data/doc-fetcher/state.json`
Workflow
- Ensure normalization has run (md-normalizer).
- Run the splitter with `split --all` or `split --id`.
- Confirm section files and `index.json` exist.
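The last workflow step can be scripted. The following is a minimal sketch, not part of the splitter itself: `check_section_output` is a hypothetical helper, and it assumes only what the Outputs list states, namely that a section directory contains an `index.json` plus one or more section files. The naming of section files and the schema of `index.json` are not documented here, so the check treats any entry other than `index.json` as a section file and only verifies that the index parses as JSON.

```ruby
require "json"

# Hypothetical helper (not provided by md_section_splitter.rb).
# Returns a list of human-readable problems for one output directory,
# e.g. data/doc-fetcher/sections/<id>/<snapshot_sha>/.
# An empty list means the directory looks complete.
def check_section_output(dir)
  return ["directory does not exist"] unless Dir.exist?(dir)

  problems = []
  index_path = File.join(dir, "index.json")

  if File.file?(index_path)
    begin
      # Only checks that the index is well-formed JSON; its schema is
      # not assumed here.
      JSON.parse(File.read(index_path))
    rescue JSON::ParserError => e
      problems << "index.json is not valid JSON: #{e.message}"
    end
  else
    problems << "missing index.json"
  end

  # Assumption: every entry other than index.json is a section file.
  section_files = Dir.children(dir).reject { |name| name == "index.json" }
  problems << "no section files found" if section_files.empty?

  problems
end
```

Calling `check_section_output` with the section directory for a given source id and snapshot returns an empty array when both the index and at least one section file are present.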
Options
- `list`: Print sources and last normalized path (from `state.json`).
- `split --force`: Overwrite existing section output.
- `split --dry-run`: Do not write files.
Notes
- H2 is treated as a section boundary; code fences are respected.
- A wrapper exists at `scripts/md_section_splitter.rb`.
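
The first note can be illustrated with a short sketch. This is not the code in `md_section_splitter.rb`; `split_h2_sections` and `FENCE` are hypothetical names, and the sketch assumes that "respected" means a line starting with `## ` inside a fenced code block does not start a new section. Content before the first H2 is kept as a section with a `nil` title, which is also an assumption.

```ruby
# Matches an opening or closing code fence: up to three spaces of
# indentation, then at least three backticks or tildes.
FENCE = /\A {0,3}(`{3,}|~{3,})/

# Hypothetical sketch: split Markdown text into sections at H2 headings,
# ignoring "## " lines that appear inside fenced code blocks.
# Returns an array of { title:, body: } hashes; body includes the heading.
def split_h2_sections(text)
  sections = []
  current = { title: nil, lines: [] } # content before the first H2
  fence = nil                         # opening fence marker while inside a block

  text.each_line do |line|
    if fence
      # A fence closes only on a bare marker of the same character that is
      # at least as long as the opening marker.
      m = line.match(FENCE)
      if m && m[1][0] == fence[0] && m[1].length >= fence.length && line.strip == m[1]
        fence = nil
      end
    elsif (m = line.match(FENCE))
      fence = m[1]
    elsif line.start_with?("## ")
      sections << current unless current[:lines].empty?
      current = { title: line.sub(/\A##\s+/, "").strip, lines: [] }
    end
    current[:lines] << line
  end

  sections << current unless current[:lines].empty?
  sections.map { |s| { title: s[:title], body: s[:lines].join } }
end
```

Tracking the opening fence marker, rather than a simple in-or-out flag, keeps a shorter or different fence nested inside a longer one from ending the block early.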