MD Normalizer
Overview
Normalize snapshot content (Markdown or HTML) into stable GitHub-flavored Markdown and write metadata/state updates.
Quick Start
```bash
# list sources and their last snapshot path
skills/md-normalizer/scripts/md_normalizer.rb list

# normalize all sources in state.json
skills/md-normalizer/scripts/md_normalizer.rb normalize --all

# normalize a single source by id
skills/md-normalizer/scripts/md_normalizer.rb normalize --id best-practices
```
Inputs
- `data/doc-fetcher/state.json` (last snapshot path per source)
- Snapshot files under `data/doc-fetcher/snapshots/<id>/`
- Subcommands: `list`, `normalize`
- Optional flags for `normalize`: `--all`, `--id`, `--force`, `--dry-run`
Outputs
- Normalized Markdown: `data/doc-fetcher/normalized/<id>/<sha>.md`
- Normalization metadata: `data/doc-fetcher/normalized/<id>/<sha>.json`
- State updates: `data/doc-fetcher/state.json`
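The output layout above can be sketched as a small path helper. This is only an illustration: `normalized_paths` is a hypothetical name, and it assumes `<sha>` is a SHA-256 hex digest of the snapshot content, which the document does not state. The real script may hash something else or use a different algorithm.

```ruby
require "digest"

# Root directory taken from the Outputs list above.
NORMALIZED_ROOT = "data/doc-fetcher/normalized"

# Hypothetical helper: derive the Markdown and metadata paths for one source.
# ASSUMPTION: <sha> is the SHA-256 hex digest of the snapshot content.
def normalized_paths(id, snapshot_content, root: NORMALIZED_ROOT)
  sha = Digest::SHA256.hexdigest(snapshot_content)
  base = File.join(root, id, sha)
  { markdown: "#{base}.md", metadata: "#{base}.json" }
end

paths = normalized_paths("best-practices", "# Title\n")
puts paths[:markdown]
puts paths[:metadata]
```

Keying both files on the same digest keeps the Markdown and its metadata paired, and identical snapshot content always maps to the same output path.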
Workflow
- Ensure snapshots exist (run doc-fetcher first).
- Run normalization with `normalize --all` or `normalize --id`.
- Confirm normalized output and state updates.
Options
- `list`: Print sources and last snapshot path (from state.json).
- `normalize --force`: Overwrite existing normalized output.
- `normalize --dry-run`: Do not write files.
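One way the `normalize` flags could interact is sketched below. Both `parse_normalize_flags` and `write_action` are hypothetical names, not taken from the script, and the sketch assumes that without `--force` an existing normalized file is left untouched, which the option description implies but does not spell out.

```ruby
require "optparse"

# Hypothetical parser for the documented `normalize` flags.
def parse_normalize_flags(argv)
  opts = { all: false, id: nil, force: false, dry_run: false }
  OptionParser.new do |o|
    o.on("--all")     { opts[:all] = true }
    o.on("--id ID")   { |v| opts[:id] = v }
    o.on("--force")   { opts[:force] = true }
    o.on("--dry-run") { opts[:dry_run] = true }
  end.parse(argv)
  opts
end

# Hypothetical decision helper; the returned symbols are illustrative only.
# ASSUMPTION: existing output is skipped unless --force is given.
def write_action(output_exists:, force: false, dry_run: false)
  return :skip_dry_run if dry_run                    # --dry-run: never write
  return :skip_existing if output_exists && !force   # keep what is already there
  :write                                             # new output, or --force overwrite
end

flags = parse_normalize_flags(["--id", "best-practices", "--force"])
puts write_action(output_exists: true, force: flags[:force], dry_run: flags[:dry_run])
```

Checking `--dry-run` first means it wins over `--force`, so a dry run can never modify files regardless of the other flags.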
Notes
- Prefers `.md` snapshot passthrough; uses `pandoc` for HTML.
- A wrapper exists at `scripts/md_normalizer.rb`.
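The passthrough-versus-pandoc choice in the first note might look roughly like this. `pandoc_command` and `normalize_snapshot` are hypothetical names, and the exact pandoc flags the script uses are not documented here; `--from=html --to=gfm` is an assumption based on the stated GitHub-flavored Markdown target. Treating every non-`.md` snapshot as HTML is also an assumption.

```ruby
require "open3"

# Hypothetical: argv for converting an HTML snapshot to GitHub-flavored Markdown.
# ASSUMPTION: these are not necessarily the flags the real script passes.
def pandoc_command(path)
  ["pandoc", "--from=html", "--to=gfm", path]
end

# Hypothetical: return Markdown text for a snapshot file.
def normalize_snapshot(path)
  if File.extname(path).downcase == ".md"
    File.read(path)  # Markdown snapshot: pass through unchanged
  else
    # Anything else is treated as HTML; pandoc writes the result to stdout.
    out, err, status = Open3.capture3(*pandoc_command(path))
    raise "pandoc failed: #{err}" unless status.success?
    out
  end
end
```

Passing the command as an argument array rather than a single string avoids shell interpolation of the snapshot path.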