Markdown Converter
Convert files to useful Markdown using the installed markitdown and markit CLIs.
Route First
- Use `markitdown` first for `.docx`, `.pptx`, `.xlsx`, `.xls`, HTML, CSV, JSON, XML, images, audio, ZIP, YouTube, EPUB, and most non-PDF formats.
- Use `markitdown` first for email-like, letter-like, or mostly linear-prose PDFs.
- Use `markit -q` first for table-heavy, form-like, or multi-column PDFs where layout matters.
- Use `markitdown --use-plugins` for scanned or image-heavy PDFs only when the environment already has a working OpenAI-compatible vision client/model configured for MarkItDown OCR.
- Fall back to plain `markitdown` and say OCR is unavailable when that OCR configuration is missing.
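
The routing rules above can be sketched as a small shell helper. This is only an illustration: `choose_converter`, the layout hints (`prose`, `layout`, `scanned`), and the `OCR_READY` variable are hypothetical names introduced here, and neither CLI reads them. The caller is assumed to already know what kind of PDF it is handling.

```bash
# Hypothetical helper: print the converter command to try first.
# $1 = input file, $2 = PDF layout hint: prose (default) | layout | scanned.
# OCR_READY=1 is an assumed convention meaning "MarkItDown OCR is configured".
choose_converter() {
  local file="$1" hint="${2:-prose}"
  local ext="${file##*.}"
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"

  # Every non-PDF format goes to markitdown first.
  if [ "$ext" != "pdf" ]; then
    echo "markitdown"
    return
  fi

  case "$hint" in
    layout)
      echo "markit -q"
      ;;
    scanned)
      if [ "${OCR_READY:-0}" = "1" ]; then
        echo "markitdown --use-plugins"
      else
        # OCR is not configured: say so on stderr, fall back to plain markitdown.
        echo "OCR is unavailable; using plain markitdown" >&2
        echo "markitdown"
      fi
      ;;
    *)
      echo "markitdown"
      ;;
  esac
}
```

For example, `choose_converter invoice.pdf layout` prints `markit -q`, while `choose_converter notes.docx` prints `markitdown`.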
Retry Or Compare
- Do not run both tools by default.
- Run the other tool when the first output is high-value and suspect, or when the user explicitly asks to compare.
- Treat these as suspect: flattened tables, broken reading order, repeated headers or footers, near-empty output, clearly jumbled text, or giant `data:image` blocks.
- For DOCX, prefer `markitdown` when `markit` emits base64-heavy Markdown.
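
Two of the suspect signals above, near-empty output and giant `data:image` blocks, are easy to check mechanically. The sketch below is a rough heuristic, not part of either tool: `looks_suspect` and its default thresholds (200 bytes, 5000 characters) are assumptions chosen for illustration. Flattened tables, broken reading order, and jumbled text still need a human or model to read the output.

```bash
# Hypothetical check: exit status 0 means "suspect", 1 means "looks fine".
# $1 = Markdown file, $2 = minimum size in bytes, $3 = longest tolerated
# line containing an inline data:image payload. Both defaults are assumed.
looks_suspect() {
  local md="$1"
  local min_bytes="${2:-200}" max_line="${3:-5000}"
  local size
  size="$(wc -c < "$md" | tr -d ' ')"

  # Near-empty output.
  if [ "$size" -lt "$min_bytes" ]; then
    return 0
  fi

  # Giant data:image block: a very long line with an embedded base64 image.
  if awk -v max="$max_line" \
      'index($0, "data:image") && length($0) > max { found = 1; exit }
       END { exit !found }' "$md"; then
    return 0
  fi

  return 1
}
```

A caller could run `looks_suspect output.md` after the first conversion and try the other tool only when it returns 0.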
Commands
```bash
# Default DOCX / non-PDF path
markitdown input.docx > output.md

# Default prose-PDF path
markitdown input.pdf > output.md

# Layout-sensitive PDF path
markit -q input.pdf > output.md

# OCR path, only when OCR is configured
markitdown --use-plugins input.pdf > output.md

# Compare both on a PDF, then keep the better result
markitdown input.pdf > /tmp/markitdown.md
markit -q input.pdf > /tmp/markit.md
```
Output Rule
- Return the chosen Markdown, not two full outputs.
- If both tools were run, state which tool won and why in one short sentence.