Marker Document Converter
Convert PDF, EPUB, PPTX, DOCX, XLSX, HTML, and image files to clean Markdown, JSON, or HTML using the marker-pdf tool with multimodal LLM enhancement.
Prerequisites
# Install marker-pdf with full document support
uv tool install "marker-pdf[full]"
Requires Python 3.10+ and PyTorch.
Basic Usage
marker_single "<file_path>" \
  --output_format markdown \
  --output_dir "<output_directory>" \
  --use_llm \
  --llm_service marker.services.claude.ClaudeService \
  --claude_model_name claude-haiku-4-5 \
  --claude_api_key "$ANTHROPIC_API_KEY" \
  --disable_image_extraction
Note: --disable_image_extraction generates plain text output. Remove this flag if images need to be preserved.
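Because the API key is passed as a command-line value, an unset variable would leave `--claude_api_key` without a usable value. A minimal pre-flight sketch for a POSIX-style shell follows; the helper name `require_env` is hypothetical and not part of marker.

```shell
# Hypothetical helper (not part of marker): fail early when a required
# environment variable is unset or empty.
require_env() {
  # Indirect lookup: read the variable whose name is given in $1.
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}
```

One way to use it is to call `require_env ANTHROPIC_API_KEY || exit 1` on the line before `marker_single`.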
Output Formats
| Format | Description | Use Case |
|---|---|---|
| markdown | Formatted text with tables, LaTeX equations ($$-fenced), code blocks, image links | General document conversion |
| html | Semantic HTML with <img>, <math>, <pre> tags | Web display |
| json | Hierarchical structure with block types, bounding boxes, section hierarchy | Programmatic processing |
| chunks | Flattened JSON optimized for RAG | Vector database ingestion |
CLI Options
Core Options
- --output_format: markdown (default), html, json, chunks
- --output_dir: Directory for output files
- --page_range: Specific pages, e.g., "0,5-10,20"
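To preview which pages a `--page_range` value selects, the string can be expanded before running the conversion. This is a sketch under two assumptions that the text above does not state: page numbers are zero-based, and a range such as `5-10` includes both endpoints. The function name `expand_page_range` is hypothetical.

```shell
# Hypothetical helper: expand a page-range string into individual page numbers.
# Assumes "a-b" covers a through b inclusive.
expand_page_range() {
  spec="$1"
  pages=""
  # Split on commas, then expand each "start-end" part into its members.
  for part in $(printf '%s' "$spec" | tr ',' ' '); do
    case "$part" in
      *-*)
        i="${part%-*}"
        end="${part#*-}"
        while [ "$i" -le "$end" ]; do
          pages="$pages $i"
          i=$((i + 1))
        done
        ;;
      *)
        pages="$pages $part"
        ;;
    esac
  done
  # Drop the leading space left by the first append.
  echo "${pages# }"
}
```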
LLM Enhancement
- --use_llm: Enable LLM for improved accuracy (tables, forms, math, handwriting)
- --llm_service: LLM service class (see LLM Services below)
- --block_correction_prompt: Custom prompt for output refinement
OCR & Processing
- --force_ocr: Force OCR on the entire document; also converts inline math to LaTeX
- --strip_existing_ocr: Remove existing OCR and re-process
- --redo_inline_math: Highest quality inline math conversion (use with --use_llm)
Image & Output Control
- --disable_image_extraction: Skip image extraction (plain text only)
- --paginate_output: Add page separators to output
- --extract_images: Enable image extraction (default: true)
Advanced
- --config_json: Load configuration from JSON file
- --debug: Enable diagnostic logging
- --force_layout_block: Force layout type, e.g., Table
- --converter_cls: Custom converter class
LLM Services
Claude (Recommended)
marker_single document.pdf \
  --use_llm \
  --llm_service marker.services.claude.ClaudeService \
  --claude_api_key "$ANTHROPIC_API_KEY" \
  --claude_model_name claude-haiku-4-5
OpenAI
marker_single document.pdf \
  --use_llm \
  --llm_service marker.services.openai.OpenAIService \
  --openai_api_key "$OPENAI_API_KEY" \
  --openai_model gpt-4o
Ollama (Local)
marker_single document.pdf \
  --use_llm \
  --llm_service marker.services.ollama.OllamaService \
  --ollama_base_url "http://localhost:11434" \
  --ollama_model llama3.2-vision
Google Gemini (Default if no service specified)
export GOOGLE_API_KEY="your-api-key"
marker_single document.pdf --use_llm
Examples
Convert PDF to Markdown (Plain Text)
marker_single "./docs/report.pdf" \
  --output_format markdown \
  --output_dir "./docs/" \
  --use_llm \
  --llm_service marker.services.claude.ClaudeService \
  --claude_model_name claude-haiku-4-5 \
  --claude_api_key "$ANTHROPIC_API_KEY" \
  --disable_image_extraction
Convert with Images Preserved
marker_single "./docs/report.pdf" \
  --output_format markdown \
  --output_dir "./docs/" \
  --use_llm \
  --llm_service marker.services.claude.ClaudeService \
  --claude_model_name claude-haiku-4-5 \
  --claude_api_key "$ANTHROPIC_API_KEY"
Extract Tables Only
marker_single "./docs/spreadsheet.pdf" \
  --use_llm \
  --force_layout_block Table \
  --converter_cls marker.converters.table.TableConverter \
  --output_format json
Batch Convert Multiple Files
marker /path/to/input/folder --workers 4
Using JSON Config File
cat > config.json << EOF
{
  "force_ocr": true,
  "use_llm": true,
  "output_format": "markdown",
  "disable_image_extraction": true,
  "strip_existing_ocr": true,
  "redo_inline_math": true
}
EOF
marker_single document.pdf --config_json config.json
Output Structure
Markdown Output
- Image links: Markdown image syntax
- Tables: Formatted as Markdown tables
- Equations: Fenced with $$...$$
- Code: Fenced with triple backticks and a language tag
- Headings: # for sections
JSON Output
{
  "pages": [
    {
      "id": "page_0",
      "polygon": [[x1,y1], [x2,y2], ...],
      "children": [
        {
          "id": "block_0",
          "block_type": "Text|Table|Image|...",
          "html": "<p>content</p>",
          "polygon": [...],
          "section_hierarchy": {...}
        }
      ]
    }
  ],
  "metadata": {
    "table_of_contents": [...],
    "page_stats": [...]
  }
}
Instructions
- Confirm the input file path exists
- Determine the output directory (default: same directory as the input file)
- Use the AskUserQuestion tool to ask for user preferences (ask both questions together):
  Question 1 - Image Extraction:
  - Header: "Images"
  - Question: "Do you need to extract the images in the document?"
  - Options:
    - "No (Recommended)": Extract text only and generate a plain Markdown file
    - "Yes": Extract and save images; the Markdown includes image links
  Question 2 - LLM Service:
  - Header: "LLM"
  - Question: "Which LLM should be used to recognize image and table content?"
  - Options:
    - "Claude Haiku (Recommended)": Fast and economical; requires ANTHROPIC_API_KEY
    - "Claude Sonnet": Higher quality; requires ANTHROPIC_API_KEY
    - "GPT-4o": OpenAI model; requires OPENAI_API_KEY
    - "Ollama (Local)": Runs locally; no API key required
- Based on the user's answers, construct the command:
  - If "No" for images: add --disable_image_extraction
  - Set LLM service parameters according to selection:
    - Claude Haiku: --llm_service marker.services.claude.ClaudeService --claude_model_name claude-haiku-4-5 --claude_api_key "$ANTHROPIC_API_KEY"
    - Claude Sonnet: --llm_service marker.services.claude.ClaudeService --claude_model_name claude-sonnet-4-20250514 --claude_api_key "$ANTHROPIC_API_KEY"
    - GPT-4o: --llm_service marker.services.openai.OpenAIService --openai_api_key "$OPENAI_API_KEY" --openai_model gpt-4o
    - Ollama: --llm_service marker.services.ollama.OllamaService --ollama_base_url "http://localhost:11434" --ollama_model llama3.2-vision
- Run the marker_single command with the chosen options
- Report the output file location and any extraction notes
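The steps above could be wrapped in a small shell function. The following is a sketch only: the function name `convert_with_marker`, the choice keywords (`yes`/`no`, `haiku`/`sonnet`/`gpt-4o`/`ollama`), and the `MARKER_BIN` override are hypothetical conveniences, while every flag is taken from the lists in this document.

```shell
# Hypothetical wrapper (not part of marker): checks the input, picks the
# output directory, and assembles the flags described in the steps above.
# Usage: convert_with_marker <file> <yes|no> <haiku|sonnet|gpt-4o|ollama> [output_dir]
convert_with_marker() {
  file="$1"; images="$2"; llm="$3"

  # Confirm the input file exists before doing anything else.
  if [ ! -f "$file" ]; then
    echo "error: input file not found: $file" >&2
    return 1
  fi

  # Default the output directory to the directory of the input file.
  out_dir="${4:-$(dirname "$file")}"

  # Reuse the positional parameters as the argument list for marker_single.
  set -- "$file" --output_format markdown --output_dir "$out_dir" --use_llm

  # Add the LLM service flags for the selected option.
  case "$llm" in
    haiku)
      set -- "$@" --llm_service marker.services.claude.ClaudeService \
        --claude_model_name claude-haiku-4-5 \
        --claude_api_key "${ANTHROPIC_API_KEY:-}" ;;
    sonnet)
      set -- "$@" --llm_service marker.services.claude.ClaudeService \
        --claude_model_name claude-sonnet-4-20250514 \
        --claude_api_key "${ANTHROPIC_API_KEY:-}" ;;
    gpt-4o)
      set -- "$@" --llm_service marker.services.openai.OpenAIService \
        --openai_api_key "${OPENAI_API_KEY:-}" \
        --openai_model gpt-4o ;;
    ollama)
      set -- "$@" --llm_service marker.services.ollama.OllamaService \
        --ollama_base_url "http://localhost:11434" \
        --ollama_model llama3.2-vision ;;
    *)
      echo "error: unknown LLM choice: $llm" >&2
      return 2 ;;
  esac

  # Text-only output when the user declined image extraction.
  if [ "$images" = "no" ]; then
    set -- "$@" --disable_image_extraction
  fi

  # MARKER_BIN is an assumed override; set it to "echo" for a dry run.
  "${MARKER_BIN:-marker_single}" "$@"
}
```

Setting `MARKER_BIN=echo` prints the assembled command instead of running it, which makes it easy to review the flags before spending API calls.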