PDF.co Integration
This skill provides behavioral guidance for interacting with the PDF.co MCP server.
Tool Overview
| Tool | Purpose | When to Use |
|---|---|---|
| html_to_pdf | Convert HTML to PDF | Reports, quotes, invoices, certificates |
| pdf_to_text | Extract text from PDF | Document analysis, content extraction |
| merge_pdfs | Combine multiple PDFs | Consolidating documents |
| split_pdf | Extract pages from PDF | Breaking apart documents |
| add_watermark | Add watermark to PDF | Branding, draft marking |
html_to_pdf (Primary Tool)
Converts HTML content to a downloadable PDF document.
Parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
| html | Yes | - | HTML content (full document or fragment) |
| name | No | "document.pdf" | Output filename |
| margins | No | "10mm" | Page margins (CSS format) |
| paperSize | No | "Letter" | Paper size: Letter, A4, Legal, Tabloid |
| orientation | No | "Portrait" | Portrait or Landscape |
| header | No | - | HTML for page header |
| footer | No | - | HTML for page footer |
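
The defaults and allowed values in the table can be checked before the tool is called. The following is a minimal Python sketch, not part of the PDF.co MCP server: `build_html_to_pdf_args` is a hypothetical helper, and it assumes the tool accepts its arguments as a flat dictionary keyed by the parameter names listed above.

```python
from typing import Optional

# Defaults and allowed values copied from the parameter table.
DEFAULTS = {
    "name": "document.pdf",
    "margins": "10mm",
    "paperSize": "Letter",
    "orientation": "Portrait",
}
PAPER_SIZES = {"Letter", "A4", "Legal", "Tabloid"}
ORIENTATIONS = {"Portrait", "Landscape"}


def build_html_to_pdf_args(html: str, header: Optional[str] = None,
                           footer: Optional[str] = None, **overrides: str) -> dict:
    """Hypothetical helper: assemble and validate html_to_pdf arguments."""
    if not html.strip():
        raise ValueError("html is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Later entries win, so explicit overrides replace the defaults.
    args = {"html": html, **DEFAULTS, **overrides}
    if args["paperSize"] not in PAPER_SIZES:
        raise ValueError(f"unsupported paperSize: {args['paperSize']}")
    if args["orientation"] not in ORIENTATIONS:
        raise ValueError(f"unsupported orientation: {args['orientation']}")
    # header and footer have no default, so include them only when given.
    if header is not None:
        args["header"] = header
    if footer is not None:
        args["footer"] = footer
    return args


if __name__ == "__main__":
    print(build_html_to_pdf_args("<h1>Quote</h1>", paperSize="A4", name="quote.pdf"))
```

Validating locally keeps obvious mistakes, such as a misspelled paper size, from costing a round trip to the server.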
HTML Best Practices
CRITICAL: Keep all CSS inside the document, either in a `<style>` block or in inline `style` attributes. External stylesheets will not load.
Reliable HTML structure:
```html
<html>
<head>
  <style>
    body { font-family: Arial, sans-serif; padding: 20px; }
    h1 { color: #333; border-bottom: 2px solid #333; padding-bottom: 10px; }
    table { width: 100%; border-collapse: collapse; margin: 20px 0; }
    th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
    th { background-color: #f5f5f5; font-weight: bold; }
    .total { font-weight: bold; background-color: #f9f9f9; }
  </style>
</head>
<body>
  <h1>Document Title</h1>
  <!-- Content here -->
</body>
</html>
```
What Works Well
- Tables - Render reliably for invoices, quotes, data
- Basic CSS - Colors, fonts, borders, padding, margins
- Images - Use base64 data URLs or absolute URLs
- Web-safe fonts - Arial, Helvetica, Times New Roman, Georgia
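
Embedding an image as a base64 data URL means the converter has nothing to fetch. A minimal Python sketch of that step follows; `to_data_url` and `img_tag` are hypothetical helper names, and the caller is assumed to know the image's MIME type.

```python
import base64
from html import escape


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL (format defined in RFC 2397)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(image_bytes: bytes, alt: str, mime_type: str = "image/png") -> str:
    """Build an <img> tag whose source is embedded in the HTML itself."""
    src = to_data_url(image_bytes, mime_type)
    return f'<img src="{src}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Placeholder bytes; in practice, read them from the image file.
    print(img_tag(b"not a real image", alt="Company logo"))
```

Base64 inflates the payload by roughly a third, so this suits logos and small graphics better than large photos.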
What May Have Issues
- Flexbox/Grid - Complex layouts may not render as expected
- External resources - Fonts and images loaded from external URLs may fail; embed small images as base64 where possible
- JavaScript - Not executed
- CSS variables - May not be supported
Document Type Patterns
For Quotes/Invoices:
```html
<table>
  <thead>
    <tr><th>Item</th><th>Qty</th><th>Price</th><th>Total</th></tr>
  </thead>
  <tbody>
    <tr><td>Product A</td><td>2</td><td>$50</td><td>$100</td></tr>
  </tbody>
  <tfoot>
    <tr class="total"><td colspan="3">Total</td><td>$100</td></tr>
  </tfoot>
</table>
```
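
When line items come from data rather than hand-written markup, the same table can be generated programmatically. This is a Python sketch under stated assumptions: `render_invoice_table` is a hypothetical helper, prices are passed as strings so `Decimal` can parse them exactly, and amounts are printed with two decimal places (the static example above shows whole dollars).

```python
from decimal import Decimal
from html import escape
from typing import Iterable, Tuple


def render_invoice_table(items: Iterable[Tuple[str, int, str]]) -> str:
    """Render (description, quantity, unit_price) tuples as an invoice table."""
    rows = []
    grand_total = Decimal("0")
    for description, quantity, unit_price in items:
        price = Decimal(unit_price)  # Decimal avoids float rounding on money
        line_total = price * quantity
        grand_total += line_total
        rows.append(
            f"<tr><td>{escape(description)}</td><td>{quantity}</td>"
            f"<td>${price:.2f}</td><td>${line_total:.2f}</td></tr>"
        )
    return (
        "<table>"
        "<thead><tr><th>Item</th><th>Qty</th><th>Price</th><th>Total</th></tr></thead>"
        f"<tbody>{''.join(rows)}</tbody>"
        f'<tfoot><tr class="total"><td colspan="3">Total</td>'
        f"<td>${grand_total:.2f}</td></tr></tfoot>"
        "</table>"
    )


if __name__ == "__main__":
    print(render_invoice_table([("Product A", 2, "50")]))
```

Escaping the description guards against item names containing `&` or `<`, which would otherwise produce invalid markup.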
For Reports:
```html
<h1>Monthly Report</h1>
<h2>Summary</h2>
<p>Key findings...</p>
<h2>Data</h2>
<table>...</table>
<h2>Recommendations</h2>
<ul><li>Action item 1</li></ul>
```
For Certificates:
```html
<div style="text-align: center; padding: 40px;">
  <h1 style="font-size: 36px;">Certificate of Completion</h1>
  <p style="font-size: 24px; margin: 40px 0;">This certifies that</p>
  <p style="font-size: 32px; font-weight: bold;">John Doe</p>
  <p style="margin-top: 40px;">has successfully completed the program.</p>
</div>
```
pdf_to_text
Extracts text content from PDF documents.
Parameters
| Parameter | Required | Description |
|---|---|---|
| url | Yes | URL to the PDF file |
| pages | No | Page range: "1-3", "1,3,5", or "all" |
Best For
- Text-based PDFs (not scanned images)
- Extracting content for analysis
- Converting PDF content to editable text
Limitations
- Scanned documents need OCR (use `pdf_ocr` instead)
- Formatting is lost (tables become plain text)
- Complex layouts may have jumbled text order
merge_pdfs
Combines multiple PDF files into a single document.
Parameters
| Parameter | Required | Description |
|---|---|---|
| urls | Yes | Array of PDF URLs to merge |
| name | No | Output filename |
Usage Notes
- PDFs merge in the order provided
- All page sizes are preserved
- Bookmarks may not transfer
split_pdf
Extracts specific pages from a PDF.
Parameters
| Parameter | Required | Description |
|---|---|---|
| url | Yes | URL to the PDF file |
| pages | Yes | Pages to extract: "1-3", "1,3,5", "1-3,7-9" |
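
The `pages` formats shown above can be validated before calling the tool. The Python sketch below expands a specification into explicit page numbers; `parse_pages` is a hypothetical helper, and it assumes pages are numbered from 1, as the examples suggest but do not state.

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a page spec such as "1-3,7-9" into a list of page numbers.

    Assumes 1-based page numbers. Raises ValueError on malformed input.
    """
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            page = int(part)  # int("") raises ValueError for empty parts
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.append(page)
    return pages


if __name__ == "__main__":
    print(parse_pages("1-3,7-9"))
```

The sketch does not handle the "all" keyword that `pdf_to_text` accepts; a caller would check for that value before parsing.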
add_watermark
Adds text watermark to PDF pages.
Parameters
| Parameter | Required | Description |
|---|---|---|
| url | Yes | URL to the PDF file |
| text | Yes | Watermark text (e.g., "DRAFT", "CONFIDENTIAL") |
| pages | No | Which pages to watermark (default: all) |
| opacity | No | Value from 0 to 1; lower values are more transparent |
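
Because `opacity` only makes sense between 0 and 1, a range check before the call avoids a wasted request. This is a minimal Python sketch: `build_watermark_args` is a hypothetical helper that assumes a flat argument dictionary, and it leaves optional parameters out entirely so the server applies its own defaults.

```python
from typing import Optional


def build_watermark_args(url: str, text: str, pages: Optional[str] = None,
                         opacity: Optional[float] = None) -> dict:
    """Hypothetical helper: assemble and validate add_watermark arguments."""
    if not text.strip():
        raise ValueError("text is required and must not be empty")
    args = {"url": url, "text": text}
    if pages is not None:
        args["pages"] = pages
    if opacity is not None:
        if not 0 <= opacity <= 1:
            raise ValueError("opacity must be between 0 and 1")
        args["opacity"] = opacity
    return args


if __name__ == "__main__":
    print(build_watermark_args("https://example.com/contract.pdf", "DRAFT", opacity=0.3))
```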
Error Recovery
"HTML parsing failed"
- Check for unclosed tags
- Ensure valid HTML structure
- Remove problematic CSS
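
Unclosed tags can be found locally before resubmitting. The sketch below uses Python's standard `html.parser` module; `TagBalanceChecker` and `find_tag_problems` are hypothetical names, and the check is stricter than the HTML specification, which allows some end tags (such as `</p>` and `</li>`) to be omitted, so treat its findings as hints rather than definitive errors.

```python
from html.parser import HTMLParser
from typing import List

# Void elements never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Track open tags on a stack and record mismatches."""

    def __init__(self) -> None:
        super().__init__()
        self.open_tags: List[str] = []
        self.problems: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            self.problems.append(f"unexpected </{tag}>")
            return
        # Pop back to the matching opener; anything above it was never closed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.problems.append(f"unclosed <{top}>")


def find_tag_problems(html: str) -> List[str]:
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    # Tags still on the stack at end of input were never closed.
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]


if __name__ == "__main__":
    print(find_tag_problems("<div><p>Quarterly summary</div>"))
```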
"Timeout"
- HTML may be too complex
- Simplify styles and structure
- Split into multiple documents
"Invalid URL"
- For tools that take a URL (pdf_to_text, merge_pdfs, split_pdf, add_watermark): ensure the URL is accessible
- Use publicly accessible URLs or presigned URLs
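
A quick syntactic check catches local paths and malformed URLs before they reach the server. The Python sketch below uses the standard `urllib.parse` module; `looks_like_fetchable_url` is a hypothetical helper, and it only inspects the URL's shape. It cannot confirm that the file is actually reachable, and it will still accept hosts such as `localhost` that a remote server cannot see.

```python
from urllib.parse import urlparse


def looks_like_fetchable_url(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("https://example.com/report.pdf", "file:///tmp/report.pdf", "report.pdf"):
        print(candidate, looks_like_fetchable_url(candidate))
```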
Best Practices
- Start simple - Basic HTML first, add styling incrementally
- Prefer tables - Tables render more reliably than divs for data layouts
- Inline everything - CSS, small images as base64
- Set paper size - Match intended print format
- Use margins - Prevent content from touching edges