PDF Processor Skill

Quick Reference

| Task | Method |
| --- | --- |
| Extract text | Run `scripts/extract_text.py` |
| Get form fields | Run `scripts/get_form_fields.py` |
| Fill form | Run `scripts/fill_form.py` |
| Convert to images | Run `scripts/pdf_to_images.py` |

To extract text from a PDF:

```bash
python scripts/extract_text.py input.pdf
```

The script writes the extracted text to stdout; redirect it to a file to save it. Large PDFs are processed page by page.
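When the extracted text is needed inside another Python program rather than on the terminal, the command above can be wrapped with `subprocess`. The following is a minimal sketch, not part of the skill itself: `build_extract_command` and `extract_text` are hypothetical helper names, and the sketch assumes the script emits UTF-8 and exits with a non-zero status on failure.

```python
import subprocess
import sys
from pathlib import Path


def build_extract_command(pdf_path, script="scripts/extract_text.py"):
    """Return the argv list for the extraction script (hypothetical helper)."""
    # sys.executable keeps the call on the same interpreter as the caller.
    return [sys.executable, str(script), str(Path(pdf_path))]


def extract_text(pdf_path, script="scripts/extract_text.py"):
    """Run the script and return whatever it wrote to stdout.

    Assumption: the script signals failure with a non-zero exit code,
    which check=True turns into subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_extract_command(pdf_path, script),
        capture_output=True,
        encoding="utf-8",  # assumed output encoding
        check=True,
    )
    return result.stdout
```

Calling `extract_text("input.pdf")` from the skill's root directory would then return the same text the command line prints.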

To fill a form, first identify what fields exist:

```bash
python scripts/get_form_fields.py form.pdf
```

The output is JSON listing each field's name, type, and current value.
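The exact JSON layout is not specified here, so the following is only a sketch under a stated assumption: that the script prints an array of objects with `name`, `type`, and `value` keys. The helper `index_fields` and the sample data are hypothetical; adjust the keys to match the real output.

```python
import json


def index_fields(raw_json):
    """Map field name -> {"type": ..., "value": ...}.

    Assumption: raw_json is a JSON array of objects with
    "name", "type", and "value" keys.
    """
    fields = json.loads(raw_json)
    return {
        field["name"]: {"type": field.get("type"), "value": field.get("value")}
        for field in fields
    }


# Hand-written sample in the assumed shape, not real script output.
sample = """
[
  {"name": "name", "type": "text", "value": ""},
  {"name": "email", "type": "text", "value": ""},
  {"name": "date", "type": "text", "value": null}
]
"""

fields = index_fields(sample)
# Fields whose current value is empty or missing still need to be filled.
empty = [name for name, info in fields.items() if not info["value"]]
print(empty)
```

Indexing by field name makes it easy to see which fields are still empty before preparing the values to fill in.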

Create a JSON file (for example `values.json`) that maps field names to the values to fill in:

```json
{
  "name": "John Doe",
  "email": "john@example.com",
  "date": "2024-01-15"
}
```

Then fill the form:

```bash
python scripts/fill_form.py form.pdf values.json output.pdf
```
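The two steps above (writing the values file, then invoking the fill script) can also be driven from Python. This is a minimal sketch: `write_values` and `build_fill_command` are hypothetical helpers, and the only thing taken from this document is the argument order `form.pdf values.json output.pdf`.

```python
import json
import sys
from pathlib import Path


def write_values(values, path):
    """Serialize a field-name -> value mapping to a JSON file."""
    path = Path(path)
    path.write_text(json.dumps(values, indent=2), encoding="utf-8")
    return path


def build_fill_command(form_pdf, values_json, output_pdf,
                       script="scripts/fill_form.py"):
    """Return the argv list matching the command line shown above."""
    return [
        sys.executable,
        script,
        str(form_pdf),
        str(values_json),
        str(output_pdf),
    ]
```

Passing the resulting list to `subprocess.run(..., check=True)` runs the same command as the shell example; keeping command construction separate makes it possible to inspect or log the command before anything is executed.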

To convert a PDF to images:

```bash
python scripts/pdf_to_images.py input.pdf output_dir/
```

This creates one PNG per page in `output_dir/`.
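Downstream steps usually need the page images in page order. The script's file naming scheme is not documented here, so this sketch assumes each file name contains its page number (for example `page_1.png`, a hypothetical name) and sorts numerically so that page 10 follows page 2 rather than page 1. `page_images` is a hypothetical helper.

```python
import re
from pathlib import Path


def page_images(output_dir):
    """Return the PNG files in output_dir, ordered by page number.

    Assumption: each file name contains its page number; the last run
    of digits in the name is used. Files without digits sort first.
    """
    def page_number(path):
        digits = re.findall(r"\d+", path.stem)
        return int(digits[-1]) if digits else -1

    return sorted(
        Path(output_dir).glob("*.png"),
        key=lambda p: (page_number(p), p.name),
    )
```

A plain alphabetical sort would put `page_10.png` before `page_2.png`, which is why the sketch compares the numbers as integers.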