PDF Document Skill

Comprehensive toolkit for PDF manipulation tasks including text/table extraction, document creation, merging, splitting, and form handling.

Trigger

  • Creating or manipulating PDF files
  • Extracting text or tables from PDF documents
  • Merging or splitting PDF files
  • Filling in or processing PDF forms
  • Running OCR on scanned documents

Core Capabilities

Text & Table Operations:

  • Extract text with layout preservation using pdfplumber
  • Detect tables automatically and convert them to data formats such as Excel
  • Run OCR on scanned documents

Document Manipulation:

  • Merge multiple PDFs
  • Split documents into individual pages
  • Rotate pages and add watermarks
  • Apply password encryption and decrypt protected files

PDF Creation:

  • Generate new documents from scratch using reportlab
  • Build multi-page documents
  • Apply custom formatting and styling

Metadata & Forms:

  • Read document properties (title, author, subject)
  • Fill and process PDF forms
  • Inspect security settings and access permissions

Primary Libraries

  • pypdf (basic operations)
  • pdfplumber (structured extraction)
  • reportlab (creation)
  • Command-line tools: qpdf, pdftotext
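
The command-line tools can be driven from Python when a task is easier to express as a shell call. The sketch below only builds the argument lists and runs them on request; the helper names are hypothetical, and it assumes `pdftotext` (from Poppler) and `qpdf` are installed and on PATH.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build a pdftotext call; -layout keeps the original physical layout."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")
    return cmd + [pdf_path, txt_path]


def qpdf_merge_command(inputs, output_path):
    """Build a qpdf call that merges all pages of `inputs` into one file."""
    return ["qpdf", "--empty", "--pages", *inputs, "--", output_path]


def run(cmd):
    """Run a prepared command, failing early if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed or not on PATH")
    subprocess.run(cmd, check=True)
```

For example, `run(qpdf_merge_command(["a.pdf", "b.pdf"], "merged.pdf"))` would merge two files, assuming both exist in the working directory.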

Use When

  • Creating professional PDF reports
  • Extracting data from PDF documents
  • Merging multiple documents
  • Processing scanned forms with OCR
  • Automating document workflows
  • Protecting documents with encryption