Document to Text Converter
Overview
This skill acts as a universal converter to extract plain text and structured data from various binary and complex file formats. It enables Gemini to "read" files that are otherwise inaccessible.
Capabilities
1. Document Extraction
- •PDF (
.pdf): Extracts plain text. - •Excel (
.xlsx): Converts sheets to CSV and performs OCR on embedded images. - •Word (
.docx): Extracts text and performs OCR on embedded images. - •PowerPoint (
.pptx): Extracts slide text and performs OCR on embedded images.
2. Image OCR
- •Images (
.png,.jpg,.jpeg,.webp): Uses Tesseract.js to perform OCR (Optical Character Recognition) and extract text from images. Supports English and Japanese.
3. Data & Archives
- •Email (
.eml): Parses headers (From, To, Subject) and body text. - •ZIP Archive (
.zip): Lists files and extracts content of text-based files within the archive without extracting to disk.
Usage
To read a file, execute the extract.cjs script with the file path.
bash
node scripts/extract.cjs <path/to/file>
Example:
User: "What does the error screenshot say?"
Action: node scripts/extract.cjs error.png
Dependencies
This skill requires Node.js packages.
Run npm install in the skill directory before using.
Knowledge Protocol
- •This skill adheres to the
knowledge/orchestration/knowledge-protocol.md. It automatically integrates Public, Confidential (Company/Client), and Personal knowledge tiers, prioritizing the most specific secrets while ensuring no leaks to public outputs.