Instructions
1. Initialize and Explore
- •Explore Workspace: List the contents of the
/workspace/dumps/workspacedirectory to locate thefilesfolder containing PDF receipts. - •Explore Database: List schemas in the
PURCHASE_INVOICEdatabase, then list tables in thePUBLICschema. - •Examine Tables: Describe the structure of the
INVOICESandINVOICE_PAYMENTStables. Read sample data to understand existing records and column formats.
2. Extract Data from PDF Receipts
- •Scan Files: List all PDF files in the workspace's
filesdirectory. - •Read PDFs: For each PDF file, extract text from the first page. Parse the following key fields:
- •Invoice ID: Look for patterns like "Invoice:", "Invoice ID:", "Document ID:", or text near the filename.
- •Supplier Name: Typically under "Supplier:", "Vendor:", or "VENDOR INFORMATION:".
- •Invoice Amount: Look for "Amount:", "Total Amount:", or "Total:" followed by a dollar amount.
- •Purchaser Email: Look for "Contact:", "Bill To:", or "DEPARTMENT INFORMATION:" followed by an email address.
- •Invoice Date: Look for "Date:", "Processing Date:", or "Invoice Date:".
- •Payment Status: Identify status keywords:
- •
PAIDor checkmark (✓): Fully paid. - •
PARTIALor square (■): Partially paid. Extract the paid amount if available. - •
UNPAID,Awaiting payment,Pending,Verification in process: Outstanding (unpaid). - •Default to outstanding if status is unclear.
- •
3. Update Database Tables
- •Insert Invoices: For each extracted receipt, insert a record into
PURCHASE_INVOICE.PUBLIC.INVOICESwith columns:INVOICE_ID,SUPPLIER_NAME,INVOICE_AMOUNT,PURCHASER_EMAIL,INVOICE_DATE. - •Insert Payments: For each invoice, insert a corresponding record into
PURCHASE_INVOICE.PUBLIC.INVOICE_PAYMENTS:- •
INVOICE_ID: The extracted invoice ID. - •
PAYMENT_AMOUNT:- •If status is
PAID: Set to the full invoice amount. - •If status is
PARTIAL: Set to the extracted paid amount. - •Otherwise: Set to
0.
- •If status is
- •
OUTSTANDING_FLAG:- •
0if status isPAID. - •
1for all other statuses (PARTIAL,UNPAID,Awaiting,Pending,Verification).
- •
- •
- •Set Column Description: Execute an
ALTER TABLEstatement to set the comment/description for theOUTSTANDING_FLAGcolumn inINVOICE_PAYMENTSto exactly:0=Paid, 1=Outstanding. Verify the description was applied.
4. Identify and Notify on Outstanding Invoices
- •Query Outstanding Invoices: Perform a
JOINbetweenINVOICESandINVOICE_PAYMENTSwhereOUTSTANDING_FLAG = 1. Group results byPURCHASER_EMAIL. - •Map to Filenames: For each outstanding invoice, map the
INVOICE_IDback to the original PDF filename (e.g.,INV-2024-013→INV-2024-013.pdf). - •Send Emails: For each unique purchasing manager email (
PURCHASER_EMAIL):- •Subject:
Process Outstanding Invoices - •Body: List the filenames of all outstanding invoices for that manager. Use a clear, professional format.
- •Send the email.
- •Subject:
5. Final Verification and Summary
- •Provide a concise summary report including:
- •Count of receipts processed and inserted.
- •Count of outstanding vs. paid invoices.
- •List of managers notified and the number of files each needs to process.
- •Confirmation that the column description was set.