Tax Document Analyzer
Analyze US tax documents, extract figures, calculate estimated federal/state tax liability, produce a professional Excel summary. Filing status: Married Filing Jointly.
Workflow
- •Discover documents — scan user's folder for tax files
- •Extract data — parse each document (parallel subagents)
- •Reconcile — cross-check, flag missing docs and discrepancies
- •Calculate — compute AGI, taxable income, tax owed
- •Produce output — Excel spreadsheet and/or chat summary
- •Verify — spot-check calculations programmatically
Step 1: Discover Documents
Scan folder recursively. Categorize by keywords:
| Pattern | Category |
|---|---|
w2, W-2 | Wage statements |
1099 | Interest, dividends, broker, misc, HSA |
3922 | ESPP transfers |
1098 | Mortgage interest |
childcare, dependent | Childcare statements |
rental, property, invoice, insurance, platform | Rental expenses |
File types: .pdf, .png, .jpg, .xlsx. Note unrecognizable files.
Step 2: Extract Data
Read references/document-parsing.md for detailed per-document extraction instructions.
Key principles:
- •Use parallel subagents (Task tool) to read multiple documents simultaneously
- •PDFs: try
Readtool first; if corrupt, trypdftotextvia Bash; if image-based, user must provide screenshot - •Screenshots/images: read directly with
Readtool (multimodal) - •Extract ALL box values, not just seemingly relevant ones
Critical extraction points:
| Document | Must Extract | Watch For |
|---|---|---|
| W-2 | Box 1, 2, 16, 17, Box 14 codes | ESPPDD/RESTK already in Box 1 |
| 1099-INT | Box 1, Box 4 | Multiple accounts — sum all |
| 1099-DIV | Box 1a, 1b, Box 4 | Qualified vs ordinary |
| 1099-B | Proceeds, basis, gain/loss, ST vs LT | Covered vs noncovered |
| Form 3922 | All 8 boxes per lot | Match lots to sales via FIFO |
| 1098 | Box 1, Box 10 | Rental vs primary residence |
| 1099-MISC | Box 1 | Rental platform income |
| 1099-SA | Box 1, Box 3 code | Code 1 = normal distribution |
Step 3: Reconcile
- •W-2 Box 14: If ESPPDD or RESTK present, confirm amounts are ALREADY in Box 1. Never add separately.
- •ESPP lots: Match Form 3922 lots to sales (FIFO). Calculate ordinary income + capital gain per lot. Compare total with W-2 Box 14 ESPPDD.
- •Rental expenses: Cross-check itemized list against individual receipts.
- •Interest/dividends: Sum across all brokerages.
- •Missing docs: Flag expected documents not found.
Step 4: Calculate
Read references/tax-rules.md for brackets, rates, and formulas.
AGI:
W-2 Box 1 (all) + Interest + Dividends + Net Capital Gains + Net Rental Income + Other - Above-the-line deductions = AGI
Tax liability:
AGI - Standard Deduction = Taxable Income Split into: ordinary income vs qualified dividends/LTCG Apply marginal brackets to ordinary → ordinary tax Apply preferential rates to qualified div/LTCG → preferential tax Total tax - Credits - Withholding = Amount owed/(refund)
If tax year ≠ 2025: web search for that year's standard deduction, brackets, and thresholds.
Step 5: Produce Output
Excel (when user wants a file)
Use scripts/generate_tax_summary.py. Construct the data dict per its docstring, then:
import sys; sys.path.insert(0, '<skill-path>/scripts') from generate_tax_summary import generate_tax_summary generate_tax_summary(data, output_path, tax_year=YEAR)
Then recalculate formulas using the xlsx skill's scripts/recalc.py.
Chat summary (always)
Total AGI: $X Federal Tax: $X Credits: -$X Withholding: -$X ──────────────────────── Estimated Owed: $X
Step 6: Verify
- •Recompute key totals in Python independent of spreadsheet
- •Check ESPP ordinary income matches W-2 Box 14 ESPPDD
- •Confirm no double-counting of W-2 Box 1 embedded income
Disclaimer
Always include: "I am not a tax professional. This is an estimate to help you plan. Please use tax software or consult a CPA for your actual filing."
Flag items NOT automatically calculated: depreciation, SALT deduction, charitable contributions, HSA/401(k) contributions, student loan interest.