Overview
Never hardcode calculations. Always use Excel formulas. This keeps spreadsheets dynamic and updateable.
Implementation Checklist
- • Clarify objectives - What insights or outputs does partner need?
- • Examine structure - Use Read tool to understand file layout
- • Explore data - Use pandas for initial exploration:
python import pandas as pd df = pd.read_excel('file.xlsx') df.info() df.describe() df.head() - • Choose approach: - pandas: For data analysis, transformations, simple exports - openpyxl: For formulas, formatting, preserving Excel features
- • Build analysis - Use formulas, not hardcoded values:
- ✅
sheet['B10'] = '=SUM(B2:B9)'- ❌sheet['B10'] = sum(values) - • Apply formatting if creating output: - Blue text: Inputs users will change - Black text: Formulas and calculations - Yellow background: Key assumptions
- • Recalculate formulas (if using openpyxl):
bash python ${HOME}/.config/agent/skills/skills/tools/xlsx/recalc.py output.xlsx - • Verify zero errors - Check for #REF!, #DIV/0!, #VALUE!, etc.
- • Test edge cases - Try with zeros, negatives, very large values
- • Document assumptions - Add comments to cells with key formulas
Quality Standards
Zero Tolerance for Formula Errors
Every Excel file MUST be delivered with ZERO formula errors:
- •No #REF! (invalid references)
- •No #DIV/0! (division by zero)
- •No #VALUE! (wrong data type)
- •No #N/A (lookup failures)
- •No #NAME? (unrecognized function)
Formula Construction
Use cell references, not hardcoded values:
# BAD - Hardcodes growth rate sheet['C5'] = revenue * 1.15 # GOOD - References assumption cell sheet['C5'] = '=C4*(1+$B$2)' # Where B2 contains growth rate
Check denominators before division:
# BAD - Might divide by zero sheet['D5'] = '=C5/C4' # GOOD - Handles zero case sheet['D5'] = '=IF(C4=0,"-",C5/C4)'
Data Integrity
- •Verify all cell references point to intended cells
- •Check for off-by-one errors in ranges
- •Ensure consistent formulas across projection periods
- •Test with edge cases before declaring complete
Workflow Patterns
Pattern 1: Quick Data Analysis
For exploration and simple exports:
import pandas as pd
# Read and analyze
df = pd.read_excel('input.xlsx')
df.describe()
df.groupby('category')['sales'].sum()
# Export results
df.to_excel('analysis.xlsx', index=False)
Pattern 2: Creating Excel Files with Formulas
For dynamic spreadsheets:
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
wb = Workbook()
sheet = wb.active
# Add data and formulas
sheet['A1'] = 'Revenue'
sheet['B1'] = 'Growth Rate'
sheet['B2'] = 0.15 # Input (blue)
sheet['A3'] = 'Year 1'
sheet['B3'] = 1000
sheet['A4'] = 'Year 2'
sheet['B4'] = '=B3*(1+$B$2)' # Formula (black)
# Format
sheet['B2'].font = Font(color='0000FF') # Blue for input
sheet['B4'].font = Font(color='000000') # Black for formula
wb.save('model.xlsx')
# MUST recalculate
import subprocess
subprocess.run(['python', 'recalc.py', 'model.xlsx'])
Pattern 3: Editing Existing Files
Preserve existing structure:
from openpyxl import load_workbook
# Load without losing formulas
wb = load_workbook('existing.xlsx')
sheet = wb['SheetName']
# Modify cells
sheet['A1'] = 'Updated Value'
sheet['B5'] = '=SUM(B2:B4)' # Add formula
# Save and recalculate
wb.save('updated.xlsx')
subprocess.run(['python', 'recalc.py', 'updated.xlsx'])
Common Pitfalls
❌ Using data_only=True and saving - This replaces formulas with values permanently ❌ Forgetting to recalculate - openpyxl doesn't evaluate formulas ❌ Hardcoding calculated values - Makes spreadsheet static ❌ Not checking for #DIV/0! - Test edge cases ❌ Wrong cell references - Off-by-one errors are common ❌ Mixing 0-based and 1-based indexing - pandas is 0-based, Excel is 1-based
Anti-Patterns
❌ Calculating in Python, hardcoding result:
total = df['Sales'].sum() # Calculates in Python sheet['B10'] = total # Hardcodes 5000
✅ Using Excel formula:
sheet['B10'] = '=SUM(B2:B9)' # Excel calculates dynamically
❌ Not verifying formula errors:
wb.save('output.xlsx') # Done, right?
✅ Always verify:
wb.save('output.xlsx')
result = subprocess.run(['python', 'recalc.py', 'output.xlsx'],
capture_output=True, text=True)
errors = json.loads(result.stdout)
if errors['status'] == 'errors_found':
print(f"Errors: {errors['error_summary']}")
# Fix errors and retry
Tools
Python Libraries
- •pandas - Data analysis, reading/writing Excel
- •openpyxl - Creating/editing with formulas and formatting
- •xlsxwriter - Alternative for writing (no editing existing files)
Scripts
- •recalc.py - Recalculates all formulas using LibreOffice
Returns JSON with error details and countsbash
python ~/.config/agent/skills/skills/tools/xlsx/recalc.py file.xlsx
Supporting
For detailed tool documentation and reference, see:
- •
skills/tools/xlsx/- Comprehensive xlsx manipulation guide