model-usage

Model Usage Tracking

Track LLM usage and token consumption, and estimate the resulting costs.

Monitoring

Token Usage

Track tokens used in conversations:

•Input tokens
•Output tokens
•Total tokens
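
The three counters above could be kept in a small accumulator. This is a minimal sketch; the `TokenUsage` class and its `add` method are hypothetical names chosen for illustration, not part of any provider SDK.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Running token counts for one conversation."""

    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        # Derived on demand, so the total cannot drift out of sync.
        return self.input_tokens + self.output_tokens

    def add(self, input_tokens: int, output_tokens: int) -> None:
        """Accumulate the counts reported for one API call."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens


usage = TokenUsage()
usage.add(input_tokens=1500, output_tokens=500)
usage.add(input_tokens=200, output_tokens=100)
print(usage.input_tokens, usage.output_tokens, usage.total_tokens)
```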

Cost Estimation

Anthropic Pricing (as of 2026)

•Claude Opus 4.5: $15 per MTok input, $75 per MTok output
•Claude Sonnet 3.5: $3 per MTok input, $15 per MTok output

OpenAI Pricing

•GPT-4 Turbo: $10 per MTok input, $30 per MTok output
•GPT-4: $30 per MTok input, $60 per MTok output
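
As a rough sketch, the per-MTok rates listed above can be turned into a cost estimate with one multiplication per direction. The rates are copied as-is from this document and should be checked against each provider's current pricing page. The dictionary keys and the `estimate_cost_usd` helper are illustrative assumptions, not confirmed API model identifiers.

```python
# USD per million tokens (MTok), copied from the pricing lists in this document.
# Keys are illustrative labels, not guaranteed to match real API model names.
PRICING_USD_PER_MTOK = {
    "claude-opus-4-5": {"input": 15.0, "output": 75.0},
    "claude-sonnet-3-5": {"input": 3.0, "output": 15.0},
    "gpt-4-turbo": {"input": 10.0, "output": 30.0},
    "gpt-4": {"input": 30.0, "output": 60.0},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call; raises KeyError for an unlisted model."""
    rates = PRICING_USD_PER_MTOK[model]
    cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    return round(cost, 6)


opus_cost = estimate_cost_usd("claude-opus-4-5", 1500, 500)
turbo_cost = estimate_cost_usd("gpt-4-turbo", 1500, 500)
print(opus_cost, turbo_cost)
```

Keeping the rates in one table means a price change only needs to be made in one place.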

Usage Examples

User: "How many tokens have I used today?"

•Query session history
•Sum token counts
•Present totals
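
The three steps above might look like the following sketch. It assumes session history is a list of records with `input_tokens`, `output_tokens`, and an ISO 8601 `timestamp`. The `tokens_used_on` and `parse_timestamp` helpers are hypothetical.

```python
from datetime import date, datetime


def parse_timestamp(ts: str) -> datetime:
    # Map a trailing "Z" to "+00:00" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def tokens_used_on(records: list, day: date) -> dict:
    """Sum token counts for records whose timestamp falls on `day`."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for record in records:
        # The date is taken in the timestamp's own offset (UTC in this sample).
        if parse_timestamp(record["timestamp"]).date() != day:
            continue
        totals["input_tokens"] += record["input_tokens"]
        totals["output_tokens"] += record["output_tokens"]
    totals["total_tokens"] = totals["input_tokens"] + totals["output_tokens"]
    return totals


session_history = [
    {"input_tokens": 1500, "output_tokens": 500, "timestamp": "2026-01-27T00:00:00Z"},
    {"input_tokens": 800, "output_tokens": 200, "timestamp": "2026-01-27T09:30:00Z"},
    {"input_tokens": 400, "output_tokens": 100, "timestamp": "2026-01-26T23:59:00Z"},
]
today_totals = tokens_used_on(session_history, date(2026, 1, 27))
print(today_totals)
```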

User: "Estimate my costs for this month"

•Gather usage statistics
•Calculate costs per model
•Present breakdown
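
A per-model monthly breakdown could be sketched as below. It assumes each record already carries a `cost_usd` estimate and an ISO 8601 `timestamp`. The `monthly_cost_by_model` function and the sample records are invented for illustration.

```python
from collections import defaultdict


def monthly_cost_by_model(records: list, year_month: str) -> dict:
    """Return {model: summed cost_usd} for timestamps starting with e.g. "2026-01"."""
    costs = defaultdict(float)
    for record in records:
        if record["timestamp"].startswith(year_month):
            costs[record["model"]] += record["cost_usd"]
    # Round away floating-point noise from repeated addition.
    return {model: round(cost, 4) for model, cost in costs.items()}


records = [
    {"model": "claude-opus-4-5", "cost_usd": 0.06, "timestamp": "2026-01-27T00:00:00Z"},
    {"model": "claude-opus-4-5", "cost_usd": 0.03, "timestamp": "2026-01-28T00:00:00Z"},
    {"model": "gpt-4-turbo", "cost_usd": 0.02, "timestamp": "2026-01-15T12:00:00Z"},
    {"model": "gpt-4-turbo", "cost_usd": 0.05, "timestamp": "2025-12-31T12:00:00Z"},
]
breakdown = monthly_cost_by_model(records, "2026-01")
# Present the most expensive model first, then the overall total.
for model, cost in sorted(breakdown.items(), key=lambda item: item[1], reverse=True):
    print(f"{model}: ${cost:.2f}")
print(f"total: ${sum(breakdown.values()):.2f}")
```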

User: "Which model is most efficient for my use case?"

•Compare token usage patterns
•Calculate cost per conversation
•Recommend most efficient model
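
One way to approximate this comparison is to average the cost per conversation for each model and pick the lowest. The sketch below assumes every record has a `conversation_id` field, which is a hypothetical addition to the usage record. It compares cost only and says nothing about output quality.

```python
from collections import defaultdict


def cost_per_conversation(records: list) -> dict:
    """Average cost in USD per distinct conversation, keyed by model."""
    total_cost = defaultdict(float)
    conversations = defaultdict(set)
    for record in records:
        total_cost[record["model"]] += record["cost_usd"]
        conversations[record["model"]].add(record["conversation_id"])
    return {model: total_cost[model] / len(conversations[model]) for model in total_cost}


def cheapest_model(records: list) -> str:
    """Return the model with the lowest average cost per conversation."""
    averages = cost_per_conversation(records)
    return min(averages, key=averages.get)


records = [
    {"model": "claude-opus-4-5", "conversation_id": "a", "cost_usd": 0.06},
    {"model": "claude-opus-4-5", "conversation_id": "a", "cost_usd": 0.04},
    {"model": "claude-sonnet-3-5", "conversation_id": "b", "cost_usd": 0.012},
    {"model": "claude-sonnet-3-5", "conversation_id": "c", "cost_usd": 0.008},
]
averages = cost_per_conversation(records)
recommended = cheapest_model(records)
print(recommended)
```

A fuller recommendation would also weigh whether the cheaper model handles the user's tasks acceptably.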

Best Practices

•Log token usage with each API call
•Store usage data in session metadata
•Provide cost estimates before expensive operations
•Suggest model switching for cost optimization
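
Logging usage with each API call could be as simple as appending one JSON line per call. In this sketch, `log_usage` is a hypothetical helper and `io.StringIO` stands in for a log file opened in append mode. The token counts would come from the provider's response in practice.

```python
import io
import json
from datetime import datetime, timezone


def log_usage(stream, model: str, input_tokens: int, output_tokens: int, cost_usd: float) -> dict:
    """Append one usage record as a JSON line and return the record."""
    record = {
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": cost_usd,
        # UTC timestamp in the same ISO 8601 style as the usage record format.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    stream.write(json.dumps(record) + "\n")
    return record


log = io.StringIO()  # stand-in for open("usage.jsonl", "a")
log_usage(log, "claude-opus-4-5", 1500, 500, 0.06)
log_usage(log, "claude-opus-4-5", 200, 100, 0.0105)

# Reading the log back yields one dictionary per call.
logged = [json.loads(line) for line in log.getvalue().splitlines()]
print(len(logged))
```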

Data Structure

python

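# One record per API call
usage = {
    "model": "claude-opus-4-5",
    "input_tokens": 1500,   # tokens sent in the prompt
    "output_tokens": 500,   # tokens generated in the response
    "cost_usd": 0.06,  # Estimated: 1500 * $15/MTok + 500 * $75/MTok
    "timestamp": "2026-01-27T00:00:00Z",  # ISO 8601, UTC
}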

Reporting

Generate usage reports:

•Daily/weekly/monthly summaries
•Per-model breakdown
•Cost trends
•Token efficiency metrics
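
The daily, weekly, and monthly summaries can share one grouping routine that differs only in how a timestamp is mapped to a bucket. This is a sketch under the assumption that records follow the usage record format in this document. The `period_key` and `summarize` functions are hypothetical, and weeks are ISO weeks.

```python
from collections import defaultdict
from datetime import datetime


def period_key(timestamp: str, period: str) -> str:
    """Map an ISO 8601 timestamp to a daily, weekly, or monthly bucket label."""
    moment = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if period == "daily":
        return moment.strftime("%Y-%m-%d")
    if period == "weekly":
        iso = moment.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if period == "monthly":
        return moment.strftime("%Y-%m")
    raise ValueError(f"unknown period: {period}")


def summarize(records: list, period: str) -> dict:
    """Total tokens and cost per bucket for the chosen period."""
    summary = defaultdict(lambda: {"total_tokens": 0, "cost_usd": 0.0})
    for record in records:
        bucket = summary[period_key(record["timestamp"], period)]
        bucket["total_tokens"] += record["input_tokens"] + record["output_tokens"]
        bucket["cost_usd"] += record["cost_usd"]
    return dict(summary)


records = [
    {"input_tokens": 1500, "output_tokens": 500, "cost_usd": 0.06, "timestamp": "2026-01-27T00:00:00Z"},
    {"input_tokens": 800, "output_tokens": 200, "cost_usd": 0.027, "timestamp": "2026-01-28T10:00:00Z"},
    {"input_tokens": 400, "output_tokens": 100, "cost_usd": 0.007, "timestamp": "2026-02-03T08:00:00Z"},
]
weekly = summarize(records, "weekly")
monthly = summarize(records, "monthly")
for label in sorted(monthly):
    print(label, monthly[label]["total_tokens"], round(monthly[label]["cost_usd"], 4))
```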