Model Usage Tracking
Track LLM usage and token consumption, and estimate costs.
Monitoring
Token Usage
Track tokens used in conversations:
- Input tokens
- Output tokens
- Total tokens
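A minimal sketch of how these three counts might be accumulated over a conversation. `TokenCounter` is a hypothetical helper, not part of any SDK, and it assumes each API response reports its own input and output token counts.

```python
from dataclasses import dataclass


@dataclass
class TokenCounter:
    """Running token totals for one conversation (hypothetical helper)."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int, output_tokens: int) -> None:
        # Call once per API response with the counts that response reports.
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def total_tokens(self) -> int:
        # Derived from the two parts, so it cannot drift out of sync.
        return self.input_tokens + self.output_tokens


counter = TokenCounter()
counter.add(input_tokens=1500, output_tokens=500)
counter.add(input_tokens=800, output_tokens=200)
print(counter.input_tokens, counter.output_tokens, counter.total_tokens)
```

Deriving the total rather than storing it keeps the three numbers consistent by construction.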
Cost Estimation
Anthropic Pricing (as of 2026)
- Claude Opus 4.5: $15 per MTok input, $75 per MTok output
- Claude 3.5 Sonnet: $3 per MTok input, $15 per MTok output
OpenAI Pricing
- GPT-4 Turbo: $10 per MTok input, $30 per MTok output
- GPT-4: $30 per MTok input, $60 per MTok output
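The per-MTok prices translate into a cost estimate as (input tokens × input price + output tokens × output price) / 1,000,000. A small sketch, assuming the prices listed above and using illustrative model keys that are not confirmed API identifiers:

```python
# USD per million tokens (MTok) as (input, output), copied from the lists above.
# The dictionary keys are illustrative labels, not confirmed API model names.
PRICES_PER_MTOK = {
    "claude-opus-4-5": (15.00, 75.00),
    "claude-3-5-sonnet": (3.00, 15.00),
    "gpt-4-turbo": (10.00, 30.00),
    "gpt-4": (30.00, 60.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call from its token counts."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


cost = estimate_cost_usd("claude-opus-4-5", input_tokens=1500, output_tokens=500)
print(f"${cost:.4f}")  # 1500 * 15 + 500 * 75 = 60,000 -> $0.0600
```

Prices change over time, so the table is best treated as configuration rather than a constant baked into code.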
Usage Examples
User: "How many tokens have I used today?"
- Query session history
- Sum token counts
- Present totals
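The three steps above could look roughly like this. The sketch assumes session history is a list of dictionaries with `timestamp`, `input_tokens`, and `output_tokens` fields, and that "today" means the UTC calendar day; `tokens_used_on` is a hypothetical name.

```python
from datetime import date, datetime


def tokens_used_on(records: list[dict], day: date) -> dict:
    """Sum token counts for records whose UTC timestamp falls on `day`."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for record in records:
        # Replace the trailing "Z" so older Python versions can parse it.
        ts = datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00"))
        if ts.date() == day:
            totals["input_tokens"] += record["input_tokens"]
            totals["output_tokens"] += record["output_tokens"]
    totals["total_tokens"] = totals["input_tokens"] + totals["output_tokens"]
    return totals


# Made-up session history for illustration only.
history = [
    {"timestamp": "2026-01-27T00:00:00Z", "input_tokens": 1500, "output_tokens": 500},
    {"timestamp": "2026-01-27T09:30:00Z", "input_tokens": 800, "output_tokens": 200},
    {"timestamp": "2026-01-26T22:15:00Z", "input_tokens": 400, "output_tokens": 100},
]
today_totals = tokens_used_on(history, date(2026, 1, 27))
print(today_totals)
```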
User: "Estimate my costs for this month"
- Gather usage statistics
- Calculate costs per model
- Present breakdown
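One possible shape for the monthly breakdown, assuming each stored record already carries `model`, `cost_usd`, and an ISO 8601 UTC `timestamp`. `monthly_cost_by_model` is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict


def monthly_cost_by_model(records: list[dict], year: int, month: int) -> dict:
    """Sum the estimated cost per model for one calendar month."""
    prefix = f"{year:04d}-{month:02d}"  # ISO timestamps start with YYYY-MM
    costs = defaultdict(float)
    for record in records:
        if record["timestamp"].startswith(prefix):
            costs[record["model"]] += record["cost_usd"]
    return dict(costs)


records = [
    {"model": "claude-opus-4-5", "cost_usd": 0.06, "timestamp": "2026-01-27T00:00:00Z"},
    {"model": "claude-opus-4-5", "cost_usd": 0.03, "timestamp": "2026-01-28T10:00:00Z"},
    {"model": "gpt-4-turbo", "cost_usd": 0.02, "timestamp": "2026-01-15T08:00:00Z"},
    {"model": "gpt-4-turbo", "cost_usd": 0.05, "timestamp": "2025-12-31T23:00:00Z"},
]
breakdown = monthly_cost_by_model(records, 2026, 1)
for model, cost in sorted(breakdown.items()):
    print(f"{model}: ${cost:.2f}")
print(f"total: ${sum(breakdown.values()):.2f}")
```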
User: "Which model is most efficient for my use case?"
- Compare token usage patterns
- Calculate cost per conversation
- Recommend most efficient model
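A rough sketch of the comparison, treating "most efficient" as the lowest average cost per conversation. That definition is an assumption and ignores output quality. The `conversation_id` field and the sample records are hypothetical.

```python
from collections import defaultdict


def cost_per_conversation(records: list[dict]) -> dict:
    """Average estimated cost per distinct conversation, keyed by model."""
    cost_by_model = defaultdict(float)
    conversations_by_model = defaultdict(set)
    for record in records:
        cost_by_model[record["model"]] += record["cost_usd"]
        conversations_by_model[record["model"]].add(record["conversation_id"])
    return {
        model: cost_by_model[model] / len(conversations_by_model[model])
        for model in cost_by_model
    }


records = [
    {"model": "claude-opus-4-5", "conversation_id": "a", "cost_usd": 0.06},
    {"model": "claude-opus-4-5", "conversation_id": "a", "cost_usd": 0.04},
    {"model": "claude-opus-4-5", "conversation_id": "b", "cost_usd": 0.10},
    {"model": "gpt-4-turbo", "conversation_id": "c", "cost_usd": 0.03},
    {"model": "gpt-4-turbo", "conversation_id": "d", "cost_usd": 0.05},
]
averages = cost_per_conversation(records)
cheapest = min(averages, key=averages.get)
print(cheapest, f"${averages[cheapest]:.2f} per conversation")
```

A recommendation based on cost alone is only meaningful when the compared conversations cover similar tasks.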
Best Practices
- Log token usage with each API call
- Store usage data in session metadata
- Provide cost estimates before expensive operations
- Suggest model switching for cost optimization
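To illustrate the pre-operation estimate, here is a sketch that computes a worst-case cost from a known input size and an output cap, then flags the call if it exceeds a budget. `preflight_estimate` and the budget threshold are hypothetical. The prices passed in are the Claude Opus 4.5 figures from the pricing list.

```python
def preflight_estimate(
    input_tokens: int,
    max_output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    budget_usd: float,
) -> tuple[float, bool]:
    """Return (worst_case_cost_usd, needs_confirmation)."""
    # Worst case assumes the model uses its full output allowance.
    worst_case = (
        input_tokens * input_price_per_mtok
        + max_output_tokens * output_price_per_mtok
    ) / 1_000_000
    return worst_case, worst_case > budget_usd


cost, needs_confirmation = preflight_estimate(
    input_tokens=200_000,
    max_output_tokens=4_000,
    input_price_per_mtok=15.00,
    output_price_per_mtok=75.00,
    budget_usd=1.00,
)
if needs_confirmation:
    print(f"This operation may cost up to ${cost:.2f}. Continue?")
```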
Data Structure
python
usage = {
    "model": "claude-opus-4-5",
    "input_tokens": 1500,
    "output_tokens": 500,
    "cost_usd": 0.06,  # Estimated: (1500 * $15 + 500 * $75) / 1,000,000
    "timestamp": "2026-01-27T00:00:00Z",
}
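Records of this shape serialize directly to JSON, so one simple way to persist them is one JSON object per line. This is a sketch under that assumption. `append_usage` and `read_usage` are hypothetical helpers, and an in-memory buffer stands in for a real log file.

```python
import io
import json


def append_usage(log, record: dict) -> None:
    """Write one usage record as a single JSON line."""
    log.write(json.dumps(record) + "\n")


def read_usage(log) -> list[dict]:
    """Read back every non-empty line as a usage record."""
    return [json.loads(line) for line in log if line.strip()]


log = io.StringIO()  # stand-in for open("usage.jsonl", "a+")
append_usage(log, {
    "model": "claude-opus-4-5",
    "input_tokens": 1500,
    "output_tokens": 500,
    "cost_usd": 0.06,
    "timestamp": "2026-01-27T00:00:00Z",
})
log.seek(0)
loaded = read_usage(log)
print(loaded[0]["model"], loaded[0]["cost_usd"])
```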
Reporting
Generate usage reports:
- Daily/weekly/monthly summaries
- Per-model breakdown
- Cost trends
- Token efficiency metrics
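A daily summary might be built along these lines. The sketch assumes records shaped like the `usage` dictionary in the Data Structure section, and it uses cost per 1,000 tokens as one possible efficiency metric, which is an assumption since no specific metric is defined here. `daily_summary` is a hypothetical name.

```python
from collections import defaultdict


def daily_summary(records: list[dict]) -> dict:
    """Group records by UTC date and total their tokens and cost."""
    days = defaultdict(lambda: {"total_tokens": 0, "cost_usd": 0.0})
    for record in records:
        day = record["timestamp"][:10]  # YYYY-MM-DD part of the ISO timestamp
        days[day]["total_tokens"] += record["input_tokens"] + record["output_tokens"]
        days[day]["cost_usd"] += record["cost_usd"]
    for summary in days.values():
        tokens = summary["total_tokens"]
        # Illustrative efficiency metric: USD per 1,000 tokens.
        summary["usd_per_1k_tokens"] = (
            summary["cost_usd"] / tokens * 1000 if tokens else 0.0
        )
    return dict(sorted(days.items()))


records = [
    {"timestamp": "2026-01-27T00:00:00Z", "input_tokens": 1500, "output_tokens": 500, "cost_usd": 0.06},
    {"timestamp": "2026-01-27T12:00:00Z", "input_tokens": 700, "output_tokens": 300, "cost_usd": 0.03},
    {"timestamp": "2026-01-28T08:00:00Z", "input_tokens": 900, "output_tokens": 100, "cost_usd": 0.01},
]
report = daily_summary(records)
for day, summary in report.items():
    print(day, summary["total_tokens"], f"${summary['cost_usd']:.2f}")
```

Weekly and monthly summaries follow the same pattern with a different grouping key.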