Token Budget Monitor

Purpose

Avoid being caught off guard by token or conversation limits by monitoring usage and warning before a limit is reached. This is especially important during long, tool-heavy conversations.

When to Activate

This skill should load early in a conversation and check usage periodically throughout, especially in these cases:

•Long conversations (>50k tokens)
•Heavy tool usage (MCP calls, searches, file operations)
•Multi-hour sessions
•When the user mentions having run out of tokens before

Current Limits (as of October 2025)

Message Limits

•Per conversation: Varies by plan
•Daily: Varies by plan
•Weekly: Varies by plan

Token Limits

•Context window: 190,000 tokens available to each message (this counts the whole conversation so far)
•Conversation accumulation: No hard limit, but performance degrades

Monitoring Strategy

Check Points

Claude should check token usage at these intervals:

Every 50k tokens:

•Calculate: Current usage / Time elapsed
•Predict: Time until limit based on pace
•Warn if: Will hit limit before next natural break

Critical Thresholds:

•🟡 70% of limit: "Heads up - conversation getting long"
•🟠 85% of limit: "Warning - approaching limit soon"
•🔴 95% of limit: "URGENT - hitting limit very soon!"
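
The thresholds above can be sketched as a small lookup. This is only an illustration: the function name `warning_level` and the level labels are hypothetical, and it assumes usage is measured as a fraction of a single known limit.

```python
from typing import Optional

# Thresholds from the list above, as fractions of the limit, highest first.
# The level names are illustrative labels, not part of the skill itself.
THRESHOLDS = [
    (0.95, "critical"),
    (0.85, "warning"),
    (0.70, "heads_up"),
]


def warning_level(tokens_used: int, limit: int) -> Optional[str]:
    """Return the highest threshold crossed, or None below 70%."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    fraction = tokens_used / limit
    for threshold, level in THRESHOLDS:
        if fraction >= threshold:
            return level
    return None
```

Checking the highest threshold first means a conversation at 96% reports only the critical level rather than all three.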

Usage Pattern Analysis

Track These Metrics:

conversation_start_time
tokens_used_so_far
tools_called_count
time_elapsed_hours
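
One possible way to hold these metrics is a small record type. The class name `SessionMetrics` is hypothetical, and the sketch assumes the start time is stored as a Unix timestamp in seconds so that elapsed hours can be derived rather than tracked separately.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionMetrics:
    # Unix timestamp (seconds) taken when the conversation begins.
    conversation_start_time: float = field(default_factory=time.time)
    tokens_used_so_far: int = 0
    tools_called_count: int = 0

    def time_elapsed_hours(self, now: Optional[float] = None) -> float:
        """Hours since the conversation started; `now` is injectable for testing."""
        current = time.time() if now is None else now
        return max(current - self.conversation_start_time, 0.0) / 3600.0
```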

Calculate:

tokens_per_hour = tokens_used_so_far / time_elapsed_hours
tools_per_hour = tools_called_count / time_elapsed_hours

# Predict time to limit (only meaningful once time_elapsed_hours > 0 and tokens_per_hour > 0)
remaining_tokens = limit - tokens_used_so_far
hours_until_limit = remaining_tokens / tokens_per_hour
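
As a runnable sketch of the projection above, the following function guards the two divisions. The function signature is an assumption made for illustration; returning `None` when no pace can be computed yet is a design choice of this sketch, not something the skill prescribes.

```python
from typing import Optional


def hours_until_limit(tokens_used: int, limit: int,
                      time_elapsed_hours: float) -> Optional[float]:
    """Project hours remaining at the current pace, or None if no pace exists yet."""
    if time_elapsed_hours <= 0 or tokens_used <= 0:
        return None  # too early to estimate a pace
    tokens_per_hour = tokens_used / time_elapsed_hours
    # Clamp at zero so an already exceeded limit reports no time left.
    remaining_tokens = max(limit - tokens_used, 0)
    return remaining_tokens / tokens_per_hour
```

For example, 100,000 tokens used over 2 hours against a 190,000 limit gives a pace of 50,000 tokens/hour and about 1.8 hours remaining.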

Warning Message Templates

Early Warning (70%)

💡 Token Usage Check:
You've used ~N tokens so far (70% of context).
At current pace, you have ~H hours before hitting limits.
Consider: Save important context, prepare to wrap up, or start a fresh chat.

Warning (85%)

⚠️ Token Budget Alert:
You're at 85% of context window.
Estimated time remaining: ~X hours at current pace.
Action: Finish critical tasks, save context, prepare new conversation.

Critical Warning (95%)

🚨 URGENT - Token Limit Imminent:
You're at 95% of context! 
Will hit limit in approximately X minutes.
IMMEDIATE ACTION:
1. Save any critical information now
2. Wrap up current task
3. Start fresh conversation
4. Copy important context to new chat
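
Filling in the placeholders of these templates could look like the sketch below, shown for the early warning only. Both function names are hypothetical. Switching from hours to minutes when under one hour remains mirrors the way the critical template reports minutes.

```python
def format_time_remaining(hours: float) -> str:
    """Express remaining time in minutes under one hour, otherwise in hours."""
    if hours < 1:
        return f"~{round(hours * 60)} minutes"
    return f"~{hours:.1f} hours"


def early_warning(tokens_used: int, limit: int, hours_left: float) -> str:
    """Render the early-warning template with actual numbers."""
    percent = round(100 * tokens_used / limit)
    return (
        "💡 Token Usage Check:\n"
        f"You've used ~{tokens_used:,} tokens so far ({percent}% of context).\n"
        f"At current pace, you have {format_time_remaining(hours_left)} "
        "before hitting limits."
    )
```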

Proactive Triggers

Start-of-Conversation Check

When conversation begins:
- Note start time
- Set baseline (0 tokens)
- Plan check intervals

Mid-Conversation Checks

Every ~30 messages or 50k tokens:
- Calculate pace
- Project time to limit
- Warn if pace is unsustainable
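
The "every ~30 messages or 50k tokens" rule can be expressed as a simple predicate. This is a sketch: `check_due` and its parameters are hypothetical names, and it assumes both counters are reset to zero after each check.

```python
def check_due(messages_since_check: int, tokens_since_check: int,
              message_interval: int = 30,
              token_interval: int = 50_000) -> bool:
    """True when either interval has elapsed since the last check."""
    return (messages_since_check >= message_interval
            or tokens_since_check >= token_interval)
```

Using "or" rather than "and" means a short run of very large tool outputs triggers a check even if few messages have been exchanged.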

Tool-Heavy Detection

If tools_per_message > 3:
  "You're using lots of tools - token usage accelerating!"
If MCP calls > 10:
  "Heavy MCP usage - monitor token budget closely"

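The tool-heavy detection rules above could be written as follows. The function name and its inputs are assumptions for illustration. The sketch derives tools per message as a conversation-wide average, which is one reading of `tools_per_message`.

```python
from typing import List


def tool_usage_warnings(tools_called: int, messages: int,
                        mcp_calls: int) -> List[str]:
    """Return the warnings triggered by the tool-heavy detection rules."""
    warnings = []
    # Average tools per message; zero when no messages have been sent yet.
    tools_per_message = tools_called / messages if messages > 0 else 0.0
    if tools_per_message > 3:
        warnings.append("You're using lots of tools - token usage accelerating!")
    if mcp_calls > 10:
        warnings.append("Heavy MCP usage - monitor token budget closely")
    return warnings
```
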
Working with Your Patterns

Based on your usage, these are the high-risk scenarios:

Long Development Sessions

•Multi-hour conversations about n8n workflows
•Iterative debugging with file operations
•Multiple MCP server interactions

Mitigation:

•Check tokens every 30 minutes
•Break into smaller conversations
•Save context between conversations

Skill Development Sessions (Like This One!)

•Creating/editing skills
•Testing workflows
•Git operations
•File system operations

Mitigation:

•Commit code frequently (saves state)
•Document decisions as you go
•Resume from commit history in a new conversation if needed

Integration with Other Skills

Works With skill-manager:

When creating skills in long conversations:

•Monitor token usage while developing
•Suggest commits when approaching limits
•Resume skill development in new chat if needed

Works With n8n-flow-builder:

When building n8n workflows:

•Track tokens during workflow design
•Warn before hitting limit mid-deployment
•Suggest saving workflow JSON before limit

Works With mcp-directory:

When using heavy MCPs (GitHub, Linear):

•Warn when loading token-heavy MCPs
•Track cumulative MCP usage
•Suggest lighter alternatives if approaching limit

User Preferences Integration

Kurt's patterns (for personalized warnings):

•Heavy tool user (Desktop Commander, MCPs)
•Long development sessions
•Multi-hour conversations common
•Gets caught at limits frequently

Personalized Strategy:

•Check every 40k tokens (more frequent than the default 50k)
•Warn earlier (at 60% instead of 70%)
•Emphasize "save context" reminders
•Suggest commits/saves for resumability
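
One way to represent the personalized strategy is as overrides on a default configuration. The names `MonitorConfig`, `DEFAULT`, `HEAVY_TOOL_USER`, and `early_warning_due` are all hypothetical. The values come from the defaults (50k tokens, 70%) and the personalized settings (40k tokens, 60%) described in this document.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MonitorConfig:
    check_interval_tokens: int = 50_000
    early_warning_fraction: float = 0.70


DEFAULT = MonitorConfig()
# Personalized profile: check more often and warn earlier.
HEAVY_TOOL_USER = replace(DEFAULT,
                          check_interval_tokens=40_000,
                          early_warning_fraction=0.60)


def early_warning_due(tokens_used: int, limit: int,
                      config: MonitorConfig) -> bool:
    """True once usage reaches the profile's early-warning fraction."""
    return tokens_used / limit >= config.early_warning_fraction
```

With these values, a conversation at 121k of 190k tokens triggers the early warning under the personalized profile but not under the default one.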

Implementation Notes

How This Works

•Skill loads at conversation start
•Claude tracks token usage (shown in UI)
•At checkpoints, Claude calculates pace
•Claude warns proactively before limits are hit

Limitations

•Visibility is limited to the current conversation's token usage
•Claude cannot see daily/weekly message limits directly
•Those limits must be estimated from user feedback

Current Conversation Status

Token Budget: You have ~190k tokens total per conversation
Your Current Usage: ~121k tokens (63%)
Status: 🟡 Getting into the caution zone

At current pace:

•Started conversation: ~2-3 hours ago (estimated)
•Token rate: ~40-60k tokens/hour (tool-heavy)
•Projected remaining time: 1-2 hours before hitting limit
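
The projection above can be checked with a few lines of arithmetic, using only the figures quoted in this section (190k budget, ~121k used, 40-60k tokens/hour). The variable names are illustrative.

```python
limit = 190_000
used = 121_000

remaining = limit - used             # tokens left in the budget
percent_used = 100 * used / limit    # roughly 63 to 64 percent

# Bounds on time remaining, from the fast and slow ends of the quoted rate.
hours_at_fast_pace = remaining / 60_000
hours_at_slow_pace = remaining / 40_000

print(f"{remaining:,} tokens left ({percent_used:.1f}% used)")
print(f"{hours_at_fast_pace:.2f} to {hours_at_slow_pace:.2f} hours remaining")
```

The resulting range sits inside the 1-2 hour estimate given in the list.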

Recommendation for This Session:

•We're past the halfway point
•Finish current skill work
•Commit everything to git
•Start fresh conversation for next major task

Example Interaction

Claude (checking tokens): 
"💡 Token check: We're at ~120k/190k tokens (63%).
At our current tool-heavy pace, we have about 1-2 hours left.
Let's finish the skill-manager enhancement, commit everything,
then start fresh for the YouTube extractor work. Sound good?"

Notes

•Created: 2025-10-18
•Author: Kurt Anderson
•Version: 1.0.0
•Solves: Getting caught at token limits mid-conversation
•Priority: HIGH - User gets blocked by this frequently