Token Optimization
Description
Patterns and techniques for reducing token usage while maintaining response quality. Achieve 30-70% cost savings through strategic output compression.
When to Use
- •High-volume development sessions
- •Repetitive tasks
- •Simple, clear requests
- •Cost-sensitive projects
- •Quick iterations
Compression Levels
Level 1: Concise (30-40% savings)
- •Remove conversational filler
- •Skip obvious explanations
- •Use bullet points
- •Shorter variable names in examples
Level 2: Compact (50-60% savings)
- •Code-only responses
- •No surrounding prose
- •Abbreviated comments
- •Reference docs instead of explaining
Level 3: Ultra (60-70% savings)
- •Minimal viable response
- •Essential code only
- •No comments
- •Diff format for changes
Compression Techniques
Remove Preambles
markdown
❌ VERBOSE: "I'll help you with that. Let me analyze the code and provide a solution. Based on what I see, the issue is..." ✅ CONCISE: "Issue: null check missing at line 42. Fix:"
Code-Only Responses
markdown
❌ VERBOSE: "Here's the implementation. I've added proper error handling and made sure to follow the existing patterns in your codebase. The function now validates input and returns early if invalid." [large code block] "This should fix the issue. Let me know if you have questions." ✅ CONCISE: [code block]
Reference Over Explain
markdown
❌ VERBOSE: "React's useEffect hook runs after render. The dependency array controls when it re-runs. Empty array means run once on mount..." ✅ CONCISE: "Add `userId` to deps array. See: https://react.dev/reference/react/useEffect"
Diff Format for Changes
markdown
❌ VERBOSE: "I've updated the file. Here's the complete new version:" [entire file] ✅ CONCISE: ```diff - const user = getUser(); + const user = getUser() ?? defaultUser;
Line 42 in user-service.ts
code
--- ## Output Templates ### Bug Fix
Fix: [brief description] File: [path:line] [code or diff] Verify: [test command]
code
### Feature Addition
Added: [feature] Files: [list] [code blocks] Test: [command]
code
### Refactor
Refactor: [what] [diff format changes] No behavior change.
code
--- ## When NOT to Compress | Situation | Why | |-----------|-----| | Complex architecture | Need full context | | Security issues | Must explain risks | | Code reviews | Thoroughness required | | Teaching/explaining | Clarity matters | | Debugging complex issues | Details help | | First-time patterns | Context needed | --- ## Activation ### Via Mode
Use mode: token-efficient
code
### Via Flag
/command --format=concise /command --format=ultra
code
### Session-Wide
For this session, use token-efficient mode.
code
--- ## Metrics ### Typical Savings by Task | Task Type | Verbose Tokens | Concise Tokens | Savings | |-----------|----------------|----------------|---------| | Bug fix | ~500 | ~150 | 70% | | Feature | ~2000 | ~800 | 60% | | Refactor | ~1000 | ~400 | 60% | | Explanation | ~800 | ~300 | 62% | ### ROI Calculation
Sessions per day: 10 Avg tokens per session: 50,000 With optimization: 25,000 Daily savings: 250,000 tokens Monthly savings: ~7.5M tokens
code
--- ## Best Practices 1. **Match compression to task complexity** - Simple task → High compression - Complex task → Lower compression 2. **Preserve essential information** - File paths always included - Test commands always included - Error context when relevant 3. **Use progressive disclosure** - Start concise - Expand if asked 4. **Know when to stop compressing** - User confusion → Add context - Errors occurring → Add detail - Review needed → Full output ---