Prompt Compression Skill
Capabilities
- Implement token-efficient prompt compression
- Design context pruning strategies
- Configure selective context inclusion
- Implement LLMLingua-style compression
- Design summary-based compression
- Create compression quality metrics
Target Processes
- cost-optimization-llm
- agent-performance-optimization
Implementation Details
Compression Techniques
- LLMLingua: Token-level compression
- Summary Compression: LLM-based summarization
- Selective Context: Relevant section extraction
- Token Pruning: Remove low-importance tokens
- Document Filtering: Pre-retrieval filtering
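As an illustration of the Selective Context technique combined with a token budget, here is a minimal sketch. It is not LLMLingua's method or any library's API: the names `count_tokens`, `relevance`, and `select_context` are hypothetical, relevance is approximated by plain keyword overlap, and tokens are counted by whitespace splitting as a stand-in for a real tokenizer such as tiktoken.

```python
import re
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


def relevance(query: str, section: str) -> float:
    # Fraction of distinct query words that also appear in the section.
    query_words = set(re.findall(r"\w+", query.lower()))
    section_words = set(re.findall(r"\w+", section.lower()))
    if not query_words:
        return 0.0
    return len(query_words & section_words) / len(query_words)


def select_context(query: str, sections: List[str], token_budget: int) -> List[str]:
    # Rank section indices by relevance, most relevant first.
    ranked = sorted(
        range(len(sections)),
        key=lambda i: relevance(query, sections[i]),
        reverse=True,
    )
    kept, used = [], 0
    for i in ranked:
        cost = count_tokens(sections[i])
        # Greedily keep a section only if it still fits in the budget.
        if used + cost <= token_budget:
            kept.append(i)
            used += cost
    # Restore the original document order for the kept sections.
    return [sections[i] for i in sorted(kept)]


sections = [
    "The cache stores embeddings for reuse.",
    "Billing is handled monthly by finance.",
    "Cache eviction uses an LRU policy.",
]
selected = select_context("how does cache eviction work", sections, token_budget=12)
```

With this input the billing section is dropped: it shares no words with the query and no longer fits once the two cache-related sections have used the 12-token budget. A production version would likely replace the overlap score with embeddings or a small language model.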
Configuration Options
- Compression ratio targets
- Quality threshold settings
- Token budget constraints
- Compression model selection
- Evaluation metrics
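The options above could be grouped into a single validated configuration object. The sketch below is one possible shape, not a schema defined by this skill: the class name `CompressionConfig`, every field name, the default values, and the placeholder model name are assumptions made for illustration. The ratio is taken to mean compressed tokens divided by original tokens.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CompressionConfig:
    # Assumed convention: ratio = compressed tokens / original tokens.
    target_ratio: float = 0.5
    # Minimum acceptable quality score in [0, 1]; the metric itself is up to the caller.
    quality_threshold: float = 0.8
    # Hard upper bound on the size of the compressed prompt, in tokens.
    token_budget: int = 2048
    # Placeholder name, not a real model identifier.
    compression_model: str = "placeholder-compression-model"
    # Names of the metrics to report for each compression run.
    metrics: List[str] = field(default_factory=lambda: ["compression_ratio"])

    def __post_init__(self) -> None:
        if not 0.0 < self.target_ratio <= 1.0:
            raise ValueError("target_ratio must be in (0, 1]")
        if not 0.0 <= self.quality_threshold <= 1.0:
            raise ValueError("quality_threshold must be in [0, 1]")
        if self.token_budget <= 0:
            raise ValueError("token_budget must be positive")

    def target_tokens(self, original_tokens: int) -> int:
        # The ratio target and the hard budget both apply; the stricter one wins.
        return min(self.token_budget, int(original_tokens * self.target_ratio))


config = CompressionConfig(target_ratio=0.25, token_budget=1000)
```

Here `config.target_tokens(2000)` yields 500 (the ratio is the binding constraint), while `config.target_tokens(10000)` yields 1000 (the budget is).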
Best Practices
- Monitor the tradeoff between output quality and compression ratio
- Test with prompts representative of real usage
- Set compression ratios appropriate to each use case
- Validate the quality of compressed prompts before relying on them
- Track cost savings
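The practices above depend on having numbers to monitor. The following sketch shows three simple measurements under stated assumptions: the function names are hypothetical, key-term retention is only a crude proxy for quality (a real evaluation would compare downstream task results), and the price passed to `estimated_savings` is an illustrative input, not an actual provider rate.

```python
from typing import List


def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    # Assumed convention: lower means more aggressive compression.
    return compressed_tokens / original_tokens


def key_term_retention(key_terms: List[str], compressed_prompt: str) -> float:
    # Crude quality proxy: share of must-keep terms still present after compression.
    if not key_terms:
        return 1.0
    text = compressed_prompt.lower()
    kept = sum(1 for term in key_terms if term.lower() in text)
    return kept / len(key_terms)


def estimated_savings(
    original_tokens: int,
    compressed_tokens: int,
    price_per_1k_tokens: float,
    calls: int,
) -> float:
    # Input-token cost avoided across `calls` requests at the given price.
    saved_tokens = original_tokens - compressed_tokens
    return saved_tokens / 1000 * price_per_1k_tokens * calls


ratio = compression_ratio(1000, 250)
retention = key_term_retention(
    ["timeout", "retry", "backoff"],
    "Retry with exponential backoff.",
)
savings = estimated_savings(1000, 250, price_per_1k_tokens=0.5, calls=100)
```

In this example the prompt shrinks to a quarter of its size, but one of the three key terms ("timeout") is lost, which is the kind of signal a quality threshold is meant to catch.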
Dependencies
- llmlingua (optional)
- tiktoken
- transformers
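Since llmlingua is listed as optional, code using this skill may want to detect it at runtime and degrade gracefully. The sketch below assumes the helper names `whitespace_count` and `make_token_counter`, and assumes the `cl100k_base` encoding as a default; the whitespace fallback is only a rough approximation of real token counts.

```python
import importlib.util
from typing import Callable

# True when the optional llmlingua package is installed; nothing is imported here.
HAS_LLMLINGUA = importlib.util.find_spec("llmlingua") is not None


def whitespace_count(text: str) -> int:
    # Rough fallback when no tokenizer is available.
    return len(text.split())


def make_token_counter(encoding_name: str = "cl100k_base") -> Callable[[str], int]:
    # Prefer tiktoken; fall back if it is missing or the encoding cannot be loaded
    # (loading an encoding may require a network download on first use).
    try:
        import tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
    except Exception:
        return whitespace_count
    return lambda text: len(encoding.encode(text))
```

Callers can create the counter once with `make_token_counter()` and reuse it, and can check `HAS_LLMLINGUA` before selecting the LLMLingua technique.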