AgentSkillsCN

token-optimizer

Lower LLM API costs by optimizing prompts before they are sent to cloud providers. A primary/fallback pipeline coordinates local and remote code agents.

SKILL.md
--- frontmatter
name: token-optimizer
version: "0.1.0"
description: >
  Reduce LLM API costs by optimizing prompts before sending them to cloud providers.
  Coordinates local and remote code agents with a primary/fallback pipeline.
triggers:
  - "optimize prompt"
  - "reduce tokens"
  - "compress context"
  - "token budget"
  - "minimize API cost"
install: |
  git clone <repo-url> && cd TokenOptimizer
  cargo build --release
  # Binary: target/release/token_optimizer
requirements:
  os: linux, macos, windows
  binary: cargo (rustc 1.70+)
  optional_binary: ollama (for local LLM preprocessing)
env:
  VENICE_API_KEY: "Primary provider API key (Venice.ai)"
  ANTHROPIC_API_KEY: "Fallback provider API key (Claude) — optional"
  OPENAI_API_KEY: "Fallback provider API key (OpenAI) — optional"
configuration:
  file: "~/.config/token_optimizer/config.toml"
  sections:
    - primary: "Cloud provider (default: Venice.ai)"
    - fallback: "Backup provider (default: Claude CLI)"
    - local: "Ollama preprocessing settings"
    - optimization: "Strategy selection and tuning"
    - cache: "Anthropic prompt caching settings"
defaults:
  target_tokens: 4000
  strategies:
    - strip_whitespace
    - remove_comments
    - relevance_filter
  keyword_weight: 0.4
  preserve_code_blocks: true

TokenOptimizer

When to Use

  • You have a coding task and want to send it to an LLM API with less context (fewer tokens, lower cost).
  • You want automatic fallback from a cheap provider to a more capable one when credits run out.
  • You want local LLM preprocessing to score relevance and compress context before it hits a paid API.
  • You need to stay within a token budget while keeping the most important context.
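
The last two points can be pictured with a minimal sketch of keyword-only relevance scoring followed by greedy selection under a token budget. This is an illustration in Python, not the tool's Rust implementation: the function names, the stopword list, and the whitespace-based token count are all assumptions made for the example (the tool itself is described as counting tokens with tiktoken).

```python
import re

# Hypothetical stopword list; a real filter would use a fuller one.
STOPWORDS = {"a", "the", "in", "to", "of"}

def keywords(text):
    """Lowercased alphabetic words minus stopwords (check_auth yields check, auth)."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def keyword_score(task, item):
    """Fraction of the task's keywords that also appear in the context item."""
    task_words = keywords(task)
    if not task_words:
        return 0.0
    return len(task_words & keywords(item)) / len(task_words)

def approx_tokens(text):
    """Crude stand-in for a real tokenizer: count whitespace-separated pieces."""
    return len(text.split())

def select_within_budget(task, items, budget):
    """Greedily keep the highest-scoring items that fit, returned in original order."""
    ranked = sorted(range(len(items)),
                    key=lambda i: keyword_score(task, items[i]),
                    reverse=True)
    kept, used = [], 0
    for i in ranked:
        cost = approx_tokens(items[i])
        if used + cost <= budget:
            kept.append(i)
            used += cost
    return [items[i] for i in sorted(kept)]

items = [
    "fn login(user: &str) -> bool { check_auth(user) }",
    'const BANNER: &str = "welcome to the app";',
    "fn check_auth(user: &str) -> bool { !user.is_empty() }",
]
selected = select_within_budget("Fix the bug in auth login", items, budget=16)
for line in selected:
    print(line)
```

With a budget of 16 approximate tokens, the two auth-related functions fit and the unrelated banner constant is dropped. A greedy pass is only one possible policy; it can miss a better combination of smaller items, which is an acceptable trade-off for a sketch.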

Quick Start

```bash
# Optimize a prompt with default strategies
token_optimizer optimize --input "Fix the bug in auth" --context src/auth.rs

# Analyze cache potential for Anthropic
token_optimizer cache-optimize --task "Add feature" --context types.rs --static-indices "0"

# Launch interactive shell (auto-selects provider)
token_optimizer interactive

# Show current config
token_optimizer config show primary
```
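
To make the `--static-indices` idea in the `cache-optimize` command more concrete, here is a hedged Python sketch of how static context items could be turned into an Anthropic cache breakpoint. The `cache_control: {"type": "ephemeral"}` field comes from Anthropic's prompt caching feature; the helper name `build_system_blocks` and the ordering policy are assumptions for illustration, and the tool's actual behavior may differ.

```python
def build_system_blocks(context_items, static_indices):
    """Place static items first and mark the end of the static prefix as cacheable.

    Hypothetical helper: static_indices plays the role of --static-indices.
    """
    static_set = set(static_indices)
    static = [context_items[i] for i in sorted(static_set)]
    dynamic = [text for i, text in enumerate(context_items) if i not in static_set]

    blocks = [{"type": "text", "text": text} for text in static + dynamic]
    if static:
        # A cache_control marker caches the prompt prefix up to and including
        # the block that carries it, so it goes on the last static block.
        blocks[len(static) - 1]["cache_control"] = {"type": "ephemeral"}
    return blocks

blocks = build_system_blocks(
    ["// types.rs: rarely changes", "// diff for this request"],
    static_indices=[0],
)
for block in blocks:
    print(block)
```

Keeping stable content at the front matters because caching works on prefixes: anything that changes between requests invalidates the cache from that point on. Anthropic also caches only prefixes above a model-specific minimum length, so very small static sections may see no benefit.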

Capabilities

| Capability | Description |
| --- | --- |
| StripWhitespace | Remove redundant whitespace, preserving code blocks |
| RemoveComments | Strip `//`, `/* */`, `#` comments from code |
| TruncateContext | Boundary-aware truncation using tiktoken token counts and priority-based boundary detection (code structure > paragraph > sentence > line > word) |
| Abbreviate | Shorten common programming terms in task text |
| LlmCompress | Compress context via local Ollama LLM |
| RelevanceFilter | Hybrid keyword + LLM relevance scoring; works without local LLM via keyword-only mode |
| ExtractSignatures | Keep only function/class/struct signatures |
| Deduplicate | Remove exact, whitespace-normalized, and near-duplicate context items |
| CachePrompting | Anthropic-compatible cache breakpoints for static content |
| Provider Fallback | Automatic primary -> fallback -> local provider pipeline |
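
The primary -> fallback -> local pipeline in the last row can be sketched as a simple ordered retry. This is a minimal Python illustration under stated assumptions: `ProviderError`, `run_with_fallback`, and the three stub providers are hypothetical names, and the real tool may decide when to fall back on different signals (for example exhausted credits, as mentioned under When to Use).

```python
class ProviderError(Exception):
    """Raised by a provider that cannot serve the request."""

def run_with_fallback(prompt, providers):
    """Try each (name, call) pair in order and return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record why this provider failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))

# Stub providers standing in for real API clients.
def primary(prompt):
    raise ProviderError("insufficient credits")

def fallback(prompt):
    return f"fallback answer to: {prompt}"

def local(prompt):
    return f"local answer to: {prompt}"

pipeline = [("primary", primary), ("fallback", fallback), ("local", local)]
used, reply = run_with_fallback("Fix the bug in auth", pipeline)
print(used, "->", reply)
```

Only `ProviderError` triggers a fallback here, so unexpected bugs still surface instead of being silently retried against a more expensive provider.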