model-router
Dynamic model switching based on task classification.
Installation
bash
openclaw skill install model-router
Usage
Automatic Routing (via trigger)
Prefix your message with intent:
| Prefix | Model | Use Case |
|---|---|---|
/code | codex | Programming, debugging, refactoring |
/research | kimi | Long context, web search, analysis |
/deep | opus | Complex reasoning, architecture, security |
/fast | haiku | Quick tasks, summaries, simple Q&A |
/cheap | kimi-free | Cost-sensitive, low priority |
Example:
code
/code Review this Python function for bugs /research What are the latest developments in LLM agents? /deep Design a distributed system for real-time event processing
Direct Switching
code
/model codex # Switch to Codex /model kimi # Switch to Kimi K2.5 /model opus # Switch to Opus /model haiku # Switch to Haiku
Sub-agent Spawning (Isolated Context)
For complex multi-model workflows:
javascript
// From any session
sessions_spawn({
task: "Analyze this codebase for security issues",
model: "opus",
thinking: "high"
})
Configuration
Edit ~/.openclaw/skills/model-router/config.json:
json
{
"default": "kimi",
"autoRoute": true,
"routes": {
"code": {
"model": "codex",
"patterns": ["code", "function", "bug", "refactor", "implement"],
"threshold": 0.7
},
"research": {
"model": "kimi",
"patterns": ["research", "find", "search", "latest", "what is"],
"threshold": 0.6
},
"complex": {
"model": "opus",
"patterns": ["design", "architecture", "analyze", "security audit"],
"threshold": 0.8
}
},
"costLimits": {
"dailyOpusTokens": 100000,
"alertThreshold": 0.8
}
}
How It Works
- •Intent Detection: Simple keyword/pattern matching on user input
- •Cost Awareness: Tracks daily spend per model, suggests cheaper alternatives when approaching limits
- •Context Preservation: Main session switches model; sub-agents get isolated context with specific model
- •Fallback Chain: If model unavailable → fallback to next best option
Architecture
code
User Input
↓
Intent Classifier (keyword-based, lightweight)
↓
├─→ Code detected → Spawn Codex agent OR switch to Codex
├─→ Research detected → Switch to Kimi (long context)
├─→ Complex detected → Switch to Opus (reasoning)
└─→ Default → Stay on current model
↓
Cost Check (if high-cost model requested)
↓
Execute with selected model
Advanced: Multi-Agent Workflows
javascript
// Parallel execution with different models
const results = await Promise.all([
sessions_spawn({ task: "Code review", model: "codex" }),
sessions_spawn({ task: "Security check", model: "opus" }),
sessions_spawn({ task: "Documentation", model: "kimi" })
]);
Integration (Option A: Pre-processor)
The agent detects prefixes and switches models using session_status().
Pattern:
javascript
// 1. Check for explicit prefix
if (message.startsWith('/code')) {
session_status({ model: 'codex' });
}
if (message.startsWith('/deep')) {
session_status({ model: 'opus' });
}
if (message.startsWith('/research')) {
session_status({ model: 'kimi' });
}
// 2. Process task with selected model
// ...
For coding tasks (spawn sub-agent):
javascript
sessions_spawn({
task: "Review this code for bugs",
model: "codex"
});
Key principle: Agent-controlled, no magic. User sees what model runs.
Precedence Rules
When multiple keywords match, the router uses this priority:
code
1. Explicit prefix (/code, /deep, /research) → ALWAYS WINS 2. opus (architecture, complex reasoning) 3. codex (code, debugging) 4. kimi (research, general) 5. default → kimi
Example: "design the code architecture" → opus (not codex, because opus is higher priority)
Why Not Auto-Detect Everything?
Explicit routing > magic:
- •Predictable costs — you know which model runs
- •Consistent behavior — no surprise switches mid-conversation
- •User control — override anytime with
/model - •Simple — no LLM-based classification overhead
Future Ideas
- • Usage analytics dashboard (
/router stats) - • Smart fallback when rate-limited
- • "Best of 3" ensemble (run task on 3 models, vote on best)
- • Automatic escalation (cheap model fails → retry with better model)