Model Fallback Skill
Automatically switch to backup AI models when your primary model fails, times out, or degrades in performance.
Overview
This skill provides a robust fallback system for nanobot that:
- Proactively monitors model health (response times, error rates)
- Automatically switches to backup models when issues are detected
- Reacts to failures and triggers model changes
- Logs all events for analysis and debugging
- Works with any model configured in your nanobot config
Features
| Feature | Description |
|---|---|
| Health Check Monitor | Runs periodically to test model performance |
| Automatic Fallback | Switches to the next model in the chain when thresholds are exceeded |
| Reactive Trigger | Handles actual failures and crashes |
| Configurable Thresholds | Customize response time, error rate, and timeout limits |
| Cascading Fallback | Multiple backup models for maximum reliability |
| Comprehensive Logging | Track all health checks and fallback events |
Installation
Quick Install
cd ~/.nanobot/workspace/skills/model-fallback
./install.sh
Manual Install
- Copy the skill to your workspace:
cp -r model-fallback ~/.nanobot/workspace/skills/
- Make scripts executable:
chmod +x ~/.nanobot/workspace/skills/model-fallback/scripts/*.py
chmod +x ~/.nanobot/workspace/skills/model-fallback/install.sh
- Configure your nanobot config. Add fallback settings to ~/.nanobot/config.json:
{
  "agents": {
    "defaults": {
      "model": "your-primary-model",
      "fallbackModels": [
        "backup-model-1",
        "backup-model-2",
        "backup-model-3"
      ],
      "fallback": {
        "enabled": true,
        "healthCheckInterval": 300,
        "maxResponseTime": 30,
        "maxTimeoutRate": 0.2,
        "maxErrorRate": 0.1
      }
    }
  }
}
- Start the health check monitor:
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/health-check.py start
Configuration
Fallback Chain
Configure backup models in priority order. The first entry is the first fallback, the second entry is the second fallback, and so on:
{
  "fallbackModels": [
    "openrouter/anthropic/claude-3.5-sonnet",
    "openrouter/glm-4.7",
    "openrouter/google/gemini-2.0-flash-exp"
  ]
}
Health Check Settings
| Setting | Default | Description |
|---|---|---|
| healthCheckInterval | 300 (5 min) | How often to check model health (seconds) |
| maxResponseTime | 30 | Maximum acceptable response time (seconds) |
| maxTimeoutRate | 0.2 (20%) | Maximum timeout rate before switching |
| maxErrorRate | 0.1 (10%) | Maximum error rate before switching |
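A minimal sketch of how these defaults might be combined with a user's config, assuming the `agents.defaults.fallback` layout shown in this README. `DEFAULTS` and `load_fallback_settings` are hypothetical names used for illustration; they are not taken from the skill's scripts.

```python
import json

# Defaults copied from the settings table above.
DEFAULTS = {
    "healthCheckInterval": 300,  # seconds
    "maxResponseTime": 30,       # seconds
    "maxTimeoutRate": 0.2,       # fraction of checks that timed out
    "maxErrorRate": 0.1,         # fraction of checks that errored
}


def load_fallback_settings(config_text: str) -> dict:
    """Overlay the user's fallback settings on the documented defaults."""
    config = json.loads(config_text)
    user = config.get("agents", {}).get("defaults", {}).get("fallback", {})
    return {**DEFAULTS, **user}


example = '{"agents": {"defaults": {"fallback": {"maxResponseTime": 60}}}}'
settings = load_fallback_settings(example)
print(settings["maxResponseTime"], settings["healthCheckInterval"])  # 60 300
```

Any setting missing from the config keeps its default value, so a partial `fallback` object is enough to override a single threshold.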
Example Config
{
"agents": {
"defaults": {
"model": "minimax/m2.5",
"fallbackModels": [
"openrouter/anthropic/claude-3.5-sonnet",
"openrouter/glm-4.7",
"openrouter/google/gemini-2.0-flash-exp:free"
],
"fallback": {
"enabled": true,
"healthCheckInterval": 300,
"maxResponseTime": 30,
"maxTimeoutRate": 0.2,
"maxErrorRate": 0.1
}
}
},
"providers": {
"minimax": {
"apiKey": "YOUR_MINIMAX_API_KEY"
},
"openrouter": {
"apiKey": "YOUR_OPENROUTER_API_KEY"
}
}
}
Usage
Start Health Check Monitor
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/health-check.py start
Stop Health Check Monitor
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/health-check.py stop
Check Fallback Status
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/fallback-trigger.py status
Manually Trigger Fallback
# Test fallback with a specific reason
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/fallback-trigger.py trigger --reason "Manual test"
# Trigger fallback without reason
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/fallback-trigger.py trigger
View Logs
# Health check logs
tail -f ~/.nanobot/logs/model-health.log
# Fallback event logs
tail -f ~/.nanobot/logs/model-fallback.log
# Combined logs
tail -f ~/.nanobot/logs/model-*.log
How It Works
Proactive Monitoring (Health Check)
Every healthCheckInterval seconds (300 by default, i.e. every 5 minutes):
1. Send a test prompt to the current model
2. Measure the response time
3. Check for errors and timeouts
4. Calculate error and timeout rates
5. If thresholds are exceeded:
   - Log a warning
   - Trigger fallback to the next model
   - Restart nanobot if needed
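The threshold check in steps 4 and 5 can be sketched as follows. This is an illustration under assumptions, not the code in health-check.py: it assumes rates are computed over a window of recent checks and that the response-time limit applies to the most recent check. `Sample` and `threshold_violations` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """Result of one health check (hypothetical structure)."""
    response_time: float  # seconds
    error: bool = False
    timed_out: bool = False


def threshold_violations(
    samples: List[Sample],
    max_response_time: float = 30.0,
    max_timeout_rate: float = 0.2,
    max_error_rate: float = 0.1,
) -> List[str]:
    """Return the thresholds the window exceeds; an empty list means healthy."""
    if not samples:
        return []
    total = len(samples)
    error_rate = sum(s.error for s in samples) / total
    timeout_rate = sum(s.timed_out for s in samples) / total

    reasons = []
    if samples[-1].response_time > max_response_time:
        reasons.append("response time")
    if timeout_rate > max_timeout_rate:
        reasons.append("timeout rate")
    if error_rate > max_error_rate:
        reasons.append("error rate")
    return reasons


# Ten checks, two of which errored: a 20% error rate against a 10% limit.
window = [Sample(1.2)] * 8 + [Sample(2.0, error=True)] * 2
print(threshold_violations(window))  # ['error rate']
```

A non-empty result is the point at which the monitor would log a warning and trigger the fallback.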
Reactive Fallback
When a failure is detected:
1. Identify the current model
2. Select the next model in the fallback chain
3. Update the config with the new model
4. Log the switch
5. Trigger a nanobot restart (via wrapper)
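Steps 1 to 3 above can be sketched like this, assuming the config layout shown in this README. `next_model` and `apply_fallback` are hypothetical helpers, and what happens once the chain is exhausted is an assumption (here the model is left unchanged); the skill's own behavior in that case is not documented.

```python
from typing import List, Optional


def next_model(current: str, chain: List[str]) -> Optional[str]:
    """Pick the entry after `current` in the chain.

    If `current` is not in the chain (for example, it is the primary
    model), the first fallback is returned. None means the chain is
    exhausted.
    """
    index = chain.index(current) + 1 if current in chain else 0
    return chain[index] if index < len(chain) else None


def apply_fallback(config: dict) -> Optional[str]:
    """Update the in-memory config to the next model and return its name."""
    defaults = config["agents"]["defaults"]
    candidate = next_model(defaults["model"], defaults.get("fallbackModels", []))
    if candidate is not None:
        defaults["model"] = candidate
    return candidate


config = {
    "agents": {
        "defaults": {
            "model": "minimax/m2.5",
            "fallbackModels": [
                "openrouter/anthropic/claude-3.5-sonnet",
                "openrouter/glm-4.7",
            ],
        }
    }
}
print(apply_fallback(config))  # openrouter/anthropic/claude-3.5-sonnet
print(apply_fallback(config))  # openrouter/glm-4.7
print(apply_fallback(config))  # None
```

The sketch only changes the dictionary in memory; writing it back to ~/.nanobot/config.json and logging the switch would follow as separate steps.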
Restart Integration
This skill works best with the nanobot wrapper script for automatic restarts:
# Wrapper script monitors for restart trigger
~/nanobot-wrapper.sh
# Fallback script creates trigger file
touch /tmp/nanobot-restart
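The trigger-file handshake can be illustrated in Python. The wrapper's actual logic is not shown in this README, so this sketch assumes the watcher deletes the file once it has noticed it, so that one `touch` causes one restart. `request_restart` and `consume_restart` are hypothetical names, and the example uses a temporary directory instead of /tmp/nanobot-restart.

```python
import tempfile
from pathlib import Path


def request_restart(trigger: Path) -> None:
    """Fallback side: create the trigger file (equivalent to `touch`)."""
    trigger.touch()


def consume_restart(trigger: Path) -> bool:
    """Watcher side: return True once per request and remove the file."""
    try:
        trigger.unlink()
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    trigger = Path(tmp) / "nanobot-restart"
    request_restart(trigger)
    print(consume_restart(trigger))  # True: a restart was requested
    print(consume_restart(trigger))  # False: the request was already handled
```

Removing the file inside the check, rather than testing for existence first, avoids a gap between noticing the trigger and clearing it.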
Supported Models
This skill works with any model configured in your nanobot:
Minimax
- minimax/m2.5
- minimax/m1
- Other Minimax models
OpenRouter
- openrouter/anthropic/claude-3.5-sonnet
- openrouter/google/gemini-2.0-flash-exp
- openrouter/glm-4.7
- Any OpenRouter-hosted model
Other Providers
Add your provider to config.json and use the model identifier in your fallback chain.
Troubleshooting
Health Check Not Running
# Check if process is running
ps aux | grep health-check
# Restart manually
python3 ~/.nanobot/workspace/skills/model-fallback/scripts/health-check.py start
Fallback Not Triggering
- Check that the config has fallback.enabled: true
- Verify fallback models are listed
- Check logs for errors:
tail -50 ~/.nanobot/logs/model-fallback.log
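The first two checks can be automated with a short script. This is a sketch that assumes the config layout shown in this README; `fallback_problems` is a hypothetical helper, and the 0 to 1 range check assumes the rates are fractions, as the defaults 0.2 (20%) and 0.1 (10%) suggest.

```python
from typing import List


def fallback_problems(config: dict) -> List[str]:
    """Return human-readable problems that would stop fallback from working."""
    problems = []
    defaults = config.get("agents", {}).get("defaults", {})
    fallback = defaults.get("fallback", {})

    if fallback.get("enabled") is not True:
        problems.append("fallback.enabled is not true")
    if not defaults.get("fallbackModels"):
        problems.append("fallbackModels is missing or empty")
    for key in ("maxTimeoutRate", "maxErrorRate"):
        value = fallback.get(key)
        if value is not None and not 0 <= value <= 1:
            problems.append(f"{key} should be between 0 and 1")
    return problems


broken = {"agents": {"defaults": {"fallback": {"enabled": False}}}}
for problem in fallback_problems(broken):
    print(problem)
# fallback.enabled is not true
# fallbackModels is missing or empty
```

To run it against a real config, load ~/.nanobot/config.json with `json.load` and pass the resulting dictionary to the function.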
Model Not Switching
- Ensure nanobot is running with the wrapper script
- Check that the restart trigger exists:
ls -la /tmp/nanobot-restart
- Verify the config file is writable:
ls -la ~/.nanobot/config.json
API Key Issues
Make sure your provider API keys are configured in ~/.nanobot/config.json:
{
"providers": {
"minimax": {
"apiKey": "your-key-here"
},
"openrouter": {
"apiKey": "your-key-here"
}
}
}
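A quick way to spot a missing key is to compare the models in use against the `providers` section. The sketch below assumes that the provider name is the first path segment of a model identifier (as in `minimax/m2.5` and `openrouter/glm-4.7`); that mapping is inferred from the examples in this README, not confirmed nanobot behavior. `missing_api_keys` is a hypothetical helper.

```python
from typing import List


def missing_api_keys(config: dict) -> List[str]:
    """List providers used by the primary or fallback models that lack an apiKey."""
    defaults = config.get("agents", {}).get("defaults", {})
    models = [defaults.get("model"), *defaults.get("fallbackModels", [])]
    providers = config.get("providers", {})

    missing = []
    for model in models:
        if not model:
            continue
        provider = model.split("/", 1)[0]  # assumed: prefix names the provider
        has_key = bool(providers.get(provider, {}).get("apiKey"))
        if not has_key and provider not in missing:
            missing.append(provider)
    return missing


config = {
    "agents": {
        "defaults": {
            "model": "minimax/m2.5",
            "fallbackModels": ["openrouter/glm-4.7"],
        }
    },
    "providers": {"minimax": {"apiKey": "your-key-here"}},
}
print(missing_api_keys(config))  # ['openrouter']
```

The check only confirms that a key is present; it cannot tell whether the provider will accept it.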
Logs
Health Check Log
Location: ~/.nanobot/logs/model-health.log
Contains:
- Health check timestamps
- Response times
- Error/timeout rates
- Model performance metrics
Fallback Log
Location: ~/.nanobot/logs/model-fallback.log
Contains:
- Fallback trigger events
- Model switches
- Reasons for switching
- Config updates
Advanced Usage
Custom Health Check Prompt
Edit health-check.py to customize the test prompt:
TEST_PROMPT = "Respond with 'OK' if you're working."
Adjusting Sensitivity
For less sensitive monitoring (fewer switches):
{
"maxResponseTime": 60,
"maxTimeoutRate": 0.5,
"maxErrorRate": 0.3
}
For more sensitive monitoring (faster switches):
{
"maxResponseTime": 15,
"maxTimeoutRate": 0.1,
"maxErrorRate": 0.05
}
Multiple Instances
Run health checks for different models:
# Check primary model
python3 health-check.py --model "minimax/m2.5"
# Check fallback model
python3 health-check.py --model "openrouter/anthropic/claude-3.5-sonnet"
Contributing
To improve this skill:
- Test with different model providers
- Add new health check metrics
- Improve error handling
- Add more configuration options
- Share your configurations!
License
This skill is provided as-is for use with nanobot. Feel free to modify and distribute.
Support
For issues or questions:
- Check logs in ~/.nanobot/logs/model-*.log
- Verify your config is valid JSON
- Ensure API keys are correct
- Check that nanobot is running with the wrapper script
Changelog
v1.0.0 (2025-02-12)
- Initial release
- Health check monitoring
- Automatic fallback
- Reactive trigger
- Comprehensive logging
- Installation script