Agentic Ecosystem Incremental Update
Safe deployment strategy for updating components in the agentic ecosystem without breaking existing configurations or services.
When to Use This Skill
- After initial deployment using agentic-ecosystem-remote-deployment
- Updating code without changing environment-specific configs
- Restarting services after crashes or changes
- Verifying deployments with optional browser testing
Universal Deployment Patterns
1. Pre-Deployment Safety Checklist
Before making any changes:
    # Check current state
    ssh -p <PORT> <HOST> "cd ~/swe/<component> && git status --short && echo '---' && git log --oneline -3"

    # Check running services
    ssh -p <PORT> <HOST> "lsof -i :<PORT1> -i :<PORT2> | grep LISTEN"

    # Back up .env
    ssh -p <PORT> <HOST> "cd ~/swe/<component> && cp .env .env.backup-\$(date +%Y%m%d)"

    # List .env backup history
    ssh -p <PORT> <HOST> "ls -lt ~/swe/<component>/.env* | head -5"
Critical: Identify remote-specific configs that MUST NOT be overwritten
2. Critical .env Variables Protection Map
Universal pattern: These variables differ per remote and MUST be preserved:
| Variable Category | Examples | Why Critical |
|---|---|---|
| Network binding | SHIM_HOST, CODEX_SHIM_HOST | Differs per remote (cm3u: 192.168.1.9, cm2: 127.0.0.1) |
| Port assignments | SHIM_PORT, CODEX_SHIM_PORT, AGENT_HQ_UI_PORT | May differ if multiple instances |
| Working directories | SHIM_DEFAULT_CWD | May point to different paths |
| Remote paths | CLAUDE_CODE_CLI, CODEX_CLI | Symlink targets may differ |
Protection strategy:
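- Before deployment: save the critical variables from the remote .env (the CLI paths from the table above are included):

      grep -E '^(SHIM_HOST|CODEX_SHIM_HOST|.*_PORT|.*_CWD|CLAUDE_CODE_CLI|CODEX_CLI)' .env > /tmp/critical-vars.txt

- After code update: Restore critical vars from backup
- Never deploy .env from local to remote without inspection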
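The "restore critical vars" step can be sketched as a small merge helper. This is only an illustration under assumptions: `restore_critical_vars` is a hypothetical function name, and it assumes both files use plain `KEY=VALUE` lines with no `export` prefix or spaces around `=`.

```shell
# Sketch (hypothetical helper): merge saved critical variables back into an updated .env.
# Usage: restore_critical_vars /tmp/critical-vars.txt .env
restore_critical_vars() {
  local saved="$1" env_file="$2" tmp
  # Nothing saved means nothing to restore; leave .env untouched.
  [ -s "$saved" ] || return 0
  tmp="$(mktemp)" || return 1
  # Pass 1 (saved file): remember each KEY=VALUE line by its key.
  # Pass 2 (.env): replace lines whose key was saved, print the rest unchanged.
  # END: append saved keys that the updated .env no longer contains.
  awk '
    function key_of(line) { return substr(line, 1, index(line, "=") - 1) }
    NR == FNR {
      if (index($0, "=") > 1) {
        k = key_of($0)
        if (!(k in saved)) order[++n] = k
        saved[k] = $0
      }
      next
    }
    {
      k = key_of($0)
      if (index($0, "=") > 1 && (k in saved)) { print saved[k]; seen[k] = 1 }
      else print
    }
    END { for (i = 1; i <= n; i++) if (!(order[i] in seen)) print saved[order[i]] }
  ' "$saved" "$env_file" > "$tmp" && mv "$tmp" "$env_file"
}
```

Comments and unrelated variables in .env pass through unchanged, so the result can still be diffed against the backup before restarting services.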
3. Safe Service Restart Checklist
Universal restart pattern:
    # 1. Kill old process cleanly
    ssh <HOST> "pkill -f 'python.*<component>.*server.py'"

    # 2. Wait for clean shutdown
    sleep 2

    # 3. Export PATH (CRITICAL for node access)
    # 4. Use Anaconda Python (ALWAYS)
    # 5. Start with nohup for persistence
    ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
      cd ~/swe/<component> && \
      nohup ~/anaconda3/bin/python src/<service>/server.py > /tmp/<service>.log 2>&1 &"

    # 6. Wait for startup
    sleep 3

    # 7. Verify port is listening
    ssh <HOST> "lsof -i :<PORT> | grep LISTEN"
Critical requirements:
- ✅ Always export PATH="/opt/homebrew/bin:$PATH" (for node)
- ✅ Always use ~/anaconda3/bin/python (not system python)
- ✅ Pattern: pkill → sleep → export PATH → start with nohup → verify
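A fixed sleep before the verify step can be too short on a slow start. One possible refinement, sketched here as an assumption rather than part of the established pattern, is to poll the check a few times. `wait_until` is a hypothetical helper name.

```shell
# Sketch (hypothetical helper): retry a check until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the check command exits 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example with the same <HOST> and <PORT> placeholders used in this section:
#   wait_until 10 1 ssh <HOST> "lsof -i :<PORT> | grep -q LISTEN" || echo "service did not start"
```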
4. Incremental Deployment Strategy
Pull changes without overwriting configs:
    # 1. Fetch remote changes (don't merge yet)
    ssh <HOST> "cd ~/swe/<component> && git fetch origin"

    # 2. See what will change
    ssh <HOST> "cd ~/swe/<component> && git diff HEAD origin/main --stat"

    # 3. Stash local .env changes
    ssh <HOST> "cd ~/swe/<component> && git stash push .env"

    # 4. Pull code changes
    ssh <HOST> "cd ~/swe/<component> && git pull origin main"

    # 5. Restore .env from backup (don't use stash - may have merge conflicts)
    ssh <HOST> "cd ~/swe/<component> && cp .env.backup-YYYYMMDD .env"

    # 6. Restart services (see section 3)
Alternative: Selective file updates without git:
- Use rsync with --exclude='.env' to update code only
- Preserves all local configs
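A minimal sketch of the rsync alternative follows. `sync_code` is a hypothetical wrapper, and the extra `--exclude='.env.*'` (to skip backup files such as .env.backup-YYYYMMDD) is an assumption added here; only `--exclude='.env'` comes from the list above.

```shell
# Sketch (hypothetical helper): copy code to a target directory without touching .env files.
# Usage: sync_code <source_dir> <dest_dir_or_remote_path>
sync_code() {
  local src="${1%/}" dest="${2%/}"
  # -a preserves permissions and timestamps; trailing slashes copy directory contents.
  # No --delete, so files that exist only on the destination are left alone.
  rsync -a --exclude='.env' --exclude='.env.*' "$src/" "$dest/"
}

# Example with this document's placeholders (add -e "ssh -p <PORT>" to rsync if a custom SSH port is needed):
#   sync_code ~/swe/<component> <HOST>:~/swe/<component>
```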
5. Post-Deployment Verification Protocol
Verify deployment succeeded:
    # 1. Check all expected ports
    ssh <HOST> "lsof -i :8787 -i :9288 -i :8037 | grep LISTEN"

    # 2. Verify processes are using Anaconda Python
    ssh <HOST> "ps aux | grep 'anaconda3.*python.*server.py' | grep -v grep"

    # 3. Check logs for startup errors
    ssh <HOST> "head -20 /tmp/claude-shim.log /tmp/codex-shim.log"

    # 4. Test each service with curl
    ssh <HOST> "curl -sS http://127.0.0.1:8787/ | head -3"
    ssh <HOST> "curl -sS http://127.0.0.1:9288/ | head -3"

    # 5. Document results
    echo "Deployment verified at $(date)" >> deployment_logs/$(date +%Y_%m_%d).md
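Reading the first 20 log lines by eye works, but the log check can also be automated. The sketch below is an assumption about what counts as a startup error: the keyword list is illustrative, `log_has_errors` is a hypothetical helper, and a harmless line such as "0 errors" would also match.

```shell
# Sketch (hypothetical helper): exit 0 if a log file contains a likely error line.
# Usage: log_has_errors /tmp/claude-shim.log && echo "check the log"
log_has_errors() {
  # -q: quiet, stop at first match; -i: ignore case; -E: extended regex alternation.
  grep -qiE 'traceback|error|exception|errno' "$1"
}

# Remote equivalent using the log paths from this section:
#   ssh <HOST> "grep -iE 'traceback|error|exception|errno' /tmp/claude-shim.log /tmp/codex-shim.log"
```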
6. Interactive Tunnel Setup (Optional)
Ask user first:
"Do you want to create SSH tunnels to test the deployed services?"
If yes:
    # Use +20000 port offset to avoid local port conflicts
    cd ~/swe/vscode-shims && \
      python launchers/launch_ssh_tunnel_to_m2_tmux.py --verbose \
      --map 28787:8787,29288:9288,28037:8037
Port mapping convention:
- Remote service ports: 8787, 9288, 8037
- Local tunnel ports (remote port + 20000): 28787, 29288, 28037
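The value passed to --map follows a simple rule: each local port is the remote port plus 20000. A small sketch that builds the string (the helper name `tunnel_map` is hypothetical):

```shell
# Sketch (hypothetical helper): build a local:remote map string using the +20000 offset.
tunnel_map() {
  local offset=20000 map="" port
  for port in "$@"; do
    # ${map:+$map,} inserts a comma only after the first entry.
    map="${map:+$map,}$((port + offset)):$port"
  done
  printf '%s\n' "$map"
}

tunnel_map 8787 9288 8037   # prints 28787:8787,29288:9288,28037:8037
```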
Verify tunnel:
    lsof -nP -iTCP:28787 -sTCP:LISTEN
    curl -sS http://127.0.0.1:28787/ | head -3
7. Playwright Browser Testing (Optional)
Ask user first:
"Do you want me to test the services using Playwright?"
If yes:
    # Visit each tunneled service
    playwright navigate http://localhost:28787   # Claude shim
    playwright navigate http://localhost:29288   # Codex shim
    playwright navigate http://localhost:28037   # Agent HQ
Check for errors:
- Red error banners
- Error text in UI
- Console errors
- Take screenshot if errors found
Report format:
- ✅ "Claude shim accessible, no errors"
- ❌ "Codex shim error: [Errno 2] No such file or directory: 'node'"
8. Rollback Strategy
Before deployment, note:
    ssh <HOST> "ls -ld ~/swe/<component>-old-*"
    # /Users/m2/swe/vscode-shims-old-20260126
Emergency rollback:
    # 1. Kill current services
    ssh <HOST> "pkill -f 'python.*<component>.*server.py'"

    # 2. Restore old version
    ssh <HOST> "cd ~/swe && mv <component> <component>-broken-\$(date +%Y%m%d) && \
      mv <component>-old-YYYYMMDD <component>"

    # 3. Restart services
    ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
      cd ~/swe/<component> && \
      nohup ~/anaconda3/bin/python src/<service>/server.py > /tmp/<service>.log 2>&1 &"
Keep at least 2 previous versions on remote for safety.
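To see which old copies fall outside that retention window, a listing helper along these lines may help. It is a sketch under assumptions: `list_prunable` is a hypothetical name, it relies on the `<component>-old-YYYYMMDD` naming shown above (so a reverse name sort is newest first), and it only prints candidates without deleting anything.

```shell
# Sketch (hypothetical helper): list old version directories beyond the newest <keep>.
# Usage: list_prunable <parent_dir> <component> [keep]   (keep defaults to 2)
list_prunable() {
  local parent="$1" component="$2" keep="${3:-2}"
  # YYYYMMDD suffixes sort lexically in date order, so sort -r puts the newest first;
  # tail -n +N then skips the first <keep> entries and prints the remainder.
  ls -1d "$parent/$component"-old-* 2>/dev/null | sort -r | tail -n +"$((keep + 1))"
}
```

Review the output before removing anything by hand.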
9. Common Failure Patterns & Quick Fixes
"Failed to spawn CLI: [Errno 2] No such file or directory: 'node'"
Symptom: Error appears in webview UI, not SSH output
Root Cause: /opt/homebrew/bin not in PATH when Python process starts
Fix: Restart with PATH export (see section 3)
Prevention: Always include export PATH="/opt/homebrew/bin:$PATH"
"Address already in use"
Symptom: Port binding fails during restart
Root Cause: Old process not killed, or another service using port
Fix:
    lsof -i :<PORT> | grep LISTEN
    kill -9 <PID>
Prevention: Use pkill before starting new service
"Module not found" or "Import error"
Symptom: Python import failures in logs
Root Cause: Using system Python instead of Anaconda
Fix: Restart with ~/anaconda3/bin/python
Prevention: Always use full Anaconda Python path
"Permission denied"
Symptom: Cannot execute files after transfer
Root Cause: File permissions not preserved
Fix:
    ssh <HOST> "chmod +x ~/swe/<component>/launchers/*.sh"
Prevention: Use rsync -a to preserve permissions
Component-Specific Configurations
vscode-shims
Critical .env variables:
- SHIM_HOST / CODEX_SHIM_HOST (network binding)
- SHIM_PORT=8787 / CODEX_SHIM_PORT=9288
- SHIM_DEFAULT_CWD (working directory)
- CLAUDE_CODE_CLI / CODEX_CLI (CLI paths)
Restart both services:
    # ';' follows pkill so the start still runs when no old process exists
    # (pkill exits non-zero when nothing matches, e.g. after a crash)

    # Claude shim
    ssh <HOST> "pkill -f 'python.*claude.*server.py'; sleep 2; \
      export PATH=\"/opt/homebrew/bin:\$PATH\" && \
      cd ~/swe/vscode-shims && \
      nohup ~/anaconda3/bin/python src/claude/server.py > /tmp/claude-shim.log 2>&1 &"

    # Codex shim
    ssh <HOST> "pkill -f 'python.*codex.*server.py'; sleep 2; \
      export PATH=\"/opt/homebrew/bin:\$PATH\" && \
      cd ~/swe/vscode-shims && \
      nohup ~/anaconda3/bin/python src/codex/server.py > /tmp/codex-shim.log 2>&1 &"
Verify:
ssh <HOST> "lsof -i :8787 -i :9288 | grep LISTEN"
Unique requirements:
- Node.js in PATH (for spawning Claude CLI)
- Anaconda Python 3.10+ (for the | union type syntax in codex/server.py)
Agent HQ (Future)
Critical .env variables:
- AMS_TMUX_PORT or AGENT_MGMT_PORT
- AGENT_HQ_UI_PORT or VITE_PORT
- Network binding settings
Restart pattern:
    # Must source .env before starting
    ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
      cd ~/AgenticProjects/agent-box-v1 && \
      set -a && source .env && set +a && \
      cd apps/agent-hq-ui && \
      npm run dev:web -- --port 8037 --host 127.0.0.1"
Unique requirements:
- npm dependencies installed
- NEVER run Electron remotely (only web version)
- Must source .env before starting vite
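The `set -a && source .env && set +a` idiom matters because plain sourcing sets shell variables without exporting them, so child processes such as npm or vite never see them. A small sketch of the difference (`run_with_env` is a hypothetical helper; the .env content is made up for the demo):

```shell
# Sketch (hypothetical helper): run a command with every variable from an env file exported.
# Usage: run_with_env <env_file> <command...>
run_with_env() {
  local env_file="$1"
  shift
  # A bare name like ".env" may be looked up via PATH by ".", so force a relative path.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac
  # Subshell keeps the caller's environment unchanged.
  # set -a marks every variable assigned while sourcing for export; set +a turns that off.
  ( set -a; . "$env_file"; set +a; "$@" )
}

demo_dir="$(mktemp -d)"
printf 'AGENT_HQ_UI_PORT=8037\n' > "$demo_dir/.env"

# Without set -a, the child process does not inherit the variable.
( unset AGENT_HQ_UI_PORT; . "$demo_dir/.env"; sh -c 'echo "plain source: ${AGENT_HQ_UI_PORT:-unset}"' )

# With set -a, the child process sees it.
run_with_env "$demo_dir/.env" sh -c 'echo "set -a source: ${AGENT_HQ_UI_PORT:-unset}"'
```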
Other Components
As ecosystem grows, add component-specific sections here:
- Telemetry projects (SQLite ingestors)
- Custom launchers
- Background services
Quick Reference: Remote Registry
| Remote | SHIM_HOST | Access Method | Notes |
|---|---|---|---|
| cm3u | 192.168.1.9 | Direct LAN | Mac Studio, direct access |
| cm2 | 127.0.0.1 | SSH tunnel | MacBook Pro, tunnel required |
Tunnel port mapping:
- cm2:8787 → localhost:28787
- cm2:9288 → localhost:29288
- cm2:8037 → localhost:28037
Workflow Summary
- Safety check → backup .env, check running services
- Incremental update → fetch, diff, pull (or rsync code only)
- Preserve configs → restore .env from backup
- Safe restart → pkill, export PATH, Anaconda Python, nohup
- Verify → ports listening, processes correct, logs clean, curl test
- Optional: Tunnel → ask user, launch with +20000 offset
- Optional: Test → ask user, Playwright visit mapped ports
- Document → update deployment_logs
Created by: Claude [e8fa7e09-6d5f-40f1-89df-3afe03f29ca1]
Date: 2026-01-26
Pairs with: agentic-ecosystem-remote-deployment