Agentic Ecosystem Incremental Update

Safe deployment strategy for updating components in the agentic ecosystem without breaking existing configurations or services.

When to Use This Skill

•After initial deployment using agentic-ecosystem-remote-deployment
•Updating code without changing environment-specific configs
•Restarting services after crashes or changes
•Verifying deployments with optional browser testing

Universal Deployment Patterns

1. Pre-Deployment Safety Checklist

Before making any changes:

bash

# Check current state
ssh -p <PORT> <HOST> "cd ~/swe/<component> && git status --short && echo '---' && git log --oneline -3"

# Check running services
ssh -p <PORT> <HOST> "lsof -i :<PORT1> -i :<PORT2> | grep LISTEN"

# Backup .env
ssh -p <PORT> <HOST> "cd ~/swe/<component> && cp .env .env.backup-\$(date +%Y%m%d)"

# List .env backup history
ssh -p <PORT> <HOST> "ls -lt ~/swe/<component>/.env* | head -5"

Critical: Identify remote-specific configs that MUST NOT be overwritten
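
The date-only backup name above is reused on a second deployment the same day, so a later run can overwrite the first (good) backup. A minimal sketch of a helper that avoids this, meant to run on the remote; the name `backup_env` and the time-suffix scheme are assumptions, not part of the existing tooling:

```bash
# Hypothetical helper: back up an env file without clobbering an earlier
# backup taken on the same day. Prints the path of the backup it created.
backup_env() {
  local env_file="$1"
  local backup="${env_file}.backup-$(date +%Y%m%d)"
  # A same-day backup already exists: add a time suffix instead of overwriting it.
  if [ -e "$backup" ]; then
    backup="${backup}-$(date +%H%M%S)"
  fi
  # -p keeps the original mode and timestamps on the copy.
  cp -p "$env_file" "$backup" && printf '%s\n' "$backup"
}
```

Usage would look like `backup_env ~/swe/<component>/.env`; the printed path is the file to restore from later.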

2. Critical .env Variables Protection Map

Universal pattern: These variables differ per remote and MUST be preserved:

Variable Category	Examples	Why Critical
Network binding	`SHIM_HOST`, `CODEX_SHIM_HOST`	cm3u=192.168.1.9, cm2=127.0.0.1
Port assignments	`SHIM_PORT`, `CODEX_SHIM_PORT`, `AGENT_HQ_UI_PORT`	May differ if multiple instances
Working directories	`SHIM_DEFAULT_CWD`	May point to different paths
Remote paths	`CLAUDE_CODE_CLI`, `CODEX_CLI`	Symlink targets may differ

Protection strategy:

•Before deployment: grep -E '^(SHIM_HOST|CODEX_SHIM_HOST|.*_PORT|.*_CWD|.*_CLI)' .env > /tmp/critical-vars.txt
•After code update: Restore the saved values into .env (from /tmp/critical-vars.txt or the .env backup)
•Never deploy .env from local to remote without inspection
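
The restore step can be sketched as a small function that writes the saved lines back into the updated .env. This is an illustration only: `restore_critical_vars` is a hypothetical name, and it assumes plain `KEY=VALUE` lines (no `export` prefix) in both files.

```bash
# Hypothetical helper: copy every KEY=VALUE line from a saved snapshot
# (e.g. /tmp/critical-vars.txt) back into an env file, replacing the
# existing assignment for KEY or adding it when KEY is missing.
restore_critical_vars() {
  local saved="$1" env_file="$2"
  local tmp="${env_file}.tmp.$$"
  # Nothing saved means nothing to restore; leave the env file alone.
  [ -s "$saved" ] || return 0
  # First pass (saved file) records the keys to restore; second pass (env file)
  # drops lines that assign one of those keys. Saved lines are appended after.
  awk -F= 'NR == FNR { keep[$1] = 1; next } !($1 in keep)' "$saved" "$env_file" > "$tmp" &&
    cat "$saved" >> "$tmp" &&
    cat "$tmp" > "$env_file" &&
    rm -f "$tmp"
}
```

Restored variables end up at the bottom of the file; comments and unrelated variables keep their position. The file is rewritten in place so its permissions are unchanged.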

3. Safe Service Restart Checklist

Universal restart pattern:

bash

# 1. Kill old process cleanly (match on the service name, which is what appears in the command line)
ssh <HOST> "pkill -f 'python.*<service>.*server.py'"

# 2. Wait for clean shutdown
sleep 2

# 3. Export PATH (CRITICAL for node access)
# 4. Use Anaconda Python (ALWAYS)
# 5. Start with nohup for persistence
ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
  cd ~/swe/<component> && \
  nohup ~/anaconda3/bin/python src/<service>/server.py > /tmp/<service>.log 2>&1 &"

# 6. Wait for startup
sleep 3

# 7. Verify port is listening
ssh <HOST> "lsof -i :<PORT> | grep LISTEN"

Critical requirements:

•✅ Always export PATH="/opt/homebrew/bin:$PATH" (for node)
•✅ Always use ~/anaconda3/bin/python (not system python)
•✅ Pattern: pkill → sleep → export PATH → start with nohup → verify
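
The fixed `sleep 3` before verification can be too short on a slow start. One option is to poll the check instead. The sketch below is an assumption about how that could look; `wait_until` is a hypothetical helper, not an existing command.

```bash
# Hypothetical helper: run a check command up to <tries> times, sleeping
# <delay> seconds between attempts. Returns 0 as soon as the check passes,
# 1 if every attempt failed.
wait_until() {
  local tries="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    # Only sleep when another attempt follows.
    if [ "$attempt" -le "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage sketch: poll for up to about 10 seconds instead of a fixed sleep.
# wait_until 10 1 ssh <HOST> "lsof -i :<PORT> | grep -q LISTEN"
```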

4. Incremental Deployment Strategy

Pull changes without overwriting configs:

bash

# 1. Fetch remote changes (don't merge yet)
ssh <HOST> "cd ~/swe/<component> && git fetch origin"

# 2. See what will change
ssh <HOST> "cd ~/swe/<component> && git diff HEAD origin/main --stat"

# 3. Stash local .env changes (only applies if .env is tracked by git)
ssh <HOST> "cd ~/swe/<component> && git stash push -- .env"

# 4. Pull code changes
ssh <HOST> "cd ~/swe/<component> && git pull origin main"

# 5. Restore .env from backup (don't use stash - may have merge conflicts); replace YYYYMMDD with the backup date
ssh <HOST> "cd ~/swe/<component> && cp .env.backup-YYYYMMDD .env"

# 6. Restart services (see section 3)

Alternative: Selective file updates without git:

•Use rsync with --exclude='.env' to update code only
•Preserves all local configs
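
A minimal sketch of the rsync alternative, assuming rsync is installed on both ends. The wrapper name `sync_code` and the extra excludes (`.env.*`, `.git/`) are assumptions added for illustration:

```bash
# Hypothetical helper: copy code from <src> to <dest> while leaving env files
# and git metadata on the destination untouched. <dest> may be a local path
# or an rsync remote such as <HOST>:swe/<component>.
sync_code() {
  local src="$1" dest="$2"
  shift 2
  # -a preserves permissions and timestamps; any extra flags (for example
  # --dry-run) are passed through. No --delete, so nothing is removed.
  rsync -a --exclude='.env' --exclude='.env.*' --exclude='.git/' "$@" "${src%/}/" "${dest%/}/"
}

# Usage sketch: preview first, then apply.
# sync_code ~/swe/<component> <HOST>:swe/<component> --dry-run --itemize-changes
# sync_code ~/swe/<component> <HOST>:swe/<component>
```

Because `--delete` is not used, files removed locally stay on the remote; that is the conservative choice for an incremental update.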

5. Post-Deployment Verification Protocol

Verify deployment succeeded:

bash

# 1. Check all expected ports
ssh <HOST> "lsof -i :8787 -i :9288 -i :8037 | grep LISTEN"

# 2. Verify processes are using Anaconda Python
ssh <HOST> "ps aux | grep 'anaconda3.*python.*server.py' | grep -v grep"

# 3. Check logs for startup errors
ssh <HOST> "head -20 /tmp/claude-shim.log /tmp/codex-shim.log"

# 4. Test each service with curl
ssh <HOST> "curl -sS http://127.0.0.1:8787/ | head -3"
ssh <HOST> "curl -sS http://127.0.0.1:9288/ | head -3"

# 5. Document results
mkdir -p deployment_logs && echo "Deployment verified at $(date)" >> deployment_logs/$(date +%Y_%m_%d).md
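
To document each check rather than only the final result, the verification commands could be wrapped so that every one leaves a pass/fail line in the dated log. A sketch under that assumption; `record_check` and the line format are hypothetical:

```bash
# Hypothetical helper: run one verification command and append a pass/fail
# line for it to a dated Markdown log under <log_dir>.
record_check() {
  local log_dir="$1" label="$2"
  shift 2
  local log_file status
  log_file="$log_dir/$(date +%Y_%m_%d).md"
  mkdir -p "$log_dir"
  if "$@" > /dev/null 2>&1; then
    status="PASS"
  else
    status="FAIL"
  fi
  printf -- '- %s %s: %s\n' "$(date +%H:%M:%S)" "$status" "$label" >> "$log_file"
  # Exit status mirrors the check so callers can stop on the first failure.
  [ "$status" = "PASS" ]
}

# Usage sketch:
# record_check deployment_logs "claude shim answers on 8787" \
#   ssh <HOST> "curl -sSf http://127.0.0.1:8787/"
```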

6. Interactive Tunnel Setup (Optional)

Ask user first:

"Do you want to create SSH tunnels to test the deployed services?"

If yes:

bash

# Use +20000 port offset to avoid local port conflicts
cd ~/swe/vscode-shims && \
  python launchers/launch_ssh_tunnel_to_m2_tmux.py --verbose \
  --map 28787:8787,29288:9288,28037:8037

Port mapping convention:

•Remote service ports: 8787, 9288, 8037
•Local tunnel endpoints: 28787, 29288, 28037
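
The +20000 convention can be computed rather than typed by hand. A small sketch; `tunnel_map` is a hypothetical helper name:

```bash
# Hypothetical helper: build a "local:remote" mapping string for the given
# remote ports, using the +20000 local offset.
tunnel_map() {
  local offset=20000 map="" port
  for port in "$@"; do
    # Append "local:remote", comma-separated after the first entry.
    map="${map:+$map,}$((port + offset)):$port"
  done
  printf '%s\n' "$map"
}

# Usage sketch: --map "$(tunnel_map 8787 9288 8037)"
```

The offset only yields a valid local port for remote ports up to 45535.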

Verify tunnel:

bash

lsof -nP -iTCP:28787 -sTCP:LISTEN
curl -sS http://127.0.0.1:28787/ | head -3

7. Playwright Browser Testing (Optional)

Ask user first:

"Do you want me to test the services using Playwright?"

If yes:

bash

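# Visit each tunneled service (pseudo-commands: use the Playwright browser tool's navigate action, or "npx playwright open <URL>" from the CLI)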
playwright navigate http://localhost:28787  # Claude shim
playwright navigate http://localhost:29288  # Codex shim
playwright navigate http://localhost:28037  # Agent HQ

Check for errors:

•Red error banners
•Error text in UI
•Console errors
•Take screenshot if errors found

Report format:

•✅ "Claude shim accessible, no errors"
•❌ "Codex shim error: [Errno 2] No such file or directory: 'node'"

8. Rollback Strategy

Before deployment, note which previous versions exist on the remote:

bash

ssh <HOST> "ls -ld ~/swe/<component>-old-*"
# /Users/m2/swe/vscode-shims-old-20260126

Emergency rollback:

bash

# 1. Kill current services (match on the service name, which is what appears in the command line)
ssh <HOST> "pkill -f 'python.*<service>.*server.py'"

# 2. Restore old version
ssh <HOST> "cd ~/swe && mv <component> <component>-broken-\$(date +%Y%m%d) && \
  mv <component>-old-YYYYMMDD <component>"

# 3. Restart services
ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
  cd ~/swe/<component> && \
  nohup ~/anaconda3/bin/python src/<service>/server.py > /tmp/<service>.log 2>&1 &"

Keep at least 2 previous versions on remote for safety.
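
To keep the number of old versions bounded without deleting by hand, a helper could list only the directories that fall outside the newest N. A sketch, assuming the `<component>-old-YYYYMMDD` naming shown above; `prune_candidates` is a hypothetical name and it deliberately deletes nothing:

```bash
# Hypothetical helper: list <component>-old-YYYYMMDD directories under <base>
# that fall outside the newest <keep> versions (default 2). It only prints
# candidates; removing them is left to the operator.
prune_candidates() {
  local base="$1" component="$2" keep="${3:-2}"
  # YYYYMMDD suffixes sort chronologically as text; reverse for newest first,
  # then skip the first <keep> entries.
  ls -d "$base/$component"-old-* 2> /dev/null | sort -r | tail -n "+$((keep + 1))"
}

# Usage sketch on the remote: prune_candidates ~/swe <component> 2
```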

9. Common Failure Patterns & Quick Fixes

"Failed to spawn CLI: [Errno 2] No such file or directory: 'node'"

Symptom: Error appears in webview UI, not SSH output
Root Cause: /opt/homebrew/bin not in PATH when Python process starts
Fix: Restart with PATH export (see section 3)
Prevention: Always include export PATH="/opt/homebrew/bin:$PATH"

"Address already in use"

Symptom: Port binding fails during restart
Root Cause: Old process not killed, or another service using port
Fix:

bash

lsof -i :<PORT> | grep LISTEN
kill -9 <PID>

Prevention: Use pkill before starting new service

"Module not found" or "Import error"

Symptom: Python import failures in logs
Root Cause: Using system Python instead of Anaconda
Fix: Restart with ~/anaconda3/bin/python
Prevention: Always use full Anaconda Python path

"Permission denied"

Symptom: Cannot execute files after transfer
Root Cause: File permissions not preserved
Fix:

bash

ssh <HOST> "chmod +x ~/swe/<component>/launchers/*.sh"

Prevention: Use rsync -a to preserve permissions
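
The patterns in this section can be checked mechanically against a service log. The sketch below is an illustration: `diagnose_log` is a hypothetical helper, and it assumes the log contains Python's own exception names (`ModuleNotFoundError`, `ImportError`) and the standard OS error strings.

```bash
# Hypothetical helper: scan a service log for the failure messages listed in
# this section and print the matching quick fix. Returns 1 when nothing matched.
diagnose_log() {
  local log="$1" found=1
  if grep -q "No such file or directory: 'node'" "$log"; then
    echo "node not in PATH: restart with export PATH=\"/opt/homebrew/bin:\$PATH\""
    found=0
  fi
  if grep -qi "address already in use" "$log"; then
    echo "port in use: find the listener with lsof -i :<PORT> and stop it"
    found=0
  fi
  if grep -qE "ModuleNotFoundError|ImportError" "$log"; then
    echo "import failure: restart with ~/anaconda3/bin/python"
    found=0
  fi
  if grep -qi "permission denied" "$log"; then
    echo "permission denied: chmod +x the launcher scripts"
    found=0
  fi
  return "$found"
}

# Usage sketch on the remote: diagnose_log /tmp/claude-shim.log
```

The "node" error surfaces in the webview UI, so an empty result here does not rule it out.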

Component-Specific Configurations

vscode-shims

Critical .env variables:

•SHIM_HOST / CODEX_SHIM_HOST (network binding)
•SHIM_PORT=8787 / CODEX_SHIM_PORT=9288
•SHIM_DEFAULT_CWD (working directory)
•CLAUDE_CODE_CLI / CODEX_CLI (CLI paths)

Restart both services:

bash

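# Kill and start run in separate ssh calls: pkill exits non-zero when nothing
# is running (e.g. after a crash), which would abort an && chain, and pkill -f
# can also match the remote shell that carries a combined command line.

# Claude shim
ssh <HOST> "pkill -f 'python.*claude.*server.py'"; sleep 2
ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
  cd ~/swe/vscode-shims && \
  nohup ~/anaconda3/bin/python src/claude/server.py > /tmp/claude-shim.log 2>&1 &"

# Codex shim
ssh <HOST> "pkill -f 'python.*codex.*server.py'"; sleep 2
ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
  cd ~/swe/vscode-shims && \
  nohup ~/anaconda3/bin/python src/codex/server.py > /tmp/codex-shim.log 2>&1 &"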

Verify:

bash

ssh <HOST> "lsof -i :8787 -i :9288 | grep LISTEN"

Unique requirements:

•Node.js in PATH (for spawning Claude CLI)
•Anaconda Python 3.10+ (for | union syntax in codex/server.py)

Agent HQ (Future)

Critical .env variables:

•AMS_TMUX_PORT or AGENT_MGMT_PORT
•AGENT_HQ_UI_PORT or VITE_PORT
•Network binding settings

Restart pattern:

bash

# Must source .env before starting
ssh <HOST> "export PATH=\"/opt/homebrew/bin:\$PATH\" && \
  cd ~/AgenticProjects/agent-box-v1 && \
  set -a && source .env && set +a && \
  cd apps/agent-hq-ui && \
  npm run dev:web -- --port 8037 --host 127.0.0.1"

Unique requirements:

•npm dependencies installed
•NEVER run Electron remotely (only web version)
•Must source .env before starting vite

Other Components

As ecosystem grows, add component-specific sections here:

•Telemetry projects (SQLite ingestors)
•Custom launchers
•Background services

Quick Reference: Remote Registry

Remote	SHIM_HOST	Access Method	Notes
cm3u	192.168.1.9	Direct LAN	Mac Studio, direct access
cm2	127.0.0.1	SSH tunnel	MacBook Pro, tunnel required

Tunnel port mapping:

•cm2:8787 → localhost:28787
•cm2:9288 → localhost:29288
•cm2:8037 → localhost:28037

Workflow Summary

•Safety check → backup .env, check running services
•Incremental update → fetch, diff, pull (or rsync code only)
•Preserve configs → restore .env from backup
•Safe restart → pkill, export PATH, Anaconda Python, nohup
•Verify → ports listening, processes correct, logs clean, curl test
•Optional: Tunnel → ask user, launch with +20000 offset
•Optional: Test → ask user, Playwright visit mapped ports
•Document → update deployment_logs

Created by: Claude [e8fa7e09-6d5f-40f1-89df-3afe03f29ca1]
Date: 2026-01-26
Pairs with: agentic-ecosystem-remote-deployment