Tailscale SSH Sync Agent
When to Use This Skill
This skill automatically activates when you need to:
✅ Distribute workloads across multiple machines
- •"Run this on my least loaded machine"
- •"Execute this task on the machine with most resources"
- •"Balance work across my Tailscale network"
✅ Share files between Tailscale-connected hosts
- •"Push this directory to all my development machines"
- •"Sync code across my homelab servers"
- •"Deploy configuration to production group"
✅ Execute commands remotely across host groups
- •"Run system updates on all servers"
- •"Check disk space across web-servers group"
- •"Restart services on database hosts"
✅ Monitor machine availability and health
- •"Which machines are online?"
- •"Show status of my Tailscale network"
- •"Check connectivity to remote hosts"
✅ Automate multi-machine workflows
- •"Deploy to staging, test, then production"
- •"Backup files from all machines"
- •"Synchronize development environment across laptops"
How It Works
This agent provides intelligent workload distribution and file sharing management across Tailscale SSH-connected machines using the sshsync CLI tool.
Core Architecture:
- •SSH Sync Wrapper: Python interface to sshsync CLI operations
- •Tailscale Manager: Tailscale-specific connectivity and status management
- •Load Balancer: Intelligent task distribution based on machine resources
- •Workflow Executor: Common multi-machine workflow automation
- •Validators: Parameter, host, and connection validation
- •Helpers: Temporal context, formatting, and utilities
Key Features:
- •Automatic host discovery via Tailscale and SSH config
- •Intelligent load balancing based on CPU, memory, and current load
- •Group-based operations (execute on all web servers, databases, etc.)
- •Dry-run mode for preview before execution
- •Parallel execution across multiple hosts
- •Comprehensive error handling and retry logic
- •Connection validation before operations
- •Progress tracking for long-running operations
Data Sources
sshsync CLI Tool
What is sshsync?
sshsync is a Python CLI tool for managing SSH connections and executing operations across multiple hosts. It provides:
- •Group-based host management
- •Remote command execution with timeouts
- •File push/pull operations (single or recursive)
- •Integration with existing SSH config (~/.ssh/config)
- •Status checking and connectivity validation
Installation:
pip install sshsync
Configuration:
sshsync uses two configuration sources:
- •SSH Config (~/.ssh/config): Host connection details
- •sshsync Config (~/.config/sshsync/config.yaml): Group assignments
Example SSH Config:
Host homelab-1
    HostName 100.64.1.10
    User admin
    IdentityFile ~/.ssh/id_ed25519

Host prod-web-01
    HostName 100.64.1.20
    User deploy
    Port 22
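As a rough illustration of how host details could be read from a file like this, here is a minimal sketch. The function name `parse_ssh_config_text` is hypothetical and is not the agent's `parse_ssh_config()` helper. It assumes simple `Key Value` lines, skips wildcard host patterns, and ignores `Match` blocks and `Include` directives.

```python
def parse_ssh_config_text(text: str) -> dict:
    """Parse 'Host' blocks from SSH config text into {alias: {option: value}}."""
    hosts: dict = {}
    current: list = []  # aliases named by the most recent Host line
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "host":
            # A Host line may list several aliases; wildcard patterns are skipped.
            current = [a for a in value.split() if "*" not in a and "?" not in a]
            for alias in current:
                hosts.setdefault(alias, {})
        elif key == "match":
            current = []  # Match blocks are out of scope for this sketch
        else:
            for alias in current:
                hosts[alias].setdefault(key, value)  # first value wins, as in ssh
    return hosts


SAMPLE = """
Host homelab-1
    HostName 100.64.1.10
    User admin
    IdentityFile ~/.ssh/id_ed25519

Host prod-web-01
    HostName 100.64.1.20
    User deploy
    Port 22
"""

hosts = parse_ssh_config_text(SAMPLE)
print(hosts["prod-web-01"]["hostname"])
```

Option names are lower-cased because SSH config keywords are case-insensitive.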
Example sshsync Config:
groups:
  homelab:
    - homelab-1
    - homelab-2
  production:
    - prod-web-01
    - prod-web-02
    - prod-db-01
  development:
    - dev-laptop
    - dev-desktop
sshsync Commands Used:
| Command | Purpose | Example |
|---|---|---|
| sshsync all | Execute on all hosts | sshsync all "df -h" |
| sshsync group | Execute on group | sshsync group web "systemctl status nginx" |
| sshsync push | Push files to hosts | sshsync push --group prod ./app /var/www/ |
| sshsync pull | Pull files from hosts | sshsync pull --host db /var/log/mysql ./logs/ |
| sshsync ls | List hosts | sshsync ls --with-status |
| sshsync sync | Sync ungrouped hosts | sshsync sync |
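A wrapper around these commands could build its argument lists along the following lines. This is a sketch only: the helper names `build_exec_argv` and `build_push_argv` are hypothetical, and only the command shapes shown in the table above are used.

```python
import shlex
from typing import List, Optional


def build_exec_argv(command: str, group: Optional[str] = None) -> List[str]:
    """argv for `sshsync all CMD` or `sshsync group NAME CMD`."""
    if not command.strip():
        raise ValueError("command must not be empty")
    if group is None:
        return ["sshsync", "all", command]
    return ["sshsync", "group", group, command]


def build_push_argv(local_path: str, remote_path: str, group: str) -> List[str]:
    """argv for `sshsync push --group NAME LOCAL REMOTE`."""
    return ["sshsync", "push", "--group", group, local_path, remote_path]


argv = build_exec_argv("systemctl status nginx", group="web")
preview = shlex.join(argv)  # shell-quoted string, handy for dry-run previews
print(preview)
```

Passing the list to `subprocess.run(argv, capture_output=True, text=True)` without `shell=True` keeps the remote command as a single argument and avoids local shell quoting problems.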
Tailscale Integration
What is Tailscale?
Tailscale is a zero-config VPN that creates a secure network between your devices. It provides:
- •Automatic peer-to-peer connections via WireGuard
- •Magic DNS for easy host addressing (e.g., machine-name.tailnet-name.ts.net)
- •SSH capabilities built-in to Tailscale CLI
- •ACLs for access control
Tailscale SSH:
Tailscale includes SSH functionality that works seamlessly with standard SSH:
# Standard SSH via Tailscale
ssh user@machine-name

# Tailscale-specific SSH command
tailscale ssh machine-name
Integration with sshsync:
Because Tailscale SSH uses the standard SSH protocol, sshsync works with it unchanged. Point the HostName entries in your SSH config at Tailscale hostnames:
Host homelab-1
    HostName homelab-1.tailnet.ts.net
    User admin
Tailscale Commands Used:
| Command | Purpose | Example |
|---|---|---|
| tailscale status | Show network status | Lists all connected machines |
| tailscale ping | Check connectivity | tailscale ping machine-name |
| tailscale ssh | SSH to machine | tailscale ssh user@machine |
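For programmatic use, the status command can also emit JSON (`tailscale status --json`). The sketch below extracts online peers from such output. The field names (`Peer`, `HostName`, `Online`, `TailscaleIPs`) are assumptions about that JSON shape and should be checked against your installed Tailscale version; `list_online_peers` is a hypothetical name, not the agent's `list_online_machines()`.

```python
import json


def list_online_peers(status_json: str) -> list:
    """Return [{'hostname', 'ip'}] for peers reported as online, sorted by name."""
    status = json.loads(status_json)
    online = []
    for peer in (status.get("Peer") or {}).values():
        if peer.get("Online"):
            ips = peer.get("TailscaleIPs") or []
            online.append({
                "hostname": peer.get("HostName", ""),
                "ip": ips[0] if ips else None,
            })
    return sorted(online, key=lambda p: p["hostname"])


# Hand-written sample in the assumed shape, not captured from a real tailnet.
SAMPLE = json.dumps({
    "Peer": {
        "nodekey:aaa": {"HostName": "homelab-1", "Online": True,
                        "TailscaleIPs": ["100.64.1.10"]},
        "nodekey:bbb": {"HostName": "dev-laptop", "Online": False,
                        "TailscaleIPs": ["100.64.1.40"]},
    }
})

peers = list_online_peers(SAMPLE)
print(peers)
```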
Workflows
1. Host Health Monitoring
User Query: "Which of my machines are online?"
Workflow:
- •Load SSH config and sshsync groups
- •Execute sshsync ls --with-status
- •Parse connectivity results
- •Query Tailscale status for additional context
- •Return formatted health report with:
  - •Online/offline status per host
  - •Group memberships
  - •Tailscale connection state
  - •Last seen timestamp
Implementation: scripts/sshsync_wrapper.py → get_host_status()
Output Format:
🟢 homelab-1 (homelab) - Online - Tailscale: Connected
🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
🔴 dev-laptop (development) - Offline - Last seen: 2h ago
🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected

Summary: 3/4 hosts online (75%)
2. Intelligent Load Balancing
User Query: "Run this task on the least loaded machine"
Workflow:
- •Get list of candidate hosts (from group or all)
- •For each online host, check:
  - •CPU load (via uptime or top)
  - •Memory usage (via free or vm_stat)
  - •Disk space (via df)
- •Calculate composite load score
- •Select host with lowest score
- •Execute task on selected host
- •Return result with performance metrics
Implementation: scripts/load_balancer.py → select_optimal_host()
Load Score Calculation:
score = (cpu_pct * 0.4 + mem_pct * 0.3 + disk_pct * 0.3) / 100
Each input is a percentage (0-100), so the score falls between 0 and 1. Lower score = better candidate for task execution.
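The weighting can be sketched as follows. This assumes the inputs are percentages on a 0-100 scale and divides by 100 so that scores land in the 0-1 range used by the examples in this document. The names `load_score` and `pick_least_loaded` are illustrative and are not the functions in scripts/load_balancer.py.

```python
WEIGHTS = {"cpu_pct": 0.4, "mem_pct": 0.3, "disk_pct": 0.3}


def load_score(metrics: dict) -> float:
    """Weighted utilisation in the range 0-1; inputs are percentages (0-100)."""
    return sum(metrics[name] * weight for name, weight in WEIGHTS.items()) / 100


def pick_least_loaded(loads: dict) -> str:
    """Return the host whose composite score is lowest."""
    if not loads:
        raise ValueError("no candidate hosts")
    return min(loads, key=lambda host: load_score(loads[host]))


loads = {
    "web-01": {"cpu_pct": 45, "mem_pct": 60, "disk_pct": 40},
    "web-02": {"cpu_pct": 85, "mem_pct": 70, "disk_pct": 65},
    "web-03": {"cpu_pct": 20, "mem_pct": 35, "disk_pct": 30},
}
best = pick_least_loaded(loads)
print(best, round(load_score(loads[best]), 3))
```

CPU carries the largest weight, so a host with a busy CPU but free disk still ranks behind a mostly idle one.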
Output Format:
✓ Selected host: prod-web-02
Reason: Lowest load score (0.27)
- CPU: 15% (vs avg 45%)
- Memory: 30% (vs avg 60%)
- Disk: 40% (vs avg 55%)

Executing: npm run build
[Task output...]
✓ Completed in 2m 15s
3. File Synchronization Workflows
User Query: "Sync my code to all development machines"
Workflow:
- •Validate source path exists locally
- •Identify target group ("development")
- •Check connectivity to all group members
- •Show dry-run preview (files to be synced, sizes)
- •Execute parallel push to all hosts
- •Validate successful transfer on each host
- •Return summary with per-host status
Implementation: scripts/sshsync_wrapper.py → push_to_hosts()
Supported Operations:
- •Push to all: Sync files to every configured host
- •Push to group: Sync to specific group (dev, prod, etc.)
- •Pull from host: Retrieve files from single host
- •Pull from group: Collect files from multiple hosts
- •Recursive sync: Entire directory trees with --recurse
Output Format:
📤 Syncing: ~/projects/myapp → /var/www/myapp
Group: development (3 hosts)

Preview (dry-run):
- dev-laptop: 145 files, 12.3 MB
- dev-desktop: 145 files, 12.3 MB
- dev-server: 145 files, 12.3 MB

Execute? [Proceeding...]

✓ dev-laptop: Synced 145 files in 8s
✓ dev-desktop: Synced 145 files in 6s
✓ dev-server: Synced 145 files in 10s

Summary: 3/3 successful (435 files, 36.9 MB total)
4. Remote Command Orchestration
User Query: "Check disk space on all web servers"
Workflow:
- •Identify target group ("web-servers")
- •Validate group exists and has members
- •Check connectivity to group members
- •Execute command in parallel across group
- •Collect and parse outputs
- •Format results with per-host breakdown
Implementation: scripts/sshsync_wrapper.py → execute_on_group()
Features:
- •Parallel execution: Commands run simultaneously on all hosts
- •Timeout handling: Configurable per-command timeout (default 10s)
- •Error isolation: Failure on one host doesn't stop others
- •Output aggregation: Collect and correlate all outputs
- •Dry-run mode: Preview what would execute without running
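Parallel execution with error isolation might look like the sketch below. `run_on_hosts` is a hypothetical helper and `run_one` stands in for whatever actually runs the command on a single host (for example a `subprocess.run` call with a timeout). Each host's failure is captured in its own result instead of aborting the batch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_hosts(hosts: List[str], run_one: Callable[[str], str],
                 max_workers: int = 10) -> Dict[str, dict]:
    """Run run_one(host) for every host in parallel; capture failures per host."""
    def guarded(host: str):
        try:
            return host, {"ok": True, "output": run_one(host)}
        except Exception as exc:  # isolate the failure to this host
            return host, {"ok": False, "error": str(exc)}

    if not hosts:
        return {}
    with ThreadPoolExecutor(max_workers=min(max_workers, len(hosts))) as pool:
        return dict(pool.map(guarded, hosts))


def fake_run(host: str) -> str:
    """Stand-in runner: web-02 simulates a timeout."""
    if host == "web-02":
        raise TimeoutError("timed out after 10s")
    return f"{host}: ok"


results = run_on_hosts(["web-01", "web-02", "web-03"], fake_run)
for host, outcome in results.items():
    print(host, outcome)
```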
Output Format:
🔧 Executing on group 'web-servers': df -h /var/www

web-01:
  Filesystem: /dev/sda1
  Size: 100G, Used: 45G, Available: 50G (45% used)
web-02:
  Filesystem: /dev/sda1
  Size: 100G, Used: 67G, Available: 28G (67% used) ⚠️
web-03:
  Filesystem: /dev/sda1
  Size: 100G, Used: 52G, Available: 43G (52% used)

⚠️ Alert: web-02 is above 60% disk usage
5. Multi-Stage Deployment Workflow
User Query: "Deploy to staging, test, then production"
Workflow:
- •Stage 1 - Staging Deploy:
  - •Push code to staging group
  - •Run build process
  - •Execute automated tests
  - •If tests fail: STOP and report error
- •Stage 2 - Validation:
  - •Check staging health endpoints
  - •Validate database migrations
  - •Run smoke tests
- •Stage 3 - Production Deploy:
  - •Push to production group (one at a time for zero-downtime)
  - •Restart services gracefully
  - •Verify each host before proceeding to next
- •Stage 4 - Verification:
  - •Check production health
  - •Monitor for errors
  - •Rollback if issues detected
Implementation: scripts/workflow_executor.py → deploy_workflow()
Output Format:
🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
✓ Pushed code to staging-01
✓ Build completed (2m 15s)
✓ Tests passed (145/145)

Stage 2: Validation
✓ Health check passed
✓ Database migration OK
✓ Smoke tests passed (12/12)

Stage 3: Production Deployment
✓ prod-web-01: Deployed & verified
✓ prod-web-02: Deployed & verified
✓ prod-web-03: Deployed & verified

Stage 4: Verification
✓ All health checks passed
✓ No errors in logs (5min window)

✅ Deployment completed successfully in 12m 45s
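The stop-on-failure and one-host-at-a-time behaviour of this workflow can be sketched as below. Both `run_stages` and `rolling_deploy` are hypothetical helpers, not the contents of scripts/workflow_executor.py, and the stage callables are placeholders for real push, build, and test steps.

```python
from typing import Callable, Iterable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: Iterable[Stage]) -> dict:
    """Run stages in order and stop at the first one that reports failure."""
    completed: List[str] = []
    for name, step in stages:
        if not step():
            return {"success": False, "failed_stage": name, "completed": completed}
        completed.append(name)
    return {"success": True, "failed_stage": None, "completed": completed}


def rolling_deploy(hosts: List[str], deploy_one: Callable[[str], object],
                   verify_one: Callable[[str], bool]) -> List[str]:
    """Deploy one host at a time; halt as soon as a host fails verification."""
    done: List[str] = []
    for host in hosts:
        deploy_one(host)
        if not verify_one(host):
            raise RuntimeError(f"{host} failed verification; rollout halted after {done}")
        done.append(host)
    return done


result = run_stages([
    ("staging", lambda: True),
    ("validation", lambda: False),  # simulated smoke-test failure
    ("production", lambda: True),   # never reached
])

deployed: List[str] = []
rolled = rolling_deploy(["prod-web-01", "prod-web-02"], deployed.append, lambda host: True)
print(result, rolled)
```

Because production is only another stage in the list, a staging or validation failure prevents it from ever running.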
Available Scripts
scripts/sshsync_wrapper.py
Purpose: Python wrapper around sshsync CLI for programmatic access
Functions:
- •get_host_status(group=None): Get online/offline status of hosts
- •execute_on_all(command, timeout=10, dry_run=False): Run command on all hosts
- •execute_on_group(group, command, timeout=10, dry_run=False): Run on specific group
- •execute_on_host(host, command, timeout=10): Run on single host
- •push_to_hosts(local_path, remote_path, hosts=None, group=None, recurse=False, dry_run=False): Push files
- •pull_from_host(host, remote_path, local_path, recurse=False, dry_run=False): Pull files
- •list_hosts(with_status=True): List all configured hosts
- •get_groups(): Get all defined groups and their members
- •add_hosts_to_group(group, hosts): Add hosts to a group
Usage Example:
from sshsync_wrapper import execute_on_group, push_to_hosts

# Execute command
result = execute_on_group(
    group="web-servers",
    command="systemctl status nginx",
    timeout=15
)

# Push files
push_to_hosts(
    local_path="./dist",
    remote_path="/var/www/app",
    group="production",
    recurse=True
)
scripts/tailscale_manager.py
Purpose: Tailscale-specific operations and status management
Functions:
- •get_tailscale_status(): Get Tailscale network status (all peers)
- •check_connectivity(host): Ping host via Tailscale
- •get_peer_info(hostname): Get detailed info about peer
- •list_online_machines(): List all online Tailscale machines
- •get_machine_ip(hostname): Get Tailscale IP for machine
- •validate_tailscale_ssh(host): Check if Tailscale SSH is working
Usage Example:
from tailscale_manager import get_tailscale_status, check_connectivity
# Get network status
status = get_tailscale_status()
print(f"Online machines: {status['online_count']}")
# Check specific host
is_online = check_connectivity("homelab-1")
scripts/load_balancer.py
Purpose: Intelligent task distribution based on machine resources
Functions:
- •get_machine_load(host): Get CPU, memory, disk metrics
- •calculate_load_score(metrics): Calculate composite load score
- •select_optimal_host(candidates, prefer_group=None): Pick best host
- •get_group_capacity(): Get aggregate capacity of group
- •distribute_tasks(tasks, hosts): Distribute multiple tasks optimally
Usage Example:
from load_balancer import select_optimal_host
from sshsync_wrapper import execute_on_host

# Find best machine for task
best_host = select_optimal_host(
    candidates=["web-01", "web-02", "web-03"],
    prefer_group="production"
)

# Execute on selected host
execute_on_host(best_host, "npm run build")
scripts/workflow_executor.py
Purpose: Common multi-machine workflow automation
Functions:
- •deploy_workflow(code_path, staging_group, prod_group): Full deployment pipeline
- •backup_workflow(hosts, backup_paths, destination): Backup from multiple hosts
- •sync_workflow(source_host, target_group, paths): Sync from one to many
- •rolling_restart(group, service_name): Zero-downtime service restart
- •health_check_workflow(group, endpoint): Check health across group
Usage Example:
from workflow_executor import deploy_workflow, backup_workflow

# Deploy with testing
deploy_workflow(
    code_path="./dist",
    staging_group="staging",
    prod_group="production"
)

# Backup from all databases
backup_workflow(
    hosts=["db-01", "db-02"],
    backup_paths=["/var/lib/mysql"],
    destination="./backups"
)
scripts/utils/helpers.py
Purpose: Common utilities and formatting functions
Functions:
- •format_bytes(bytes): Human-readable byte formatting (1.2 GB)
- •format_duration(seconds): Human-readable duration (2m 15s)
- •parse_ssh_config(): Parse ~/.ssh/config for host details
- •parse_sshsync_config(): Parse sshsync group configuration
- •get_timestamp(): Get ISO timestamp for logging
- •safe_execute(func, *args, **kwargs): Execute with error handling
- •validate_path(path): Check if path exists and is accessible
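The two formatting helpers could be implemented roughly as follows. This is a sketch that reproduces the example outputs quoted above ("1.2 GB", "2m 15s"). It assumes decimal units (1 KB = 1000 bytes), which the source does not specify.

```python
def format_bytes(num_bytes: float) -> str:
    """Human-readable size using decimal units (assumption: 1 KB = 1000 B)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.0f} {unit}" if unit == "B" else f"{value:.1f} {unit}"
        value /= 1000


def format_duration(seconds: float) -> str:
    """Human-readable duration such as '2m 15s' or '1h 0m 5s'."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m {secs}s"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


print(format_bytes(1_200_000_000), format_duration(135))
```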
scripts/utils/validators/parameter_validator.py
Purpose: Validate user inputs and parameters
Functions:
- •validate_host(host, valid_hosts=None): Validate host exists
- •validate_group(group, valid_groups=None): Validate group exists
- •validate_path_exists(path): Check local path exists
- •validate_timeout(timeout): Ensure timeout is reasonable
- •validate_command(command): Basic command safety validation
scripts/utils/validators/host_validator.py
Purpose: Validate host configuration and availability
Functions:
- •validate_ssh_config(host): Check host has SSH config entry
- •validate_host_reachable(host, timeout=5): Check host is reachable
- •validate_group_members(group): Ensure group has valid members
- •get_invalid_hosts(hosts): Find hosts without valid config
scripts/utils/validators/connection_validator.py
Purpose: Validate SSH and Tailscale connections
Functions:
- •validate_ssh_connection(host): Test SSH connection works
- •validate_tailscale_connection(host): Test Tailscale connectivity
- •validate_ssh_key(host): Check SSH key authentication
- •get_connection_diagnostics(host): Comprehensive connection testing
Available Analyses
1. Host Availability Analysis
Function: analyze_host_availability(group=None)
Objective: Determine which machines are online and accessible
Inputs:
- •group (optional): Specific group to check (None = all hosts)
Outputs:
{
    'total_hosts': 10,
    'online_hosts': 8,
    'offline_hosts': 2,
    'availability_pct': 80.0,
    'by_group': {
        'production': {'online': 3, 'total': 3, 'pct': 100.0},
        'development': {'online': 2, 'total': 3, 'pct': 66.7},
        'homelab': {'online': 3, 'total': 4, 'pct': 75.0}
    },
    'offline_hosts_details': [
        {'host': 'dev-laptop', 'last_seen': '2h ago', 'groups': ['development']},
        {'host': 'homelab-4', 'last_seen': '1d ago', 'groups': ['homelab']}
    ]
}
Interpretation:
- •> 90%: Excellent availability
- •70-90%: Good availability, monitor offline hosts
- •< 70%: Poor availability, investigate issues
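A minimal sketch of how the availability figures and their interpretation could be derived from per-host status is shown below. The function names and the input shapes (a host-to-bool map and a group-to-members map) are assumptions for illustration, not the agent's actual `analyze_host_availability()`.

```python
def summarize_availability(status_by_host: dict, groups: dict) -> dict:
    """Aggregate online/offline booleans into overall and per-group percentages."""
    def pct(part: int, whole: int) -> float:
        return round(100.0 * part / whole, 1) if whole else 0.0

    online = sum(1 for up in status_by_host.values() if up)
    by_group = {}
    for name, members in groups.items():
        up = sum(1 for host in members if status_by_host.get(host, False))
        by_group[name] = {"online": up, "total": len(members), "pct": pct(up, len(members))}
    return {
        "total_hosts": len(status_by_host),
        "online_hosts": online,
        "offline_hosts": len(status_by_host) - online,
        "availability_pct": pct(online, len(status_by_host)),
        "by_group": by_group,
    }


def classify_availability(availability_pct: float) -> str:
    """Map a percentage onto the bands listed above."""
    if availability_pct > 90:
        return "excellent"
    if availability_pct >= 70:
        return "good"
    return "poor"


status = {"prod-web-01": True, "prod-db-01": True,
          "dev-laptop": False, "dev-desktop": True, "dev-server": True}
groups = {"production": ["prod-web-01", "prod-db-01"],
          "development": ["dev-laptop", "dev-desktop", "dev-server"]}
summary = summarize_availability(status, groups)
print(summary["availability_pct"], classify_availability(summary["availability_pct"]))
```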
2. Load Distribution Analysis
Function: analyze_load_distribution(group=None)
Objective: Understand resource usage across machines
Inputs:
- •group (optional): Specific group to analyze
Outputs:
{
    'hosts': [
        {
            'host': 'web-01',
            'cpu_pct': 45,
            'mem_pct': 60,
            'disk_pct': 40,
            'load_score': 0.48,
            'status': 'moderate'
        },
        # ... more hosts
    ],
    'aggregate': {
        'avg_cpu': 35,
        'avg_mem': 55,
        'avg_disk': 45,
        'total_capacity': 1200  # GB
    },
    'recommendations': [
        {
            'host': 'web-02',
            'issue': 'High CPU usage (85%)',
            'action': 'Consider migrating workloads'
        }
    ]
}
Load Status:
- •Low (score < 0.4): Good capacity for more work
- •Moderate (0.4-0.7): Normal operation
- •High (> 0.7): May need to offload work
3. File Sync Status Analysis
Function: analyze_sync_status(local_path, remote_path, group)
Objective: Compare local files with remote versions
Inputs:
- •local_path: Local directory to compare
- •remote_path: Remote directory path
- •group: Group to check
Outputs:
{
    'local_files': 145,
    'local_size': 12582912,  # bytes
    'hosts': [
        {
            'host': 'web-01',
            'status': 'in_sync',
            'files_match': 145,
            'files_different': 0,
            'missing_files': 0
        },
        {
            'host': 'web-02',
            'status': 'out_of_sync',
            'files_match': 140,
            'files_different': 3,
            'missing_files': 2,
            'details': ['config.json modified', 'index.html modified', ...]
        }
    ],
    'sync_percentage': 96.7,
    'recommended_action': 'Push to web-02'
}
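One way to produce a per-host comparison like this is to diff content-hash manifests. The sketch below is illustrative only: the source does not say how the agent compares files, and `build_manifest` and `compare_manifests` are hypothetical names. How the remote manifest is gathered (for example by hashing files over SSH) is left out.

```python
import hashlib
from pathlib import Path


def build_manifest(root: str) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    base = Path(root)
    return {
        path.relative_to(base).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(base.rglob("*")) if path.is_file()
    }


def compare_manifests(local: dict, remote: dict) -> dict:
    """Compare a local manifest against one host's manifest."""
    different = sorted(p for p in local if p in remote and remote[p] != local[p])
    missing = sorted(p for p in local if p not in remote)
    return {
        "status": "in_sync" if not (different or missing) else "out_of_sync",
        "files_match": len(local) - len(different) - len(missing),
        "files_different": len(different),
        "missing_files": len(missing),
        "details": [f"{p} modified" for p in different] + [f"{p} missing" for p in missing],
    }


# Toy manifests with short stand-in hashes.
local = {"index.html": "a1", "config.json": "b2", "app.js": "c3"}
remote = {"index.html": "a1", "config.json": "zz"}
report = compare_manifests(local, remote)
print(report)
```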
4. Network Latency Analysis
Function: analyze_network_latency(hosts=None)
Objective: Measure connection latency to hosts
Inputs:
- •hosts (optional): Specific hosts to test (None = all)
Outputs:
{
    'hosts': [
        {'host': 'web-01', 'latency_ms': 15, 'status': 'excellent'},
        {'host': 'web-02', 'latency_ms': 45, 'status': 'excellent'},
        {'host': 'db-01', 'latency_ms': 150, 'status': 'fair'}
    ],
    'avg_latency': 70,
    'min_latency': 15,
    'max_latency': 150,
    'recommendations': [
        {'host': 'db-01', 'issue': 'High latency', 'action': 'Check network path'}
    ]
}
Latency Classification:
- •Excellent (< 50ms): Ideal for interactive tasks
- •Good (50-100ms): Suitable for most operations
- •Fair (100-200ms): May impact interactive workflows
- •Poor (> 200ms): Investigate network issues
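The classification bands translate directly into code. The sketch below uses hypothetical function names and takes measured latencies as input rather than measuring them; boundary values (exactly 100 ms or 200 ms) are assigned to the lower band, which is an assumption since the listed ranges overlap at their edges.

```python
def classify_latency(latency_ms: float) -> str:
    """Map a latency in milliseconds onto the bands listed above."""
    if latency_ms < 50:
        return "excellent"
    if latency_ms <= 100:
        return "good"
    if latency_ms <= 200:
        return "fair"
    return "poor"


def summarize_latency(latencies: dict) -> dict:
    """Build per-host statuses plus average, minimum and maximum latency."""
    if not latencies:
        raise ValueError("no latency measurements")
    values = list(latencies.values())
    return {
        "hosts": [
            {"host": host, "latency_ms": ms, "status": classify_latency(ms)}
            for host, ms in latencies.items()
        ],
        "avg_latency": sum(values) / len(values),
        "min_latency": min(values),
        "max_latency": max(values),
    }


summary = summarize_latency({"web-01": 15, "web-02": 45, "db-01": 150})
print(summary["avg_latency"], [h["status"] for h in summary["hosts"]])
```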
5. Comprehensive Infrastructure Report
Function: comprehensive_infrastructure_report(group=None)
Objective: One-stop function for complete infrastructure overview
Inputs:
- •group (optional): Limit to specific group (None = all)
Outputs:
{
    'report_timestamp': '2025-10-19T19:43:41Z',
    'group': 'production',  # or 'all'
    'metrics': {
        'availability': {...},  # from analyze_host_availability
        'load_distribution': {...},  # from analyze_load_distribution
        'network_latency': {...},  # from analyze_network_latency
        'tailscale_status': {...}  # from Tailscale integration
    },
    'summary': "Production infrastructure: 3/3 hosts online, avg load 45%, network latency 35ms",
    'alerts': [
        "⚠ web-02: High CPU usage (85%)",
        "⚠ db-01: Elevated latency (150ms)"
    ],
    'recommendations': [
        "Consider rebalancing workload from web-02",
        "Investigate network path to db-01"
    ],
    'overall_health': 'good'  # excellent | good | fair | poor
}
Overall Health Classification:
- •Excellent: All metrics green, no alerts
- •Good: Most metrics healthy, minor alerts
- •Fair: Some concerning metrics, action recommended
- •Poor: Critical issues, immediate action required
Error Handling
Connection Errors
Error: Cannot connect to host
Causes:
- •Host is offline
- •Tailscale not connected
- •SSH key missing/invalid
- •Firewall blocking connection
Handling:
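# Import paths assume the scripts/ directory is on sys.path
from sshsync_wrapper import execute_on_host
from tailscale_manager import check_connectivity
from utils.validators.connection_validator import (
    get_connection_diagnostics,
    validate_ssh_connection,
)

def run_with_diagnostics(host, command):
    """Run a command; on a connection failure, report the likely cause."""
    try:
        return execute_on_host(host, command)
    except ConnectionError:
        # Try Tailscale ping first
        if not check_connectivity(host):
            return {
                'error': 'Host unreachable',
                'suggestion': 'Check Tailscale connection',
                'diagnostics': get_connection_diagnostics(host)
            }
        # Then check SSH
        if not validate_ssh_connection(host):
            return {
                'error': 'SSH authentication failed',
                'suggestion': 'Check SSH keys: ssh-add -l'
            }
        raise  # Both checks passed, so surface the original error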
Timeout Errors
Error: Operation timed out
Causes:
- •Command taking too long
- •Network latency
- •Host overloaded
Handling:
- •Automatic retry with exponential backoff (3 attempts)
- •Increase timeout for known slow operations
- •Fall back to alternative host if available
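Retry with exponential backoff can be sketched as follows. The three-attempt default matches the text above; the one-second base delay, the helper name `retry_with_backoff`, and the injectable `sleep` parameter are assumptions made for the example.

```python
import time


def retry_with_backoff(operation, attempts=3, base_delay=1.0,
                       retry_on=(TimeoutError,), sleep=time.sleep):
    """Call operation(); on a retryable error wait base_delay * 2**n, then retry."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo: fail twice, then succeed; delays are recorded instead of slept.
calls = {"count": 0}
delays = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "done"


outcome = retry_with_backoff(flaky, sleep=delays.append)
print(outcome, delays)
```

If the underlying call is `subprocess.run(..., timeout=...)`, pass `retry_on=(subprocess.TimeoutExpired,)`, since that exception is not a subclass of `TimeoutError`.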
File Transfer Errors
Error: File sync failed
Causes:
- •Insufficient disk space
- •Permission denied
- •Path doesn't exist
Handling:
- •Pre-check disk space on target
- •Validate permissions before transfer
- •Create directories if needed
- •Partial transfer recovery
Validation Errors
Error: Invalid parameter
Examples:
- •Unknown host
- •Non-existent group
- •Invalid path
Handling:
- •Validate all inputs before execution
- •Provide suggestions for similar valid options
- •Clear error messages with corrective actions
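Suggesting similar valid options is straightforward with the standard library's `difflib`. The sketch below uses a hypothetical `validate_name` helper rather than the agent's own validators, and the 0.6 similarity cutoff is simply `difflib`'s default.

```python
import difflib


def validate_name(name: str, valid: list, kind: str = "host") -> str:
    """Return name if it is known; otherwise raise with close-match suggestions."""
    if name in valid:
        return name
    suggestions = difflib.get_close_matches(name, valid, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Unknown {kind} '{name}'.{hint}")


try:
    validate_name("web-servrs", ["web-servers", "databases"], kind="group")
except ValueError as exc:
    message = str(exc)

print(message)
```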
Mandatory Validations
Before Any Operation
- •Parameter Validation:
  host = validate_host(host, valid_hosts=get_all_hosts())
  group = validate_group(group, valid_groups=get_groups())
  timeout = validate_timeout(timeout)
- •Connection Validation:
  if not validate_host_reachable(host, timeout=5):
      raise ConnectionError(f"Host {host} is not reachable")
- •Path Validation (for file operations):
  if not validate_path_exists(local_path):
      raise ValueError(f"Path does not exist: {local_path}")
During Operation
- •Timeout Monitoring: Every operation has configurable timeout
- •Progress Tracking: Long operations show progress
- •Error Isolation: Failure on one host doesn't stop others
After Operation
- •Result Validation:
  report = validate_operation_result(result)
  if report.has_critical_issues():
      raise OperationError(report.get_summary())
- •State Verification: Confirm operation succeeded
- •Logging: Record all operations for audit trail
Performance and Caching
Caching Strategy
Host Status Cache:
- •TTL: 60 seconds
- •Why: Host status doesn't change rapidly
- •Invalidation: Manual invalidation when connectivity changes
Load Metrics Cache:
- •TTL: 30 seconds
- •Why: Load changes frequently
- •Invalidation: Automatic on timeout
Group Configuration Cache:
- •TTL: 5 minutes
- •Why: Group membership rarely changes
- •Invalidation: Manual when groups modified
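A time-to-live cache covering these three cases could look like the sketch below. The `TTLCache` class is hypothetical, and the injectable clock exists only so that expiry can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Cache values per key and reload them once they are older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader() on a miss or after expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = loader()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key=None):
        """Drop one key, or everything when key is None."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)


# Demo with a fake clock: 60-second TTL, as for the host status cache.
now = [0.0]
load_times = []
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])


def fetch_status():
    load_times.append(now[0])
    return {"homelab-1": "online"}


cache.get_or_load("status", fetch_status)  # miss: loads
cache.get_or_load("status", fetch_status)  # hit: served from cache
now[0] = 61.0
cache.get_or_load("status", fetch_status)  # expired: loads again
print(load_times)
```

Separate instances with TTLs of 60, 30, and 300 seconds would cover host status, load metrics, and group configuration respectively.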
Performance Optimizations
- •Parallel Execution:
  - •Commands execute concurrently across hosts
  - •ThreadPoolExecutor with max 10 workers
  - •Prevents sequential bottleneck
- •Connection Pooling:
  - •Reuse SSH connections when possible
  - •ControlMaster in SSH config
- •Lazy Loading:
  - •Only fetch data when needed
  - •Don't load all host status unless required
- •Progressive Results:
  - •Stream results as they complete
  - •Don't wait for slowest host
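Progressive results can be sketched with `concurrent.futures.as_completed`, which hands back each future as soon as it finishes. The generator name `stream_results` and the `fake_run` stand-in are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def stream_results(hosts, run_one, max_workers=10):
    """Yield (host, result, error) tuples in completion order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_one, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                yield host, future.result(), None
            except Exception as exc:
                yield host, None, exc


def fake_run(host):
    """Stand-in runner: one host is deliberately slow."""
    time.sleep(0.3 if host == "slow-host" else 0.0)
    return f"{host}: done"


order = [host for host, _, _ in stream_results(["slow-host", "fast-host"], fake_run)]
print(order)
```

The fast host is reported first even though it was submitted second, so callers can display output without waiting on the slowest machine.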
Automatic Detection Keywords
This skill automatically activates when you mention:
Hosts & Machines:
- •remote machine, remote host, remote server
- •Tailscale machine, Tailscale host
- •SSH host, SSH server
- •homelab, servers, cluster
Tools:
- •sshsync, ssh sync
- •Tailscale, tailnet
- •SSH, secure shell
Operations:
- •distribute workload, load balance
- •file sharing, file sync, file transfer
- •push files, pull files, sync files
- •remote execution, execute remotely
- •deploy to, deploy across
- •backup from, sync to
Groups:
- •host group, machine group, server group
- •production servers, development machines
- •web servers, database servers
- •staging environment, production environment
Actions:
- •check status, check connectivity
- •run on all, execute on group
- •sync across, distribute to
- •deploy to, push to, pull from
Monitoring:
- •which machines are online
- •host availability, server status
- •machine health, connectivity check
- •network status, latency check
Workflows:
- •multi-stage deployment
- •rolling restart, zero-downtime
- •backup workflow, sync workflow
- •distributed task, parallel execution
Usage Examples
Example 1: Check Host Status
User: "Which of my Tailscale machines are online?"
Agent Response:
Checking Tailscale network status...

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
   IP: 100.64.1.10 | Latency: 15ms
🟢 homelab-2 (homelab) - Online - Tailscale: Connected
   IP: 100.64.1.11 | Latency: 18ms
🔴 dev-laptop (development) - Offline
   Last seen: 2 hours ago | Tailscale: Not connected
🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
   IP: 100.64.1.20 | Latency: 25ms
🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected
   IP: 100.64.1.30 | Latency: 30ms

Summary: 4/5 hosts online (80%)
Average latency: 22ms
Example 2: Execute on Group
User: "Check disk space on all web servers"
Agent Response:
Executing on group 'web-servers': df -h /var/www

🔧 web-01:
   Filesystem: /dev/sda1
   Size: 100G | Used: 45G | Available: 50G (45% used)
   Status: ✓ Healthy
🔧 web-02:
   Filesystem: /dev/sda1
   Size: 100G | Used: 67G | Available: 28G (67% used)
   Status: ⚠ Warning - Above 60%
🔧 web-03:
   Filesystem: /dev/sda1
   Size: 100G | Used: 52G | Available: 43G (52% used)
   Status: ✓ Healthy

⚠️ Alert: web-02 is above 60% disk usage
Recommendation: Clear old logs or increase disk space
Example 3: Load-Balanced Execution
User: "Run this npm build on the machine with most available resources"
Agent Response:
Analyzing resource availability across 5 machines...

Load Analysis:
  web-01: CPU 45%, Mem 60%, Disk 40% → Score: 0.48 (moderate)
  web-02: CPU 85%, Mem 70%, Disk 65% → Score: 0.75 (high load)
  web-03: CPU 20%, Mem 35%, Disk 30% → Score: 0.28 (low load) ✓

✓ Selected: web-03
  Reason: Lowest load score (0.28)
  Available: CPU 80%, Memory 65%, Disk 70%

Executing on web-03: npm run build
[Build output...]

✓ Build completed successfully
  Duration: 2m 15s
  Output: ./dist (145 files, 12.3 MB)

Final load: CPU 65%, Mem 50% (still below average)
Example 4: File Sync
User: "Push my app directory to all production servers"
Agent Response:
📤 Syncing: ~/projects/myapp → /var/www/myapp
Target group: production (3 hosts)

Analyzing source...
  145 files, 12.3 MB total

Preview (dry-run):
  prod-web-01: 145 files → /var/www/myapp
  prod-web-02: 145 files → /var/www/myapp
  prod-db-01: Skipped (not in web-servers group)

Checking target capacity...
  ✓ prod-web-01: 50 GB available
  ✓ prod-web-02: 28 GB available

Executing push...
✓ prod-web-01: Synced 145 files in 8.2s
  Modified: 12 files
  Added: 3 files
✓ prod-web-02: Synced 145 files in 6.5s
  Modified: 12 files
  Added: 3 files

Summary: 2/2 successful (290 files, 24.6 MB total)
Example 5: Multi-Stage Deployment
User: "Deploy this to staging, run tests, then push to production"
Agent Response:
🚀 Multi-Stage Deployment Workflow
Stage 1: Staging Deployment
📤 Pushing to staging-01...
✓ Code synced (145 files, 12.3 MB) - 8s
🔨 Running build...
$ npm run build
✓ Build completed - 2m 15s
🧪 Running tests...
$ npm test
✓ All tests passed (145/145) - 45s
Stage 2: Staging Validation
✓ Health endpoint: 200 OK
✓ Database migrations: Up to date
✓ Smoke tests: 12/12 passed
Stage 3: Production Deployment
Rolling deployment to 3 hosts...
⏳ prod-web-01:
✓ Code synced
✓ Build completed
✓ Service restarted gracefully
✓ Health check passed
Duration: 3m 10s
⏳ prod-web-02:
✓ Code synced
✓ Build completed
✓ Service restarted gracefully
✓ Health check passed
Duration: 3m 5s
⏳ prod-web-03:
✓ Code synced
✓ Build completed
✓ Service restarted gracefully
✓ Health check passed
Duration: 3m 12s
Stage 4: Production Verification
✓ All health checks passed (3/3)
✓ No errors in logs (5min window)
✓ Response times normal (avg 45ms)
✅ Deployment completed successfully!
Total duration: 12m 45s
Hosts updated: 3
Zero downtime: ✓
Example 6: Comprehensive Infrastructure Report
User: "Give me a complete status report of my infrastructure"
Agent Response:
📊 Infrastructure Report
Generated: 2025-10-19 19:43:41

=== AVAILABILITY ===
Total Hosts: 10
Online: 8 (80%)
Offline: 2

By Group:
  production: 3/3 (100%) ✓
  development: 2/3 (66.7%) ⚠
  homelab: 3/4 (75%)

Offline Hosts:
  • dev-laptop (development) - Last seen: 2h ago
  • homelab-4 (homelab) - Last seen: 1d ago

=== LOAD DISTRIBUTION ===
Average Load: 45% CPU, 55% Memory, 45% Disk

Top 3 Loaded Hosts:
  1. web-02: 85% CPU, 70% Mem, 65% Disk (Score: 0.75) ⚠
  2. db-01: 60% CPU, 75% Mem, 55% Disk (Score: 0.63)
  3. web-01: 45% CPU, 60% Mem, 40% Disk (Score: 0.48)

Top 3 Available Hosts:
  1. web-03: 20% CPU, 35% Mem, 30% Disk (Score: 0.28) ✓
  2. homelab-1: 25% CPU, 40% Mem, 35% Disk (Score: 0.33)
  3. homelab-2: 30% CPU, 45% Mem, 40% Disk (Score: 0.38)

=== NETWORK LATENCY ===
Average: 35ms
Range: 15ms - 150ms
Excellent (< 50ms): 6 hosts
Good (50-100ms): 1 host
Fair (100-200ms): 1 host (db-01: 150ms) ⚠

=== TAILSCALE STATUS ===
Network: Connected
Peers Online: 8/10
Exit Node: None
MagicDNS: Enabled

=== ALERTS ===
⚠ web-02: High CPU usage (85%) - Consider load balancing
⚠ db-01: Elevated latency (150ms) - Check network path
⚠ dev-laptop: Offline for 2 hours - May need attention

=== RECOMMENDATIONS ===
1. Rebalance workload from web-02 to web-03
2. Investigate network latency to db-01
3. Check status of dev-laptop and homelab-4
4. Consider scheduling maintenance for web-02

Overall Health: GOOD ✓
Installation
See INSTALLATION.md for detailed setup instructions.
Quick start:
# 1. Install sshsync
pip install sshsync

# 2. Configure SSH hosts
vim ~/.ssh/config

# 3. Sync host groups
sshsync sync

# 4. Install agent
/plugin marketplace add ./tailscale-sshsync-agent

# 5. Test
"Which of my machines are online?"
Version
Current version: 1.0.0
See CHANGELOG.md for release history.
Architecture Decisions
See DECISIONS.md for detailed rationale behind tool selection, architecture choices, and trade-offs considered.