Tailscale SSH Sync Agent

When to Use This Skill

This skill automatically activates when you need to:

✅ Distribute workloads across multiple machines

•"Run this on my least loaded machine"
•"Execute this task on the machine with most resources"
•"Balance work across my Tailscale network"

✅ Share files between Tailscale-connected hosts

•"Push this directory to all my development machines"
•"Sync code across my homelab servers"
•"Deploy configuration to production group"

✅ Execute commands remotely across host groups

•"Run system updates on all servers"
•"Check disk space across web-servers group"
•"Restart services on database hosts"

✅ Monitor machine availability and health

•"Which machines are online?"
•"Show status of my Tailscale network"
•"Check connectivity to remote hosts"

✅ Automate multi-machine workflows

•"Deploy to staging, test, then production"
•"Backup files from all machines"
•"Synchronize development environment across laptops"

How It Works

This agent distributes workloads and manages file sharing across machines connected over Tailscale SSH, using the sshsync CLI tool.

Core Architecture:

•SSH Sync Wrapper: Python interface to sshsync CLI operations
•Tailscale Manager: Tailscale-specific connectivity and status management
•Load Balancer: Intelligent task distribution based on machine resources
•Workflow Executor: Common multi-machine workflow automation
•Validators: Parameter, host, and connection validation
•Helpers: Temporal context, formatting, and utilities

Key Features:

•Automatic host discovery via Tailscale and SSH config
•Intelligent load balancing based on CPU, memory, and current load
•Group-based operations (execute on all web servers, databases, etc.)
•Dry-run mode for preview before execution
•Parallel execution across multiple hosts
•Comprehensive error handling and retry logic
•Connection validation before operations
•Progress tracking for long-running operations

Data Sources

sshsync CLI Tool

What is sshsync?

sshsync is a Python CLI tool for managing SSH connections and executing operations across multiple hosts. It provides:

•Group-based host management
•Remote command execution with timeouts
•File push/pull operations (single or recursive)
•Integration with existing SSH config (~/.ssh/config)
•Status checking and connectivity validation

Installation:

bash

pip install sshsync

Configuration:

sshsync uses two configuration sources:

•SSH Config (~/.ssh/config): Host connection details
•sshsync Config (~/.config/sshsync/config.yaml): Group assignments

Example SSH Config:

code

Host homelab-1
  HostName 100.64.1.10
  User admin
  IdentityFile ~/.ssh/id_ed25519

Host prod-web-01
  HostName 100.64.1.20
  User deploy
  Port 22
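
To show how host details can be read from a file like the one above, here is a minimal sketch of an SSH config reader. It is an illustration only: the function name `parse_ssh_hosts` is hypothetical (it is not the `parse_ssh_config()` helper listed later), and it handles only simple whitespace-separated `Host` blocks, not `Include`, `Match`, or `Key=Value` syntax.

```python
def parse_ssh_hosts(config_text):
    """Return {alias: {option: value}} for each Host block in SSH config text."""
    hosts = {}
    current = []  # aliases named by the most recent Host line
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore keywords without a value
        key, value = parts[0].lower(), parts[1].strip()
        if key == "match":
            current = []  # Match blocks are out of scope for this sketch
        elif key == "host":
            # One Host line may name several aliases; wildcard patterns are skipped.
            current = [a for a in value.split() if "*" not in a and "?" not in a]
            for alias in current:
                hosts.setdefault(alias, {})
        else:
            for alias in current:
                # Keep the first value seen, as OpenSSH does for most options.
                hosts[alias].setdefault(key, value)
    return hosts


sample = """Host homelab-1
  HostName 100.64.1.10
  User admin
  IdentityFile ~/.ssh/id_ed25519

Host prod-web-01
  HostName 100.64.1.20
  User deploy
  Port 22
"""
hosts = parse_ssh_hosts(sample)
print(sorted(hosts))
```

Option names are lower-cased so lookups such as `hosts["homelab-1"]["hostname"]` do not depend on how the file was capitalised.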

Example sshsync Config:

yaml

groups:
  homelab:
    - homelab-1
    - homelab-2
  production:
    - prod-web-01
    - prod-web-02
    - prod-db-01
  development:
    - dev-laptop
    - dev-desktop

sshsync Commands Used:

| Command | Purpose | Example |
|---|---|---|
| `sshsync all` | Execute on all hosts | `sshsync all "df -h"` |
| `sshsync group` | Execute on group | `sshsync group web "systemctl status nginx"` |
| `sshsync push` | Push files to hosts | `sshsync push --group prod ./app /var/www/` |
| `sshsync pull` | Pull files from hosts | `sshsync pull --host db /var/log/mysql ./logs/` |
| `sshsync ls` | List hosts | `sshsync ls --with-status` |
| `sshsync sync` | Sync ungrouped hosts | `sshsync sync` |
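
The two execution commands in the table map naturally onto argument lists for Python's `subprocess` module. The sketch below builds them using only the forms shown above (`sshsync all CMD` and `sshsync group NAME CMD`). The function name `sshsync_argv` is hypothetical and is not part of the wrapper script.

```python
import shlex


def sshsync_argv(target, command, group=None):
    """Build the argument list for `sshsync all` or `sshsync group`."""
    if target == "all":
        return ["sshsync", "all", command]
    if target == "group":
        if not group:
            raise ValueError("group name is required when target is 'group'")
        return ["sshsync", "group", group, command]
    raise ValueError(f"unsupported target: {target!r}")


argv = sshsync_argv("group", "systemctl status nginx", group="web")
# shlex.join renders the list the way it would be typed in a shell.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, capture_output=True, text=True)` instead of a shell string avoids quoting problems when the remote command contains spaces.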

Tailscale Integration

What is Tailscale?

Tailscale is a zero-config VPN that creates a secure network between your devices. It provides:

•Automatic peer-to-peer connections via WireGuard
•MagicDNS for easy host addressing (e.g., machine-name.tailnet-name.ts.net)
•SSH capabilities built into the Tailscale CLI
•ACLs for access control

Tailscale SSH:

Tailscale includes SSH functionality that works seamlessly with standard SSH:

bash

# Standard SSH via Tailscale
ssh user@machine-name

# Tailscale-specific SSH command
tailscale ssh machine-name

Integration with sshsync:

Because Tailscale SSH uses the standard SSH protocol, sshsync needs no special support for it. Point the HostName entries in your SSH config at Tailscale hostnames:

code

Host homelab-1
  HostName homelab-1.tailnet.ts.net
  User admin

Tailscale Commands Used:

| Command | Purpose | Example |
|---|---|---|
| `tailscale status` | Show network status (lists all connected machines) | `tailscale status` |
| `tailscale ping` | Check connectivity | `tailscale ping machine-name` |
| `tailscale ssh` | SSH to machine | `tailscale ssh user@machine` |
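
For programmatic use, `tailscale status --json` prints the same information as a JSON document. The sketch below extracts the online peers from such a document. The field names (`Peer`, `HostName`, `Online`, `TailscaleIPs`) reflect my understanding of that output and should be treated as assumptions, since the JSON layout can change between Tailscale releases. The sample data is made up, and `online_peers` is a hypothetical helper, not a function from `tailscale_manager.py`.

```python
import json


def online_peers(status):
    """Return sorted hostnames of peers marked online in a status document."""
    peers = status.get("Peer") or {}
    return sorted(p["HostName"] for p in peers.values() if p.get("Online"))


# Trimmed, invented sample in the assumed shape of `tailscale status --json`.
sample = json.loads("""
{
  "Self": {"HostName": "workstation", "Online": true},
  "Peer": {
    "key1": {"HostName": "homelab-1", "TailscaleIPs": ["100.64.1.10"], "Online": true},
    "key2": {"HostName": "dev-laptop", "TailscaleIPs": ["100.64.1.40"], "Online": false}
  }
}
""")
print(online_peers(sample))
```

In real use the document would come from `subprocess.run(["tailscale", "status", "--json"], capture_output=True, text=True).stdout`.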

Workflows

1. Host Health Monitoring

User Query: "Which of my machines are online?"

Workflow:

•Load SSH config and sshsync groups
•Execute sshsync ls --with-status
•Parse connectivity results
•Query Tailscale status for additional context
•Return formatted health report with:
  •Online/offline status per host
  •Group memberships
  •Tailscale connection state
  •Last seen timestamp

Implementation: scripts/sshsync_wrapper.py → get_host_status()

Output Format:

code

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
🔴 dev-laptop (development) - Offline - Last seen: 2h ago
🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected

Summary: 3/4 hosts online (75%)

2. Intelligent Load Balancing

User Query: "Run this task on the least loaded machine"

Workflow:

•Get list of candidate hosts (from group or all)
•For each online host, check:
  •CPU load (via uptime or top)
  •Memory usage (via free or vm_stat)
  •Disk space (via df)
•Calculate composite load score
•Select host with lowest score
•Execute task on selected host
•Return result with performance metrics

Implementation: scripts/load_balancer.py → select_optimal_host()

Load Score Calculation:

code

score = (cpu_pct * 0.4 + mem_pct * 0.3 + disk_pct * 0.3) / 100

Inputs are percentages (0-100), so the score falls between 0 and 1. A lower score means a better candidate for task execution.
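
The calculation and the selection step can be sketched in a few lines. This example assumes the metrics are percentages and divides by 100 so that scores land in the 0-1 range used in the sample outputs. The names `load_score` and `pick_least_loaded` are hypothetical and do not describe the actual contents of `load_balancer.py`.

```python
def load_score(cpu_pct, mem_pct, disk_pct):
    """Weighted load in the 0-1 range; inputs are percentages (0-100)."""
    return (cpu_pct * 0.4 + mem_pct * 0.3 + disk_pct * 0.3) / 100


def pick_least_loaded(metrics_by_host):
    """Return (host, score) for the host with the lowest load score."""
    if not metrics_by_host:
        raise ValueError("no candidate hosts")
    scores = {host: load_score(**m) for host, m in metrics_by_host.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


metrics = {
    "web-01": {"cpu_pct": 45, "mem_pct": 60, "disk_pct": 40},
    "web-02": {"cpu_pct": 85, "mem_pct": 70, "disk_pct": 65},
    "web-03": {"cpu_pct": 20, "mem_pct": 35, "disk_pct": 30},
}
host, score = pick_least_loaded(metrics)
print(host, round(score, 3))
```

CPU carries the largest weight, so a host with a busy processor is penalised more than one with a comparably full disk.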

Output Format:

code

✓ Selected host: prod-web-02
  Reason: Lowest load score (0.27)
  - CPU: 15% (vs avg 45%)
  - Memory: 30% (vs avg 60%)
  - Disk: 40% (vs avg 55%)

Executing: npm run build
[Task output...]

✓ Completed in 2m 15s

3. File Synchronization Workflows

User Query: "Sync my code to all development machines"

Workflow:

•Validate source path exists locally
•Identify target group ("development")
•Check connectivity to all group members
•Show dry-run preview (files to be synced, sizes)
•Execute parallel push to all hosts
•Validate successful transfer on each host
•Return summary with per-host status
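
The dry-run preview step needs a file count and total size for the source tree before anything is sent. A minimal sketch of that local scan, assuming the preview only needs totals; `preview_push` is a hypothetical name, and symlinks are skipped because how they are transferred is not specified here.

```python
import os
import tempfile


def preview_push(local_path):
    """Return (file_count, total_bytes) for a directory tree without copying anything."""
    if not os.path.isdir(local_path):
        raise ValueError(f"Path does not exist or is not a directory: {local_path}")
    count = total = 0
    for root, _dirs, files in os.walk(local_path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # skip symlinks rather than guess how they would be sent
            count += 1
            total += os.path.getsize(full)
    return count, total


# Demo on a throwaway directory with two small files.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src"))
    with open(os.path.join(tmp, "src", "app.py"), "wb") as fh:
        fh.write(b"print('hi')\n")
    with open(os.path.join(tmp, "README"), "wb") as fh:
        fh.write(b"demo")
    demo_count, demo_bytes = preview_push(tmp)

print(demo_count, demo_bytes)
```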

Implementation: scripts/sshsync_wrapper.py → push_to_hosts(group=...)

Supported Operations:

•Push to all: Sync files to every configured host
•Push to group: Sync to specific group (dev, prod, etc.)
•Pull from host: Retrieve files from single host
•Pull from group: Collect files from multiple hosts
•Recursive sync: Entire directory trees with --recurse

Output Format:

code

📤 Syncing: ~/projects/myapp → /var/www/myapp
Group: development (3 hosts)

Preview (dry-run):
  - dev-laptop: 145 files, 12.3 MB
  - dev-desktop: 145 files, 12.3 MB
  - dev-server: 145 files, 12.3 MB

Execute? [Proceeding...]

✓ dev-laptop: Synced 145 files in 8s
✓ dev-desktop: Synced 145 files in 6s
✓ dev-server: Synced 145 files in 10s

Summary: 3/3 successful (435 files, 36.9 MB total)

4. Remote Command Orchestration

User Query: "Check disk space on all web servers"

Workflow:

•Identify target group ("web-servers")
•Validate group exists and has members
•Check connectivity to group members
•Execute command in parallel across group
•Collect and parse outputs
•Format results with per-host breakdown

Implementation: scripts/sshsync_wrapper.py → execute_on_group()

Features:

•Parallel execution: Commands run simultaneously on all hosts
•Timeout handling: Configurable per-command timeout (default 10s)
•Error isolation: Failure on one host doesn't stop others
•Output aggregation: Collect and correlate all outputs
•Dry-run mode: Preview what would execute without running

Output Format:

code

🔧 Executing on group 'web-servers': df -h /var/www

web-01:
  Filesystem: /dev/sda1
  Size: 100G, Used: 45G, Available: 50G (45% used)

web-02:
  Filesystem: /dev/sda1
  Size: 100G, Used: 67G, Available: 28G (67% used) ⚠️

web-03:
  Filesystem: /dev/sda1
  Size: 100G, Used: 52G, Available: 43G (52% used)

⚠️ Alert: web-02 is above 60% disk usage

5. Multi-Stage Deployment Workflow

User Query: "Deploy to staging, test, then production"

Workflow:

•Stage 1 - Staging Deploy:
  •Push code to staging group
  •Run build process
  •Execute automated tests
  •If tests fail: STOP and report error
•Stage 2 - Validation:
  •Check staging health endpoints
  •Validate database migrations
  •Run smoke tests
•Stage 3 - Production Deploy:
  •Push to production group (one at a time for zero-downtime)
  •Restart services gracefully
  •Verify each host before proceeding to next
•Stage 4 - Verification:
  •Check production health
  •Monitor for errors
  •Rollback if issues detected
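
The key property of this workflow is that a failed stage stops everything after it. A minimal sketch of that control flow, with each stage reduced to a callable that returns True on success; `run_stages` is a hypothetical name and does not describe how `deploy_workflow()` is actually written.

```python
def run_stages(stages):
    """Run (name, step) pairs in order and stop at the first step that fails."""
    completed = []
    for name, step in stages:
        try:
            ok = step()
            error = None if ok else "step returned False"
        except Exception as exc:
            ok, error = False, str(exc)
        if not ok:
            return {"success": False, "failed_stage": name,
                    "error": error, "completed": completed}
        completed.append(name)
    return {"success": True, "failed_stage": None,
            "error": None, "completed": completed}


outcome = run_stages([
    ("staging deploy", lambda: True),
    ("validation", lambda: False),        # simulated smoke-test failure
    ("production deploy", lambda: True),  # never reached
])
print(outcome)
```

Returning the list of completed stages gives a rollback step enough information to know what has to be undone.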

Implementation: scripts/workflow_executor.py → deploy_workflow()

Output Format:

code

🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
  ✓ Pushed code to staging-01
  ✓ Build completed (2m 15s)
  ✓ Tests passed (145/145)

Stage 2: Validation
  ✓ Health check passed
  ✓ Database migration OK
  ✓ Smoke tests passed (12/12)

Stage 3: Production Deployment
  ✓ prod-web-01: Deployed & verified
  ✓ prod-web-02: Deployed & verified
  ✓ prod-web-03: Deployed & verified

Stage 4: Verification
  ✓ All health checks passed
  ✓ No errors in logs (5min window)

✅ Deployment completed successfully in 12m 45s

Available Scripts

scripts/sshsync_wrapper.py

Purpose: Python wrapper around sshsync CLI for programmatic access

Functions:

•get_host_status(group=None): Get online/offline status of hosts
•execute_on_all(command, timeout=10, dry_run=False): Run command on all hosts
•execute_on_group(group, command, timeout=10, dry_run=False): Run on specific group
•execute_on_host(host, command, timeout=10): Run on single host
•push_to_hosts(local_path, remote_path, hosts=None, group=None, recurse=False, dry_run=False): Push files
•pull_from_host(host, remote_path, local_path, recurse=False, dry_run=False): Pull files
•list_hosts(with_status=True): List all configured hosts
•get_groups(): Get all defined groups and their members
•add_hosts_to_group(group, hosts): Add hosts to a group

Usage Example:

python

from sshsync_wrapper import execute_on_group, push_to_hosts

# Execute command
result = execute_on_group(
    group="web-servers",
    command="systemctl status nginx",
    timeout=15
)

# Push files
push_to_hosts(
    local_path="./dist",
    remote_path="/var/www/app",
    group="production",
    recurse=True
)

scripts/tailscale_manager.py

Purpose: Tailscale-specific operations and status management

Functions:

•get_tailscale_status(): Get Tailscale network status (all peers)
•check_connectivity(host): Ping host via Tailscale
•get_peer_info(hostname): Get detailed info about peer
•list_online_machines(): List all online Tailscale machines
•get_machine_ip(hostname): Get Tailscale IP for machine
•validate_tailscale_ssh(host): Check if Tailscale SSH is working

Usage Example:

python

from tailscale_manager import get_tailscale_status, check_connectivity

# Get network status
status = get_tailscale_status()
print(f"Online machines: {status['online_count']}")

# Check specific host
is_online = check_connectivity("homelab-1")

scripts/load_balancer.py

Purpose: Intelligent task distribution based on machine resources

Functions:

•get_machine_load(host): Get CPU, memory, disk metrics
•calculate_load_score(metrics): Calculate composite load score
•select_optimal_host(candidates, prefer_group=None): Pick best host
•get_group_capacity(): Get aggregate capacity of group
•distribute_tasks(tasks, hosts): Distribute multiple tasks optimally

Usage Example:

python

from load_balancer import select_optimal_host
from sshsync_wrapper import execute_on_host

# Find best machine for task
best_host = select_optimal_host(
    candidates=["web-01", "web-02", "web-03"],
    prefer_group="production"
)

# Execute on selected host
execute_on_host(best_host, "npm run build")

scripts/workflow_executor.py

Purpose: Common multi-machine workflow automation

Functions:

•deploy_workflow(code_path, staging_group, prod_group): Full deployment pipeline
•backup_workflow(hosts, backup_paths, destination): Backup from multiple hosts
•sync_workflow(source_host, target_group, paths): Sync from one to many
•rolling_restart(group, service_name): Zero-downtime service restart
•health_check_workflow(group, endpoint): Check health across group

Usage Example:

python

from workflow_executor import deploy_workflow, backup_workflow

# Deploy with testing
deploy_workflow(
    code_path="./dist",
    staging_group="staging",
    prod_group="production"
)

# Backup from all databases
backup_workflow(
    hosts=["db-01", "db-02"],
    backup_paths=["/var/lib/mysql"],
    destination="./backups"
)

scripts/utils/helpers.py

Purpose: Common utilities and formatting functions

Functions:

•format_bytes(bytes): Human-readable byte formatting (1.2 GB)
•format_duration(seconds): Human-readable duration (2m 15s)
•parse_ssh_config(): Parse ~/.ssh/config for host details
•parse_sshsync_config(): Parse sshsync group configuration
•get_timestamp(): Get ISO timestamp for logging
•safe_execute(func, *args, **kwargs): Execute with error handling
•validate_path(path): Check if path exists and is accessible

scripts/utils/validators/parameter_validator.py

Purpose: Validate user inputs and parameters

Functions:

•validate_host(host, valid_hosts=None): Validate host exists
•validate_group(group, valid_groups=None): Validate group exists
•validate_path_exists(path): Check local path exists
•validate_timeout(timeout): Ensure timeout is reasonable
•validate_command(command): Basic command safety validation

scripts/utils/validators/host_validator.py

Purpose: Validate host configuration and availability

Functions:

•validate_ssh_config(host): Check host has SSH config entry
•validate_host_reachable(host, timeout=5): Check host is reachable
•validate_group_members(group): Ensure group has valid members
•get_invalid_hosts(hosts): Find hosts without valid config

scripts/utils/validators/connection_validator.py

Purpose: Validate SSH and Tailscale connections

Functions:

•validate_ssh_connection(host): Test SSH connection works
•validate_tailscale_connection(host): Test Tailscale connectivity
•validate_ssh_key(host): Check SSH key authentication
•get_connection_diagnostics(host): Comprehensive connection testing

Available Analyses

1. Host Availability Analysis

Function: analyze_host_availability(group=None)

Objective: Determine which machines are online and accessible

Inputs:

•group (optional): Specific group to check (None = all hosts)

Outputs:

python

{
    'total_hosts': 10,
    'online_hosts': 8,
    'offline_hosts': 2,
    'availability_pct': 80.0,
    'by_group': {
        'production': {'online': 3, 'total': 3, 'pct': 100.0},
        'development': {'online': 2, 'total': 3, 'pct': 66.7},
        'homelab': {'online': 3, 'total': 4, 'pct': 75.0}
    },
    'offline_hosts_details': [
        {'host': 'dev-laptop', 'last_seen': '2h ago', 'groups': ['development']},
        {'host': 'homelab-4', 'last_seen': '1d ago', 'groups': ['homelab']}
    ]
}

Interpretation:

•> 90%: Excellent availability
•70-90%: Good availability, monitor offline hosts
•< 70%: Poor availability, investigate issues
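
The percentage and its interpretation follow directly from a map of host to online state. A small sketch under that assumption; the function name `availability` and the `rating` key are illustrative additions and are not part of the documented output.

```python
def availability(statuses):
    """Summarise {host: is_online} into counts, a percentage, and a rating."""
    total = len(statuses)
    online = sum(1 for up in statuses.values() if up)
    pct = round(100.0 * online / total, 1) if total else 0.0
    if pct > 90:
        rating = "excellent"
    elif pct >= 70:
        rating = "good"
    else:
        rating = "poor"
    return {
        "total_hosts": total,
        "online_hosts": online,
        "offline_hosts": total - online,
        "availability_pct": pct,
        "rating": rating,
    }


statuses = {f"host-{i}": i < 8 for i in range(10)}  # 8 of 10 online
summary = availability(statuses)
print(summary)
```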

2. Load Distribution Analysis

Function: analyze_load_distribution(group=None)

Objective: Understand resource usage across machines

Inputs:

•group (optional): Specific group to analyze

Outputs:

python

{
    'hosts': [
        {
            'host': 'web-01',
            'cpu_pct': 45,
            'mem_pct': 60,
            'disk_pct': 40,
            'load_score': 0.48,
            'status': 'moderate'
        },
        # ... more hosts
    ],
    'aggregate': {
        'avg_cpu': 35,
        'avg_mem': 55,
        'avg_disk': 45,
        'total_capacity': 1200  # GB
    },
    'recommendations': [
        {
            'host': 'web-02',
            'issue': 'High CPU usage (85%)',
            'action': 'Consider migrating workloads'
        }
    ]
}

Load Status:

•Low (score < 0.4): Good capacity for more work
•Moderate (0.4-0.7): Normal operation
•High (> 0.7): May need to offload work

3. File Sync Status Analysis

Function: analyze_sync_status(local_path, remote_path, group)

Objective: Compare local files with remote versions

Inputs:

•local_path: Local directory to compare
•remote_path: Remote directory path
•group: Group to check

Outputs:

python

{
    'local_files': 145,
    'local_size': 12582912,  # bytes
    'hosts': [
        {
            'host': 'web-01',
            'status': 'in_sync',
            'files_match': 145,
            'files_different': 0,
            'missing_files': 0
        },
        {
            'host': 'web-02',
            'status': 'out_of_sync',
            'files_match': 140,
            'files_different': 3,
            'missing_files': 2,
            'details': ['config.json modified', 'index.html modified', ...]
        }
    ],
    'sync_percentage': 96.7,
    'recommended_action': 'Push to web-02'
}
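
One way to produce the per-host counts is to compare file digests. The sketch below assumes each side has already been reduced to a map of relative path to digest (for example with `hashlib.sha256` locally and a checksum command on the remote host). That approach and the name `compare_manifests` are assumptions, not a description of the actual implementation.

```python
def compare_manifests(local, remote):
    """Compare {relative_path: digest} maps from the local tree and one host."""
    match = sum(1 for path, digest in local.items() if remote.get(path) == digest)
    missing = sum(1 for path in local if path not in remote)
    different = len(local) - match - missing
    return {
        "status": "in_sync" if match == len(local) else "out_of_sync",
        "files_match": match,
        "files_different": different,
        "missing_files": missing,
    }


local = {"index.html": "a1", "config.json": "b2", "app.js": "c3"}
remote = {"index.html": "a1", "config.json": "zz"}
result = compare_manifests(local, remote)
print(result)
```

Files that exist only on the remote host are ignored here, which matches a push-oriented check.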

4. Network Latency Analysis

Function: analyze_network_latency(hosts=None)

Objective: Measure connection latency to hosts

Inputs:

•hosts (optional): Specific hosts to test (None = all)

Outputs:

python

{
    'hosts': [
        {'host': 'web-01', 'latency_ms': 15, 'status': 'excellent'},
        {'host': 'web-02', 'latency_ms': 45, 'status': 'excellent'},
        {'host': 'db-01', 'latency_ms': 150, 'status': 'fair'}
    ],
    'avg_latency': 70,
    'min_latency': 15,
    'max_latency': 150,
    'recommendations': [
        {'host': 'db-01', 'issue': 'High latency', 'action': 'Check network path'}
    ]
}

Latency Classification:

•Excellent (< 50ms): Ideal for interactive tasks
•Good (50-100ms): Suitable for most operations
•Fair (100-200ms): May impact interactive workflows
•Poor (> 200ms): Investigate network issues
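
The thresholds above translate into a short classifier. A sketch, assuming boundary values (exactly 100 ms or 200 ms) fall into the better class; `classify_latency` and `summarise_latency` are hypothetical names.

```python
def classify_latency(latency_ms):
    """Map a round-trip time in milliseconds to the classes listed above."""
    if latency_ms < 50:
        return "excellent"
    if latency_ms <= 100:
        return "good"
    if latency_ms <= 200:
        return "fair"
    return "poor"


def summarise_latency(latencies):
    """Build a report from a non-empty {host: latency_ms} mapping."""
    values = list(latencies.values())
    return {
        "hosts": [
            {"host": h, "latency_ms": ms, "status": classify_latency(ms)}
            for h, ms in latencies.items()
        ],
        "avg_latency": round(sum(values) / len(values)),
        "min_latency": min(values),
        "max_latency": max(values),
    }


latency_report = summarise_latency({"web-01": 15, "web-02": 45, "db-01": 150})
print(latency_report)
```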

5. Comprehensive Infrastructure Report

Function: comprehensive_infrastructure_report(group=None)

Objective: One-stop function for complete infrastructure overview

Inputs:

•group (optional): Limit to specific group (None = all)

Outputs:

python

{
    'report_timestamp': '2025-10-19T19:43:41Z',
    'group': 'production',  # or 'all'
    'metrics': {
        'availability': {...},  # from analyze_host_availability
        'load_distribution': {...},  # from analyze_load_distribution
        'network_latency': {...},  # from analyze_network_latency
        'tailscale_status': {...}  # from Tailscale integration
    },
    'summary': "Production infrastructure: 3/3 hosts online, avg load 45%, network latency 35ms",
    'alerts': [
        "⚠ web-02: High CPU usage (85%)",
        "⚠ db-01: Elevated latency (150ms)"
    ],
    'recommendations': [
        "Consider rebalancing workload from web-02",
        "Investigate network path to db-01"
    ],
    'overall_health': 'good'  # excellent | good | fair | poor
}

Overall Health Classification:

•Excellent: All metrics green, no alerts
•Good: Most metrics healthy, minor alerts
•Fair: Some concerning metrics, action recommended
•Poor: Critical issues, immediate action required

Error Handling

Connection Errors

Error: Cannot connect to host

Causes:

•Host is offline
•Tailscale not connected
•SSH key missing/invalid
•Firewall blocking connection

Handling:

python

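# Import paths assume the scripts/ directory is on sys.path
from sshsync_wrapper import execute_on_host
from tailscale_manager import check_connectivity
from utils.validators.connection_validator import (
    get_connection_diagnostics,
    validate_ssh_connection,
)


def run_with_diagnostics(host, command):
    """Run a command; on a connection failure, report the likely cause."""
    try:
        return execute_on_host(host, command)
    except ConnectionError:
        # Try Tailscale ping first
        if not check_connectivity(host):
            return {
                'error': 'Host unreachable',
                'suggestion': 'Check Tailscale connection',
                'diagnostics': get_connection_diagnostics(host)
            }
        # Then check SSH
        if not validate_ssh_connection(host):
            return {
                'error': 'SSH authentication failed',
                'suggestion': 'Check SSH keys: ssh-add -l'
            }
        # Both checks passed, so the cause is unknown: re-raise
        raise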

Timeout Errors

Error: Operation timed out

Causes:

•Command taking too long
•Network latency
•Host overloaded

Handling:

•Automatic retry with exponential backoff (3 attempts)
•Increase timeout for known slow operations
•Fall back to alternative host if available
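
Retry with exponential backoff can be sketched as follows. The example assumes a timed-out operation raises the built-in `TimeoutError` and that the delay doubles from a one-second base. Both are illustrative choices, and `retry_with_backoff` is a hypothetical name. The `sleep` parameter exists so the demo can record delays instead of waiting.

```python
import time


def retry_with_backoff(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on TimeoutError wait base_delay * 2**n, then try again."""
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last timeout
            sleep(base_delay * 2 ** attempt)


calls = []
delays = []


def flaky():
    # Fails twice, then succeeds on the third call.
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("Operation timed out")
    return "ok"


value = retry_with_backoff(flaky, sleep=delays.append)
print(value, delays)
```

Only timeouts are retried; other exceptions propagate immediately so that authentication or validation errors are not repeated pointlessly.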

File Transfer Errors

Error: File sync failed

Causes:

•Insufficient disk space
•Permission denied
•Path doesn't exist

Handling:

•Pre-check disk space on target
•Validate permissions before transfer
•Create directories if needed
•Partial transfer recovery

Validation Errors

Error: Invalid parameter

Examples:

•Unknown host
•Non-existent group
•Invalid path

Handling:

•Validate all inputs before execution
•Provide suggestions for similar valid options
•Clear error messages with corrective actions
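
Suggestions for similar valid options can come from the standard library's `difflib`. A minimal sketch; `suggest` and `validate_name` are hypothetical helpers, separate from the validators listed earlier, and the 0.6 similarity cutoff is simply the library default.

```python
import difflib


def suggest(name, valid_names, limit=3):
    """Return up to `limit` valid names that closely resemble `name`."""
    return difflib.get_close_matches(name, valid_names, n=limit, cutoff=0.6)


def validate_name(name, valid_names, kind="host"):
    """Return name if it is valid, else raise ValueError with suggestions."""
    if name in valid_names:
        return name
    hints = suggest(name, valid_names)
    message = f"Unknown {kind}: {name!r}."
    if hints:
        message += " Did you mean: " + ", ".join(hints) + "?"
    raise ValueError(message)


known_hosts = ["prod-web-01", "prod-db-01", "homelab-1"]
try:
    validate_name("prod-web-1", known_hosts)
except ValueError as exc:
    error_text = str(exc)
    print(error_text)
```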

Mandatory Validations

Before Any Operation

•Parameter Validation:

python

host = validate_host(host, valid_hosts=get_all_hosts())
group = validate_group(group, valid_groups=get_groups())
timeout = validate_timeout(timeout)

•Connection Validation:

python

if not validate_host_reachable(host, timeout=5):
    raise ConnectionError(f"Host {host} is not reachable")

•Path Validation (for file operations):

python

if not validate_path_exists(local_path):
    raise ValueError(f"Path does not exist: {local_path}")

During Operation

•Timeout Monitoring: Every operation has configurable timeout
•Progress Tracking: Long operations show progress
•Error Isolation: Failure on one host doesn't stop others

After Operation

•Result Validation:

python

report = validate_operation_result(result)
if report.has_critical_issues():
    raise OperationError(report.get_summary())

•State Verification: Confirm operation succeeded
•Logging: Record all operations for audit trail

Performance and Caching

Caching Strategy

Host Status Cache:

•TTL: 60 seconds
•Why: Host status doesn't change rapidly
•Invalidation: Manual, when connectivity changes

Load Metrics Cache:

•TTL: 30 seconds
•Why: Load changes frequently
•Invalidation: Automatic when the TTL expires

Group Configuration Cache:

•TTL: 5 minutes
•Why: Group membership rarely changes
•Invalidation: Manual when groups modified
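
All three caches share the same shape: a value, the time it was stored, and a TTL. The sketch below is one possible implementation, not the agent's actual cache. `TTLCache` and its methods are hypothetical, and the injectable clock is there only to make the example deterministic.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds, with manual invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader() if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = loader()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key=None):
        """Drop one key, or everything when no key is given."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)


fake_now = [0.0]
loads = []


def load_status():
    loads.append(1)
    return {"homelab-1": "online"}


cache = TTLCache(60, clock=lambda: fake_now[0])
cache.get("status", load_status)  # first call loads
fake_now[0] = 30.0
cache.get("status", load_status)  # within the 60 s TTL: served from cache
fake_now[0] = 61.0
cache.get("status", load_status)  # expired: loads again
print(len(loads))
```

Separate instances with TTLs of 60, 30, and 300 seconds would cover host status, load metrics, and group configuration respectively.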

Performance Optimizations

•Parallel Execution:
  •Commands execute concurrently across hosts
  •ThreadPoolExecutor with max 10 workers
  •Prevents sequential bottleneck
•Connection Pooling:
  •Reuse SSH connections when possible
  •ControlMaster in SSH config
•Lazy Loading:
  •Only fetch data when needed
  •Don't load all host status unless required
•Progressive Results:
  •Stream results as they complete
  •Don't wait for slowest host

Automatic Detection Keywords

This skill automatically activates when you mention:

Hosts & Machines:

•remote machine, remote host, remote server
•Tailscale machine, Tailscale host
•SSH host, SSH server
•homelab, servers, cluster

Tools:

•sshsync, ssh sync
•Tailscale, tailnet
•SSH, secure shell

Operations:

•distribute workload, load balance
•file sharing, file sync, file transfer
•push files, pull files, sync files
•remote execution, execute remotely
•deploy to, deploy across
•backup from, sync to

Groups:

•host group, machine group, server group
•production servers, development machines
•web servers, database servers
•staging environment, production environment

Actions:

•check status, check connectivity
•run on all, execute on group
•sync across, distribute to
•deploy to, push to, pull from

Monitoring:

•which machines are online
•host availability, server status
•machine health, connectivity check
•network status, latency check

Workflows:

•multi-stage deployment
•rolling restart, zero-downtime
•backup workflow, sync workflow
•distributed task, parallel execution

Usage Examples

Example 1: Check Host Status

User: "Which of my Tailscale machines are online?"

Agent Response:

code

Checking Tailscale network status...

🟢 homelab-1 (homelab) - Online - Tailscale: Connected
  IP: 100.64.1.10 | Latency: 15ms

🟢 homelab-2 (homelab) - Online - Tailscale: Connected
  IP: 100.64.1.11 | Latency: 18ms

🔴 dev-laptop (development) - Offline
  Last seen: 2 hours ago | Tailscale: Not connected

🟢 prod-web-01 (production, web-servers) - Online - Tailscale: Connected
  IP: 100.64.1.20 | Latency: 25ms

🟢 prod-db-01 (production, databases) - Online - Tailscale: Connected
  IP: 100.64.1.30 | Latency: 30ms

Summary: 4/5 hosts online (80%)
Average latency: 22ms

Example 2: Execute on Group

User: "Check disk space on all web servers"

Agent Response:

code

Executing on group 'web-servers': df -h /var/www

🔧 web-01:
  Filesystem: /dev/sda1
  Size: 100G | Used: 45G | Available: 50G (45% used)
  Status: ✓ Healthy

🔧 web-02:
  Filesystem: /dev/sda1
  Size: 100G | Used: 67G | Available: 28G (67% used)
  Status: ⚠ Warning - Above 60%

🔧 web-03:
  Filesystem: /dev/sda1
  Size: 100G | Used: 52G | Available: 43G (52% used)
  Status: ✓ Healthy

⚠️ Alert: web-02 is above 60% disk usage
Recommendation: Clear old logs or increase disk space

Example 3: Load-Balanced Execution

User: "Run this npm build on the machine with most available resources"

Agent Response:

code

Analyzing resource availability across 3 machines...

Load Analysis:
  web-01: CPU 45%, Mem 60%, Disk 40% → Score: 0.48 (moderate)
  web-02: CPU 85%, Mem 70%, Disk 65% → Score: 0.75 (high load)
  web-03: CPU 20%, Mem 35%, Disk 30% → Score: 0.28 (low load) ✓

✓ Selected: web-03
  Reason: Lowest load score (0.28)
  Available: CPU 80%, Memory 65%, Disk 70%

Executing on web-03: npm run build

[Build output...]
  ✓ Build completed successfully
  Duration: 2m 15s
  Output: ./dist (145 files, 12.3 MB)

Final load: CPU 65%, Mem 50% (still below average)

Example 4: File Sync

User: "Push my app directory to all production servers"

Agent Response:

code

📤 Syncing: ~/projects/myapp → /var/www/myapp
Target group: production (3 hosts)

Analyzing source...
  145 files, 12.3 MB total

Preview (dry-run):
  prod-web-01: 145 files → /var/www/myapp
  prod-web-02: 145 files → /var/www/myapp
  prod-db-01: Skipped (not in web-servers group)

Checking target capacity...
  ✓ prod-web-01: 50 GB available
  ✓ prod-web-02: 28 GB available

Executing push...

✓ prod-web-01: Synced 145 files in 8.2s
  Modified: 12 files
  Added: 3 files

✓ prod-web-02: Synced 145 files in 6.5s
  Modified: 12 files
  Added: 3 files

Summary: 2/2 successful (290 files, 24.6 MB total)

Example 5: Multi-Stage Deployment

User: "Deploy this to staging, run tests, then push to production"

Agent Response:

code

🚀 Multi-Stage Deployment Workflow

Stage 1: Staging Deployment
  📤 Pushing to staging-01...
    ✓ Code synced (145 files, 12.3 MB) - 8s

  🔨 Running build...
    $ npm run build
    ✓ Build completed - 2m 15s

  🧪 Running tests...
    $ npm test
    ✓ All tests passed (145/145) - 45s

Stage 2: Staging Validation
  ✓ Health endpoint: 200 OK
  ✓ Database migrations: Up to date
  ✓ Smoke tests: 12/12 passed

Stage 3: Production Deployment
  Rolling deployment to 3 hosts...

  ⏳ prod-web-01:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 10s

  ⏳ prod-web-02:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 5s

  ⏳ prod-web-03:
    ✓ Code synced
    ✓ Build completed
    ✓ Service restarted gracefully
    ✓ Health check passed
    Duration: 3m 12s

Stage 4: Production Verification
  ✓ All health checks passed (3/3)
  ✓ No errors in logs (5min window)
  ✓ Response times normal (avg 45ms)

✅ Deployment completed successfully!
  Total duration: 12m 45s
  Hosts updated: 3
  Zero downtime: ✓

Example 6: Comprehensive Infrastructure Report

User: "Give me a complete status report of my infrastructure"

Agent Response:

code

📊 Infrastructure Report
Generated: 2025-10-19 19:43:41

=== AVAILABILITY ===
Total Hosts: 10
Online: 8 (80%)
Offline: 2

By Group:
  production: 3/3 (100%) ✓
  development: 2/3 (66.7%) ⚠
  homelab: 3/4 (75%)

Offline Hosts:
  • dev-laptop (development) - Last seen: 2h ago
  • homelab-4 (homelab) - Last seen: 1d ago

=== LOAD DISTRIBUTION ===
Average Load: 45% CPU, 55% Memory, 45% Disk

Top 3 Loaded Hosts:
  1. web-02: 85% CPU, 70% Mem, 65% Disk (Score: 0.75) ⚠
  2. db-01: 60% CPU, 75% Mem, 55% Disk (Score: 0.63)
  3. web-01: 45% CPU, 60% Mem, 40% Disk (Score: 0.48)

Top 3 Available Hosts:
  1. web-03: 20% CPU, 35% Mem, 30% Disk (Score: 0.28) ✓
  2. homelab-1: 25% CPU, 40% Mem, 35% Disk (Score: 0.33)
  3. homelab-2: 30% CPU, 45% Mem, 40% Disk (Score: 0.38)

=== NETWORK LATENCY ===
Average: 35ms
Range: 15ms - 150ms

Excellent (< 50ms): 6 hosts
Good (50-100ms): 1 host
Fair (100-200ms): 1 host (db-01: 150ms) ⚠

=== TAILSCALE STATUS ===
Network: Connected
Peers Online: 8/10
Exit Node: None
MagicDNS: Enabled

=== ALERTS ===
⚠ web-02: High CPU usage (85%) - Consider load balancing
⚠ db-01: Elevated latency (150ms) - Check network path
⚠ dev-laptop: Offline for 2 hours - May need attention

=== RECOMMENDATIONS ===
1. Rebalance workload from web-02 to web-03
2. Investigate network latency to db-01
3. Check status of dev-laptop and homelab-4
4. Consider scheduling maintenance for web-02

Overall Health: GOOD ✓

Installation

See INSTALLATION.md for detailed setup instructions.

Quick start:

bash

# 1. Install sshsync
pip install sshsync

# 2. Configure SSH hosts
vim ~/.ssh/config

# 3. Sync host groups
sshsync sync

# 4. Install the agent (a slash command entered in the assistant session, not the shell)
/plugin marketplace add ./tailscale-sshsync-agent

# 5. Test by asking the agent:
# "Which of my machines are online?"

Version

Current version: 1.0.0

See CHANGELOG.md for release history.

Architecture Decisions

See DECISIONS.md for detailed rationale behind tool selection, architecture choices, and trade-offs considered.