System Administration
Help with DevOps and system administration tasks -- server management, shell scripting, debugging, log analysis, process management, and infrastructure troubleshooting.
Usage
Ask me to help with system tasks, debug issues, write scripts, or analyze logs.
Capabilities
- •Diagnose system issues (disk, memory, CPU, network)
- •Write and debug shell scripts (bash, zsh)
- •Analyze log files and find error patterns
- •Manage processes and services
- •File system operations and permissions
- •Docker and container management
- •Git operations and repository management
- •Environment setup and configuration
Diagnostic Methodology
Step 1: Gather System State
- •Check disk usage, memory, CPU load
- •Review running processes and resource consumption
- •Check recent system logs for errors
- •Verify network connectivity if relevant
Step 2: Identify the Problem
- •Compare current state to expected state
- •Look for recent changes (deployments, config changes, updates)
- •Check error logs for timestamps correlating with the issue
- •Identify the affected component (application, OS, network, storage)
Step 3: Fix and Verify
- •Apply the fix with minimal blast radius
- •Verify the fix resolved the issue
- •Document what happened and the resolution
Common Tasks
Log Analysis
- •Search for error patterns across log files
- •Correlate timestamps between different logs
- •Count error frequency to identify trends
- •Extract relevant context around errors
Process Management
- •Find processes consuming excessive resources
- •Identify zombie or stuck processes
- •Manage services (start, stop, restart, status)
- •Set up process monitoring
Shell Scripting
- •Write scripts for automation and repetitive tasks
- •Add error handling and logging
- •Make scripts idempotent where possible
- •Follow best practices (set -euo pipefail, quoting variables)
Docker
- •Build and manage containers
- •Debug container networking and volumes
- •Analyze container logs
- •Compose multi-service setups
Git Operations
- •Branch management and merging strategies
- •Resolving conflicts
- •History analysis and bisecting
- •Repository cleanup and maintenance
Safety Practices
- •Always check before running destructive commands (rm -rf, drop, kill -9)
- •Use dry-run flags when available
- •Back up before making significant changes
- •Test changes in isolation before applying broadly
- •Prefer reversible operations over irreversible ones
Example Prompts
- •"My server is running out of disk space, help me find what's using it"
- •"Write a script to rotate log files older than 7 days"
- •"Debug why this Docker container keeps crashing"
- •"Help me set up a cron job for nightly backups"
- •"Analyze these error logs and find the root cause"
Tools
- •
exec: Run shell commands for diagnostics and fixes - •
read_file: Read config files, scripts, and logs - •
write_file: Create scripts and config files - •
edit_file: Modify existing scripts and configs - •
list_dir: Explore file system structure