AgentSkillsCN

Sysadmin

系统管理员

SKILL.md

System Administration

Help with DevOps and system administration tasks -- server management, shell scripting, debugging, log analysis, process management, and infrastructure troubleshooting.

Usage

Ask me to help with system tasks, debug issues, write scripts, or analyze logs.

Capabilities

  • Diagnose system issues (disk, memory, CPU, network)
  • Write and debug shell scripts (bash, zsh)
  • Analyze log files and find error patterns
  • Manage processes and services
  • File system operations and permissions
  • Docker and container management
  • Git operations and repository management
  • Environment setup and configuration

Diagnostic Methodology

Step 1: Gather System State

  • Check disk usage, memory, CPU load
  • Review running processes and resource consumption
  • Check recent system logs for errors
  • Verify network connectivity if relevant

Step 2: Identify the Problem

  • Compare current state to expected state
  • Look for recent changes (deployments, config changes, updates)
  • Check error logs for timestamps correlating with the issue
  • Identify the affected component (application, OS, network, storage)

Step 3: Fix and Verify

  • Apply the fix with minimal blast radius
  • Verify the fix resolved the issue
  • Document what happened and the resolution

Common Tasks

Log Analysis

  • Search for error patterns across log files
  • Correlate timestamps between different logs
  • Count error frequency to identify trends
  • Extract relevant context around errors

Process Management

  • Find processes consuming excessive resources
  • Identify zombie or stuck processes
  • Manage services (start, stop, restart, status)
  • Set up process monitoring

Shell Scripting

  • Write scripts for automation and repetitive tasks
  • Add error handling and logging
  • Make scripts idempotent where possible
  • Follow best practices (set -euo pipefail, quoting variables)

Docker

  • Build and manage containers
  • Debug container networking and volumes
  • Analyze container logs
  • Compose multi-service setups

Git Operations

  • Branch management and merging strategies
  • Resolving conflicts
  • History analysis and bisecting
  • Repository cleanup and maintenance

Safety Practices

  • Always check before running destructive commands (rm -rf, drop, kill -9)
  • Use dry-run flags when available
  • Back up before making significant changes
  • Test changes in isolation before applying broadly
  • Prefer reversible operations over irreversible ones

Example Prompts

  • "My server is running out of disk space, help me find what's using it"
  • "Write a script to rotate log files older than 7 days"
  • "Debug why this Docker container keeps crashing"
  • "Help me set up a cron job for nightly backups"
  • "Analyze these error logs and find the root cause"

Tools

  • exec: Run shell commands for diagnostics and fixes
  • read_file: Read config files, scripts, and logs
  • write_file: Create scripts and config files
  • edit_file: Modify existing scripts and configs
  • list_dir: Explore file system structure