
beam-remote

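Use this skill when you need to connect to a running Elixir/BEAM node through AWS ECS exec for live inspection and debugging. It triggers on requests such as "connect to the node", "check running processes", "inspect memory", "remote console", "ECS exec", or "check the node", and on any other live BEAM debugging request.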

SKILL.md
---
name: beam-remote
description: Use when needing to connect to a running Elixir/BEAM node via AWS ECS exec for live inspection and debugging. Triggers on "connect to the node", "check running processes", "inspect memory", "remote console", "ECS exec", "check the node", or any live BEAM debugging request.
---

Live BEAM Node Introspection

Connect to running Elixir/BEAM nodes via AWS ECS exec for read-only inspection and debugging.

SAFETY RULES (READ FIRST)

Default mode: READ-ONLY. All inspection commands are safe. Mutations require explicit user confirmation.

| Category | Examples | Permission |
| --- | --- | --- |
| SAFE (read-only) | `:erlang.memory()`, `:recon.proc_count/2`, `Process.info/2`, `:sys.get_state/1`, Ecto selects, `:ets.info/1`, `:ets.tab2list/1` | Auto-allowed |
| CAUTION (side effects) | `:erlang.garbage_collect/1`, `:sys.trace/2`, `:erlang.system_flag/2` | Warn user, proceed if acknowledged |
| DANGEROUS (mutations) | `Repo.insert/update/delete`, `GenServer.cast/2`, `:init.stop/0`, `Process.exit/2` | BLOCK: require explicit user confirmation before running |

See references/safety-guardrails.md for full classification and confirmation gate templates.

Arguments

/beam-remote <env>

| Argument | Values | Notes |
| --- | --- | --- |
| `<env>` | Environment name (e.g., qa, uat, staging) | Adapt to your AWS profile names |

Connection Workflow

1. Authenticate to AWS

bash
# Check if already authenticated
aws sts get-caller-identity --profile <env>

# If expired:
aws sso login --profile <env>
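
The two commands above can be folded into one check-then-login step. The following is a minimal sketch, assuming your profiles use AWS SSO as shown; the function name `ensure_aws_session` is hypothetical and not part of the skill.

```shell
# Hypothetical helper: log in through SSO only when the current session is not usable.
ensure_aws_session() {
  profile=$1
  if aws sts get-caller-identity --profile "$profile" >/dev/null 2>&1; then
    echo "session for $profile is valid"
  else
    echo "session for $profile is missing or expired; starting SSO login" >&2
    aws sso login --profile "$profile"
  fi
}

# Usage (replace qa with your own profile name):
# ensure_aws_session qa
```

The function returns the exit status of whichever command ran last, so a failed login surfaces as a non-zero status to the caller.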

2. Find Running Tasks

bash
aws ecs list-tasks \
  --cluster <CLUSTER_NAME> \
  --service-name <SERVICE_NAME> \
  --profile <env>
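
`list-tasks` returns full task ARNs, while the next step uses a `<TASK_ID>`. As a rough sketch, the ID can be taken as the last path segment of the ARN; the helper name `task_id_from_arn` and the sample ARN below are made up for illustration.

```shell
# Hypothetical helper: keep only the text after the last "/" of a task ARN.
task_id_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Made-up ARN for illustration; region, account, cluster, and ID are placeholders.
task_id_from_arn "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef0123456789abcdef"

# Against a live service (placeholders as in the command above):
# TASK_ID=$(task_id_from_arn "$(aws ecs list-tasks --cluster <CLUSTER_NAME> \
#   --service-name <SERVICE_NAME> --profile <env> \
#   --query 'taskArns[0]' --output text)")
```

The live variant picks the first task in the list; if you need a specific task, choose its ARN from the full output instead.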

3. Connect to Container

bash
aws ecs execute-command \
  --cluster <CLUSTER_NAME> \
  --task <TASK_ID> \
  --container <CONTAINER_NAME> \
  --interactive \
  --command "/bin/bash" \
  --profile <env>

4. Attach to Elixir Node

bash
/app/bin/<app_name> remote

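To exit, press Ctrl+C twice. This closes only your remote shell session and leaves the node running. Never call :init.stop() from the remote console: it shuts down the node you are attached to.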

5. Verify BEAM Configuration

bash
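# Run in the container shell from step 3, before attaching in step 4
ps aux | grep '[b]eam'   # the [b] keeps grep from matching its own process
# Look for: -S 4:4 (schedulers), -SDcpu 4, -SDio 10, -sname
# (set as +S, +SDcpu, +SDio in vm.args; the emulator command line shows them with a leading -)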
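
If scanning the `ps` output by eye is error-prone, the flag values can be pulled out with a small helper. This is only a sketch: the name `flag_value` is hypothetical, the sample command line is invented, and it assumes each flag is followed by its value as a separate word.

```shell
# Hypothetical helper: print the word that follows a given flag in a command line.
flag_value() {
  flag=$1
  shift
  while [ $# -gt 1 ]; do
    if [ "$1" = "$flag" ]; then
      printf '%s\n' "$2"
      return 0
    fi
    shift
  done
  return 1
}

# Invented command line in the general shape ps prints for the BEAM emulator.
cmdline="/app/erts/bin/beam.smp -S 4:4 -SDcpu 4 -SDio 10 -- -root /app -sname myapp"

# Word splitting is intentional here: each flag and value becomes its own argument.
flag_value -S $cmdline
flag_value -SDio $cmdline
flag_value -sname $cmdline

# Inside the container, the same helper can read the live process line:
# flag_value -S $(ps aux | grep '[b]eam')
```

The helper returns a non-zero status when the flag is absent, which makes it easy to spot a node that is running with default scheduler settings.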

Quick Health Check

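Paste this block for a comprehensive snapshot. It requires :recon, and the oban_* entries assume the application uses Ecto and Oban; replace MyApp.Repo with your application's repo module: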

elixir
alias MyApp.Repo
import Ecto.Query

health = %{
  timestamp: DateTime.utc_now(),
  memory: :erlang.memory() |> Enum.map(fn {k, v} -> {k, Float.round(v / 1_048_576, 2)} end) |> Enum.into(%{}),
  process_count: length(Process.list()),
  process_limit: :erlang.system_info(:process_limit),
  port_count: length(Port.list()),
  port_limit: :erlang.system_info(:port_limit),
  ets_table_count: length(:ets.all()),
  top_memory_pids: :recon.proc_count(:memory, 5) |> Enum.map(fn {pid, mem, info} -> %{pid: inspect(pid), memory_mb: Float.round(mem / 1_048_576, 2), info: info} end),
  top_mailbox_pids: :recon.proc_count(:message_queue_len, 5) |> Enum.map(fn {pid, len, info} -> %{pid: inspect(pid), queue_len: len, info: info} end),
  oban_available: Repo.one(from j in Oban.Job, where: j.state == "available", select: count(j.id)),
  oban_executing: Repo.one(from j in Oban.Job, where: j.state == "executing", select: count(j.id)),
  oban_discarded: Repo.one(from j in Oban.Job, where: j.state == "discarded", select: count(j.id))
}

IO.inspect(health, pretty: true, limit: :infinity)

Investigation Routing

code
What are you investigating?
├── Memory issues → :recon.proc_count(:memory, N), :recon.bin_leak(N)
├── Process issues → :recon.proc_count(:message_queue_len, N), Process.info/2
├── Scheduler saturation → :recon.scheduler_usage(5000)
├── ETS growth → ETS table analysis snippet
├── GenServer state → :sys.get_state/1
└── Full health check → Quick health check block above

For detailed investigation playbooks, see references/investigation-playbooks.md.

Common One-Liners

elixir
# Memory overview (MB)
:erlang.memory() |> Enum.map(fn {k, v} -> {k, Float.round(v / 1_048_576, 2)} end)

# Top 10 processes by memory
:recon.proc_count(:memory, 10)

# Top 10 processes by mailbox length
:recon.proc_count(:message_queue_len, 10)

# Binary memory leak detection
:recon.bin_leak(10)

# Scheduler utilization (5-second sample)
:recon.scheduler_usage(5000)

# Is the system overloaded? (example thresholds: 100k processes or ~14 GB total memory; tune to your task size)
:erlang.system_info(:process_count) > 100_000 or :erlang.memory(:total) > 14_000_000_000

# Process count by initial call (find spawners)
Process.list()
|> Enum.map(&Process.info(&1, :initial_call))
|> Enum.frequencies()
|> Enum.sort_by(&elem(&1, 1), :desc)
|> Enum.take(10)

Troubleshooting Connection

| Error | Cause | Fix |
| --- | --- | --- |
| AccessDeniedException | IAM role lacks `ecs:ExecuteCommand` | Use profile with write/exec access |
| Session Manager plugin not found | Missing AWS plugin | Install from AWS docs |
| Container not responding | Container OOM or unhealthy | Try other task, check `aws ecs describe-tasks` |
| Remote console hangs | Node overwhelmed | Try another task in the HA pair |
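
Some of the failures in the table can be caught before connecting. The sketch below checks that the local tools are on PATH; `require_cmd` is a hypothetical helper name, and `session-manager-plugin` is assumed to be the binary name the Session Manager plugin installs on your system.

```shell
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check the AWS CLI and the Session Manager plugin without aborting on the first miss.
for cmd in aws session-manager-plugin; do
  require_cmd "$cmd" || true
done

# To see whether ECS Exec is enabled on a task (placeholders as in the workflow above):
# aws ecs describe-tasks --cluster <CLUSTER_NAME> --tasks <TASK_ID> --profile <env> \
#   --query 'tasks[0].enableExecuteCommand' --output text
```

A missing tool points at the "Session Manager plugin not found" row, while a task with ECS Exec disabled cannot be reached with `execute-command` regardless of IAM permissions.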