Analysis Workflow

From the Grafana Slack alert, extract:

Use list_log_streams.py to find the log stream that ended just before the restart (OOM).

bash

python scripts/list_log_streams.py \
  --pod-name <full-pod-name> \
  --region ap-northeast-2

Selection Criteria:

•Look for the stream whose Last time is closest to the OOM/restart time.
•If streams are split, the one ending before the restart contains the relevant logs.

Use find_incomplete_requests.py to identify requests that started but never completed in the selected stream.

bash

python scripts/find_incomplete_requests.py \
  --log-stream '<full-stream-name-from-step-2>' \
  --minutes-before 5

This script fetches logs from CloudWatch and shows which API endpoints had incomplete requests.

Use query_alb_access_logs.py to check request and response sizes via Athena for suspicious paths found in Step 3.

bash

python scripts/query_alb_access_logs.py \
  --path '<suspicious-path>' \
  --oom-time 'YYYY-MM-DDTHH:MM:SS' \
  --minutes-before 10

Summarize the abnormal characteristics (Timing, Payloads, Duration, Concurrency) to assist the developer.

Output Format