GitHub CI Fix Analyzer
Diagnose failing GitHub PR checks by inspecting Actions runs, extracting log snippets, and recommending targeted fixes.
Overview
This skill analyzes failing CI checks on GitHub pull requests to identify root causes and guide fixes. It uses the gh CLI to inspect check statuses, fetch workflow run logs, and extract actionable failure snippets. The pipeline handles gh field drift across CLI versions, falls back to job-level log fetching when run logs are unavailable, and scopes external CI providers out of the analysis.
What it analyzes:
- •PR check statuses (pass, fail, pending, cancelled, timed out)
- •GitHub Actions workflow run logs and job-level logs
- •Failure markers (error, traceback, assertion, panic, timeout, segfault)
- •Run metadata (workflow name, conclusion, branch, SHA, event trigger)
What it reports:
- •Failing check names with run URLs and log snippets
- •Root cause classification (build, test, lint, timeout, infrastructure, permission)
- •Fix recommendations prioritized by severity
- •External check URLs (Buildkite, CircleCI) without deep inspection
Analysis Checklist
Prerequisites
- •
ghCLI is installed and on PATH - •
gh auth statussucceeds withrepoandworkflowscopes - • Working directory is inside a git repository
- • PR exists (explicit number/URL or current branch has an open PR)
Check Inspection
- • Fetch all PR checks via
gh pr checks <pr> --json - • Handle field drift: retry with available fields if primary set is rejected
- • Filter to failure/error/cancelled/timed_out conclusions
- • Classify each check as GitHub Actions or external provider
- • For external checks, record details URL only
Log Retrieval
- • Fetch run metadata via
gh run view <run_id> --json - • Fetch run logs via
gh run view <run_id> --log - • Fall back to job log via
gh api /repos/{owner}/{repo}/actions/jobs/{job_id}/logs - • Handle zip payloads from job log API (report as unavailable)
- • Extract failure snippet: scan for last failure marker, take context window
Diagnosis
- • Summarize each failing check with name, URL, and log snippet
- • Classify failure type (build, test, lint, timeout, infrastructure, permission)
- • Identify the specific file and line when available
- • Cross-reference failure with recent PR diff
- • Present fix plan and wait for explicit user approval
Metrics
Failure Classification
| Category | Log Markers | Common Causes |
|---|---|---|
| Build error | error, fatal, cannot find module | Missing dependency, syntax error, type error |
| Test failure | FAIL, AssertionError, expected ... got | Regression, flaky test, missing fixture |
| Lint violation | warning, rule codes (E501, W503) | Style drift, new linter rule |
| Timeout | timeout, timed_out, cancelled | Slow test, infinite loop, resource starvation |
| Infrastructure | rate limit, connection refused, 503 | Runner capacity, network issue, outage |
| Permission | 403, permission denied, EACCES | Missing secret, token scope, file perms |
Severity Levels
| Level | Criteria | Action |
|---|---|---|
| Critical | All checks failing, merge blocked | Fix immediately |
| High | Required check failing | Fix before merge |
| Medium | Optional check failing | Fix if straightforward |
| Low | Warning-only, non-blocking | Address in follow-up |
Workflow
Step 1: Verify gh Authentication
gh auth status # If unauthenticated: gh auth login (ensure repo + workflow scopes)
Step 2: Resolve the Pull Request
# Current branch PR gh pr view --json number,url # Explicit PR gh pr view 123 --json number,url
Step 3: Run the Inspection Script
python3 "<skill-directory>/scripts/inspect_pr_checks.py" \ --repo "." --pr "<number-or-url>"
Flags: --json (machine output), --max-lines 200 (snippet window, default 160), --context 40 (context lines, default 30).
Manual fallback if the script is unavailable:
gh pr checks <pr> --json name,state,conclusion,detailsUrl,startedAt,completedAt gh run view <run_id> --json conclusion,status,workflowName,name,event,headBranch,headSha,url gh run view <run_id> --log
Step 4: Scope External Checks
If detailsUrl lacks /actions/runs/, classify the check as external. Report name and URL only. Do not fetch logs from Buildkite, CircleCI, or other providers.
Step 5: Summarize Failures
For each failing check report: check name, workflow, run URL, failure category, log snippet, and missing-log note if applicable.
Step 6: Draft Fix Plan
- •Identify root cause file and line from log
- •Cross-reference with PR diff to confirm if newly introduced
- •Draft concrete fix steps per failing check
- •Wait for explicit user approval before implementing
Step 7: Implement After Approval
- •Apply the fix (edit files, update config, add dependencies)
- •Summarize changes made
- •Suggest running relevant local tests before pushing
- •Offer to open or update the PR
Step 8: Recheck Status
python3 "<skill-directory>/scripts/inspect_pr_checks.py" --repo "." --pr "<number>" # Or: gh pr checks <pr>
Report Format
GitHub CI Analysis Report ========================= PR: #123 (feature/add-auth) Repository: org/repo Checks: 8 total, 2 failing, 6 passing Failing: 1. [CRITICAL] test-suite (failure) Workflow: CI / test-suite Run: https://github.com/org/repo/actions/runs/12345 Category: Test failure Snippet: AssertionError: expected 200 got 401 at test_auth.py:45 2. [MEDIUM] lint (failure) Workflow: CI / lint Run: https://github.com/org/repo/actions/runs/12346 Category: Lint violation Snippet: E501 line too long (120 > 88) at src/auth/handler.py:23 External (not analyzed): - Buildkite / deploy-preview: https://buildkite.com/org/pipeline/build/456
Examples
Example 1: Failing test suite
User: CI is red on my PR, can you investigate?
Assistant: I will check your PR checks now.
Assistant: I will check your PR checks now.
1. Running gh auth status... authenticated
2. Running inspect_pr_checks.py --repo . --pr 789
3. Found 1 failing check:
[HIGH] test-suite (failure)
Workflow: CI / test-suite
Run: https://github.com/org/repo/actions/runs/99001
Category: Test failure
Snippet:
tests/test_login.py:32
AssertionError: expected status 200 but got 403
Fixture missing read:user OAuth scope
Root cause: The PR adds a new OAuth scope requirement but the test
fixture in tests/conftest.py still uses the old scope set.
Fix plan:
1. Update tests/conftest.py mock_oauth_scopes to include read:user
2. Re-run pytest tests/test_login.py locally to confirm
Shall I apply this fix?
Example 2: External CI provider
User: Why is the Buildkite check failing on PR #456? Assistant: Checking PR #456 checks... The Buildkite check is an external provider (not GitHub Actions). I can only report its details URL: - deploy-preview: https://buildkite.com/org/pipeline/build/789 Open the Buildkite link directly to inspect the build logs. GitHub Actions checks on this PR are all passing.
Interpretation Guide
gh CLI Field Drift
The gh CLI may change available --json fields between versions. The bundled script handles this automatically by parsing the "Available fields:" error message and retrying with a compatible field set. The fallback fields are: name, state, bucket, link, startedAt, completedAt, workflow.
Log Retrieval Hierarchy
- •
gh run view <id> --log-- preferred, returns the full run log - •
gh api /repos/{owner}/{repo}/actions/jobs/{id}/logs-- fallback for in-progress or partial runs - •Run metadata only -- when both log sources fail, report metadata without snippet
Failure Marker Scanning
The script scans log lines in reverse order for these markers: error, fail, failed, traceback, exception, assert, panic, fatal, timeout, segmentation fault. It returns a context window (default 30 lines) around the last match, capped at --max-lines (default 160).
Notes
- •Always wait for user approval before implementing fixes.
- •Do not attempt to parse logs from external CI providers.
- •If
gh authfails, guide the user throughgh auth loginwith correct scopes. - •The bundled script exits non-zero when failures remain (usable in automation).
- •Merge commits and bot-triggered checks may produce different failure patterns.
- •For very large logs, increase
--max-linesand--contextto capture more context.