Buildkite Playwright failure extraction
Use this skill when you need a compact, human-readable overview of failing Playwright tests from Buildkite job logs. This workflow uses the Buildkite MCP server via mcporter and mcp-remote. Run /skill:buildkite-mcp first for setup.
Quick start (script)
bash
./scripts/extract-playwright-failures.cjs --org ORG --pipeline PIPELINE --build BUILD_NUMBER
Options:
- •
--server NAME(default:buildkite-remote) - •
--pattern REGEX(default:\b\d+ failed\b) - •
--search-limit N(default: 80) - •
--context N(default: 3) - •
--read-limit N(default: 120) - •
--error-for TEST_NAME(fetch error details for a single test) - •
--job-id ID(required with--error-for) - •
--job-url URL(optional with--error-for) - •
--failed-line-row N(optional with--error-for) - •
--error-search-limit N(default: 20) - •
--error-context N(default: 2) - •
--error-read-limit N(default: 200)
Output
The script always emits JSON and caches summary results under your temp dir ($TMPDIR/pi-buildkite-playwright-failures). It refreshes the cache if any Playwright job state changes.
Summary output includes:
- •
summary - •
jobs[]entries with:- •
jobName,jobId,jobUrl,label,state - •
failedLine,failedLineRow - •
tests(string list) - •
testEntries(objects withname+row) - •
error
- •
Error lookup output (--error-for) includes:
- •
error.testName,error.jobId,error.jobUrl - •
error.matchRow,error.startRow,error.endRow - •
error.lines(array of log lines) - •
meta.checkedAt
When presenting results to a human, format the summary with:
- •failed job count
- •unique failing tests count
- •tests failing in all environments (label as
all) - •tests failing in multiple environments
- •tests failing in only one environment
What the script does
- •Fetches the build with
detail_level: "full". - •Filters for failed Playwright jobs.
- •Searches logs backwards for the summary line (
X failed). - •Reads a small log slice around the summary line.
- •Captures only the failed test lines, stopping before flaky/skipped/passed sections.
Manual steps (if the script fails)
- •Fetch the build and collect job IDs
bash
mcporter call 'buildkite-remote.get_build(org_slug: "ORG", pipeline_slug: "PIPELINE", build_number: "BUILD_NUMBER", detail_level: "full")' --output json
- •Locate the failure summary block
bash
mcporter call 'buildkite-remote.search_logs(org_slug: "ORG", pipeline_slug: "PIPELINE", build_number: "BUILD_NUMBER", job_id: "JOB_ID", pattern: "\\b\\d+ failed\\b", reverse: true, limit: 80, context: 3)' --output json
- •Read only the failure list
bash
mcporter call 'buildkite-remote.read_logs(org_slug: "ORG", pipeline_slug: "PIPELINE", build_number: "BUILD_NUMBER", job_id: "JOB_ID", seek: SEEK, limit: 120)' --output json
- •Ignore flaky sections
- •Copy only the lines under the “X failed” section.
- •Skip any “flaky” or “skipped” sections that appear later.
Notes
- •Use
--output jsonso you can script the results. - •The summary block is usually near the end of the log.
- •If it isn’t found in the last ~1,000 lines, increase the seek/limit range and retry.