
check-upstream-flake

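Checks whether a failing test is a known upstream flake in Chromium's LUCI Analysis database. Provides flakiness statistics, a verdict, and a recommendation for filter file decisions. Triggered by phrases such as "check upstream flake", "is this test flaky upstream", "check luci flakiness", and "upstream flake check".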

SKILL.md
---
name: check-upstream-flake
description: "Check if a failing test is a known upstream flake in Chromium's LUCI Analysis database. Provides flakiness statistics, verdict, and recommendation for filter file decisions. Triggers on: check upstream flake, is this test flaky upstream, check luci flakiness, upstream flake check."
argument-hint: <TestSuite.TestMethod>
---

Check Upstream Flake

Check if a failing test is a known upstream flake in the Chromium LUCI Analysis database. The skill queries the REST API at analysis.api.luci.app to retrieve historical pass/fail/flake data for the test from the Chromium CI infrastructure.


When to Use

  • Investigating intermittent test failures before deciding on a fix approach
  • Evaluating test disable PRs to verify upstream flakiness claims
  • During PR review (via the review skill) when assessing test filter changes
  • Working on "pending" stories that involve Chromium test failures

The Job

When invoked with a test name:

  1. Search for matching test IDs in the Chromium LUCI Analysis database
  2. Retrieve flakiness statistics for each match over the lookback period
  3. Analyze pass/fail/flake rates
  4. Report a verdict and recommendation

Usage

```bash
# Basic: check a specific test (default 30-day lookback)
python3 ./scripts/check-upstream-flake.py "WebUIURLLoaderFactoryTest.RangeRequest"

# Longer lookback window
python3 ./scripts/check-upstream-flake.py "WebUIURLLoaderFactoryTest.RangeRequest" --days 60

# JSON output (for programmatic use)
python3 ./scripts/check-upstream-flake.py "WebUIURLLoaderFactoryTest.RangeRequest" --json

# Search by test class name (finds all methods)
python3 ./scripts/check-upstream-flake.py "WebUIURLLoaderFactoryTest"
```

Arguments:

  • test_name (required): Test name or substring to search for
  • --days N: Lookback window in days (default: 30, max: 90)
  • --json: Output JSON instead of markdown

Exit codes:

  • 0: Success (results found and reported)
  • 1: Error (network, API, etc.)
  • 2: No matching test IDs found
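
When calling the script from another tool, the exit codes above can be mapped to a status before any output is parsed. The following is a minimal sketch, not part of the skill itself: the helper names `interpret_exit_code` and `run_check` are hypothetical, and the structure of the `--json` output is not documented here, so it is returned unparsed beyond `json.loads`.

```python
import json
import subprocess
import sys
from typing import Any, Optional, Tuple

# Exit codes as documented in the list above.
EXIT_MEANINGS = {
    0: "success",    # results found and reported
    1: "error",      # network, API, etc.
    2: "not_found",  # no matching test IDs
}


def interpret_exit_code(code: int) -> str:
    """Map the script's exit code to a short status label."""
    return EXIT_MEANINGS.get(code, "unknown")


def run_check(test_name: str, days: int = 30) -> Tuple[str, Optional[Any]]:
    """Run the script with --json and return (status, parsed output or None)."""
    proc = subprocess.run(
        [sys.executable, "./scripts/check-upstream-flake.py",
         test_name, "--days", str(days), "--json"],
        capture_output=True,
        text=True,
    )
    status = interpret_exit_code(proc.returncode)
    # Only parse stdout on success; the JSON schema is an unknown here.
    data = json.loads(proc.stdout) if status == "success" else None
    return status, data
```

A caller would typically treat `not_found` differently from `error`: the former says something about the test, the latter only about the request.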

Interpreting Results

The script produces one of five verdicts:

| Verdict | Flake Rate | Action |
|---|---|---|
| Known upstream flake | >= 5% | Safe to add to filter file. Document upstream flakiness in the filter comment. |
| Occasional upstream failures | 1-5% | Consider filtering. Document findings. May still warrant investigation. |
| Stable upstream | < 1% | Investigate Brave-specific causes. The test is stable in Chromium, so Brave code changes are likely causing the failure. |
| Insufficient data | N/A (<10 verdicts) | Cannot determine from upstream data. Manual investigation needed. |
| Not found | N/A | Test not in Chromium database. May be Brave-specific or use a different ID format. |

Flake rate is calculated as (failed + flaky) / (passed + failed + flaky). Skipped and precluded verdicts are excluded from the rate.
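
The flake rate formula and the thresholds in the table can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: the function names are hypothetical, the 10-verdict minimum is assumed to count only passed, failed, and flaky verdicts, and a rate of exactly 5% is assigned to "Known upstream flake" because the table lists that band as ">= 5%". The "Not found" verdict is not covered, since it arises when no test ID matches at all.

```python
from typing import Optional

MIN_VERDICTS = 10  # below this, the table reports "Insufficient data"


def flake_rate(passed: int, failed: int, flaky: int) -> Optional[float]:
    """Return (failed + flaky) / (passed + failed + flaky), or None if empty.

    Skipped and precluded verdicts are simply never passed in.
    """
    total = passed + failed + flaky
    if total == 0:
        return None
    return (failed + flaky) / total


def verdict(passed: int, failed: int, flaky: int) -> str:
    """Classify a test using the thresholds from the verdict table."""
    total = passed + failed + flaky
    # Assumption: the minimum applies to the same verdicts used in the rate.
    if total < MIN_VERDICTS:
        return "Insufficient data"
    rate = flake_rate(passed, failed, flaky)
    if rate >= 0.05:
        return "Known upstream flake"
    if rate >= 0.01:
        return "Occasional upstream failures"
    return "Stable upstream"
```

For example, 90 passes, 5 failures, and 5 flaky verdicts give a rate of 10 / 100 = 10%, which falls in the "Known upstream flake" band.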


How Results Inform Decisions

Known upstream flake or occasional failures

  • Disabling via filter file is appropriate
  • Use the most specific filter file possible (platform/sanitizer-specific)
  • Include in the filter comment: "Known upstream flake (X% flake rate over N days per LUCI Analysis)"
  • Reference this in commit message and PR body
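
The filter comment wording above can be generated from the reported numbers so that it stays consistent across filter files, commit messages, and PR bodies. A small sketch, assuming a hypothetical helper name and one decimal place for the rate (the original does not specify precision):

```python
def filter_comment(flake_rate_percent: float, days: int) -> str:
    """Build the filter comment text from a flake rate (in percent) and lookback."""
    return (
        f"Known upstream flake ({flake_rate_percent:.1f}% flake rate "
        f"over {days} days per LUCI Analysis)"
    )


print(filter_comment(12.5, 30))
```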

Stable upstream

  • The test passes reliably in Chromium CI
  • Focus investigation on Brave-specific factors:
    • Check brave/chromium_src/ overrides in related directories
    • Look for Brave features that change timing or behavior
    • Check if Brave adds UI elements that affect the test
  • A filter disable should be a last resort and needs strong justification

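Not found or insufficient data

  • Upstream data can neither confirm nor rule out flakiness, so manual investigation is needed
  • If the test was not found, it may be Brave-specific or use a different test ID format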


API Details

The script uses the LUCI Analysis REST API (pRPC protocol):

  • QueryTests: POST https://analysis.api.luci.app/prpc/luci.analysis.v1.TestHistory/QueryTests
  • QueryStats: POST https://analysis.api.luci.app/prpc/luci.analysis.v1.TestHistory/QueryStats
  • Query (fallback): POST https://analysis.api.luci.app/prpc/luci.analysis.v1.TestHistory/Query

No authentication is required for public Chromium data.
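
A call to the QueryTests endpoint might look like the sketch below. It uses only the Python standard library and is not the script's actual implementation. The request field names (`project`, `testIdSubstring`) and the `chromium` project value are assumptions that should be checked against the LUCI Analysis proto definitions. pRPC JSON responses normally begin with the anti-XSSI prefix `)]}'`, which has to be stripped before parsing.

```python
import json
import urllib.request

BASE = "https://analysis.api.luci.app/prpc/luci.analysis.v1.TestHistory"
XSSI_PREFIX = ")]}'"  # pRPC puts this before JSON response bodies


def parse_prpc_json(body: str) -> dict:
    """Strip the anti-XSSI prefix if present, then parse the JSON body."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def query_tests(substring: str, project: str = "chromium") -> dict:
    """POST to QueryTests and return the decoded response.

    Assumption: the request accepts `project` and `testIdSubstring`.
    """
    payload = json.dumps(
        {"project": project, "testIdSubstring": substring}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE}/QueryTests",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return parse_prpc_json(resp.read().decode("utf-8"))
```

Keeping the prefix handling in its own function makes it reusable for the QueryStats and Query endpoints, which share the same transport.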

Test IDs in LUCI follow the format: ninja://{gn_path}:{target}/{TestSuite}.{TestMethod}
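
To make the format concrete, the sketch below splits a test ID into its four parts with a regular expression. The function name is hypothetical and the example ID is illustrative only; the pattern covers just the format stated above, so IDs in any other scheme return `None`.

```python
import re
from typing import Dict, Optional

# ninja://{gn_path}:{target}/{TestSuite}.{TestMethod}
TEST_ID_RE = re.compile(
    r"^ninja://(?P<gn_path>[^:]+):(?P<target>[^/]+)/"
    r"(?P<suite>[^.]+)\.(?P<method>.+)$"
)


def parse_test_id(test_id: str) -> Optional[Dict[str, str]]:
    """Return the components of a ninja:// test ID, or None if it does not match."""
    match = TEST_ID_RE.match(test_id)
    return match.groupdict() if match else None


# Illustrative ID; the gn path and target here are assumptions.
example = "ninja://content/test:content_unittests/WebUIURLLoaderFactoryTest.RangeRequest"
print(parse_test_id(example))
```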


Limitations

  • Only covers Chromium upstream data (not Brave CI)
  • The test ID format may not match for every test, so a search can miss tests that exist upstream
  • Historical data limited to ~90 days
  • Does not compare failure logs/output (only counts pass/fail/flake)
  • Cannot distinguish between different failure modes for the same test