qa-verification

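A QA verification skill that works like a human QA: it browses the app page by page in a browser, looks closely at what is on screen, and flags anything that looks wrong. Use it when you need to verify that implemented features match task specifications, test user flows, or validate against the acceptance criteria in docs/TASKS.md. Triggers include "verify this PR", "QA this feature", "visual test", "check the UI", "screenshot test", and "verify task X". It works against local development environments, preview/staging environments, and production. Google accounts are selected automatically; the skill pauses for user assistance only on credential-based login flows.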

SKILL.md
--- frontmatter
name: qa-verification
description: QA verification skill that tests like a human QA would — navigating the app in a browser, looking at what's on screen, and flagging anything that looks wrong. Use when verifying implemented features match task specifications, testing user flows, or validating acceptance criteria from docs/TASKS.md. Triggers on "verify this PR", "QA this feature", "visual test", "check the UI", "screenshot test", "verify task X". Works against localhost, preview/staging deployments, and production URLs. Auto-selects Google accounts; pauses for user assistance only on credential-based login flows.

QA Verification

Test like a human QA — navigate the app, look at the screen, and flag what looks wrong.

Core Rules

  1. Screenshots are your eyes. Every verification judgment must come from looking at a screenshot. If you can't see it in the screenshot, you can't claim it passes or fails.

  2. browser_snapshot is for interaction only. You need snapshot refs to click buttons and fill forms, and that's fine. But NEVER use snapshot/DOM data to verify acceptance criteria. A human QA doesn't open DevTools to check whether an element exists; they look at the screen.

  3. Flag visual problems even if they aren't in the acceptance criteria. A human QA notices when a modal is cut off, text overflows its container, buttons overlap, or layout looks broken. You should too. Report these as additional findings separate from acceptance criteria results.

  4. Navigate like a user. Click links, fill forms, wait for pages to load. Don't skip steps or assume state.

What to Look For

Beyond acceptance criteria, flag anything a human would notice:

  • Layout issues — elements overlapping, content cut off at viewport edges, modals extending off-screen, unexpected scrollbars
  • Text problems — truncated labels, text overflowing containers, unreadable contrast, placeholder text still showing
  • Broken interactions — buttons that don't appear clickable, missing hover/focus states, forms that don't respond
  • Loading/state issues — spinners that never resolve, flash of unstyled content, empty states where data should be
  • Visual inconsistencies — misaligned elements, inconsistent spacing, elements that look out of place

Workflow

1. Parse Task Specification

Read the task file from docs/tasks/task-<id>.md or docs/TASKS.md. Extract:

  • Acceptance criteria: Specific conditions to verify
  • User flows: Step-by-step interactions to walk through

If the user specifies a PR number, use gh pr view <number> (for example with --json files to list the changed files) to identify the affected files and infer the relevant task.
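
The extraction step could be sketched as follows. The source does not define the layout of the task file, so this assumes a hypothetical one: a heading named "Acceptance Criteria" or "User Flows" followed by bullet or checkbox items. The function name and the sample task text are illustrative only.

```python
import re


def extract_section_items(markdown: str, heading: str) -> list[str]:
    """Return the bullet items listed under `heading` (assumed layout)."""
    items: list[str] = []
    in_section = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens the section we want or closes it.
            in_section = stripped.lstrip("#").strip().lower() == heading.lower()
            continue
        if in_section:
            # Accept "- item", "* item", and checkbox items like "- [ ] item".
            match = re.match(r"^[-*]\s+(?:\[[ xX]\]\s+)?(.+)$", stripped)
            if match:
                items.append(match.group(1).strip())
    return items


# Hypothetical task file content, for illustration only.
SAMPLE_TASK = """\
# Task 3.1: Login page

## Acceptance Criteria
- [ ] Login form shows email and password fields
- [ ] Invalid credentials show an error banner

## User Flows
- Open /login, submit an empty form, see validation errors
"""

criteria = extract_section_items(SAMPLE_TASK, "Acceptance Criteria")
flows = extract_section_items(SAMPLE_TASK, "User Flows")
```

If a real task file uses a different structure, only the heading names and the bullet pattern would need to change.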

2. Gather Test Context

Before starting, collect:

| Item | Source | Default |
|---|---|---|
| Target URL | User-provided | http://localhost:3000 (also works with preview/staging/production URLs) |
| Task/PR | User-specified | Ask user |
| Auth required? | Task spec or inference | Assume no |
| Screenshot dir | User preference | ./qa-screenshots/ |
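
One way to keep these defaults in one place is a small context object. This is a minimal sketch: the names QAContext and build_context are hypothetical, and only the default values come from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QAContext:
    task_or_pr: str                              # no default: ask the user
    target_url: str = "http://localhost:3000"    # default from the table
    auth_required: bool = False                  # "Assume no"
    screenshot_dir: str = "./qa-screenshots/"


def build_context(task_or_pr: Optional[str], **overrides) -> QAContext:
    """Apply user-provided overrides on top of the documented defaults."""
    if not task_or_pr:
        # The task or PR has no default, so the user has to supply it.
        raise ValueError("Task or PR is required; ask the user which one to verify.")
    return QAContext(task_or_pr=task_or_pr, **overrides)


# Example: the user names the task and a non-default dev server URL.
ctx = build_context("task-3.1", target_url="http://localhost:5173")
```

Anything the user does not mention falls back to the defaults, which mirrors how the table is meant to be read.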

3. Verification Loop

For each acceptance criterion:

code
1. SCREENSHOT the current state
2. LOOK at the screenshot — does everything look right?
3. ACT — click, type, navigate (use browser_snapshot only to get element refs)
4. WAIT for the page to settle
5. SCREENSHOT the result
6. LOOK at the screenshot — did the expected thing happen? Does anything look off?
7. RECORD the result with screenshot references

At every screenshot, ask yourself: "If I were a human looking at this screen, would anything catch my eye as wrong?" Flag it even if it's unrelated to the current criterion.
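
The ordering of the seven steps can be illustrated with a small sketch. The browser tools are invoked by the agent rather than from Python, so the Browser interface here is a hypothetical wrapper whose method names only mirror the tool names, and the look callback stands in for the visual judgment made on each screenshot.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol


class Browser(Protocol):
    """Hypothetical wrapper; not a real Python API for the browser tools."""

    def take_screenshot(self, filename: str) -> str: ...
    def wait_for(self, seconds: float) -> None: ...
    def click(self, ref: str) -> None: ...


@dataclass
class StepRecord:
    criterion: str
    status: str                                   # "PASS" or "FAIL"
    screenshots: List[str] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)


def verify_criterion(
    browser: Browser,
    criterion: str,
    act: Callable[[Browser], None],
    look: Callable[[str], List[str]],
) -> StepRecord:
    """Run one pass of the loop; `look` returns the problems seen in a screenshot."""
    before = browser.take_screenshot(f"{criterion}-before.png")  # 1. SCREENSHOT
    findings = list(look(before))                                # 2. LOOK
    act(browser)                                                 # 3. ACT
    browser.wait_for(2)                                          # 4. WAIT
    after = browser.take_screenshot(f"{criterion}-after.png")    # 5. SCREENSHOT
    problems = look(after)                                       # 6. LOOK
    findings.extend(problems)
    status = "FAIL" if problems else "PASS"
    return StepRecord(criterion, status, [before, after], findings)  # 7. RECORD


class FakeBrowser:
    """In-memory stand-in so the sketch runs without a real browser."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def take_screenshot(self, filename: str) -> str:
        self.calls.append(f"screenshot:{filename}")
        return filename

    def wait_for(self, seconds: float) -> None:
        self.calls.append(f"wait:{seconds}")

    def click(self, ref: str) -> None:
        self.calls.append(f"click:{ref}")


browser = FakeBrowser()
record = verify_criterion(
    browser,
    "ac1",
    act=lambda b: b.click("submit-button"),
    look=lambda screenshot: [],  # pretend nothing looked wrong
)
```

The point of the sketch is the ordering: every action is bracketed by a screenshot, and the status is derived only from what was seen in the screenshot taken after the action.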

4. Authentication Handling

Google Account Selection — handle automatically:

If the page shows a Google account picker ("Choose an account", list of emails), click the first account and continue. Do NOT ask the user for help.

code
1. browser_take_screenshot to see the auth screen
2. If it looks like a Google account picker, browser_snapshot to get the ref
3. browser_click the first account
4. browser_wait_for(time=2) for redirect
5. browser_take_screenshot to confirm auth completed
6. Continue verification

Credential-based Login — defer to user:

If the page has a username/password form, pause and ask:

markdown
## Auth Required

I've encountered a login screen at [URL].

**Screenshot:** auth-required.png

**Options:**
1. **Manual login**: I'll wait while you log in via the browser, then continue
2. **Skip auth flows**: Mark auth-required tests as SKIPPED
3. **Provide credentials**: Share test credentials to proceed

5. Screenshot Strategy

| When | Filename Pattern | Why |
|---|---|---|
| Before any action | {criterion}-before-{action}.png | Baseline |
| After action completes | {criterion}-after-{action}.png | Verify result |
| Something looks wrong | {criterion}-FAIL-{desc}.png | Evidence |
| Visual issue unrelated to criteria | visual-issue-{desc}.png | Additional finding |

  • Use fullPage: true for layout verification
  • Use element screenshots for component-level detail
  • Save screenshots in PNG format
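
The filename patterns above could be generated by a small helper like the one below. This is a sketch under assumptions: the function names and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) are illustrative and not part of the skill.

```python
import re
from typing import Optional


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def screenshot_name(kind: str, detail: str, criterion: Optional[str] = None) -> str:
    """Build a filename for one of the four documented situations.

    kind is "before", "after", "fail", or "visual"; detail is the action
    (before/after) or a short description (fail/visual).
    """
    if kind == "visual":
        # Issues unrelated to any criterion get their own prefix.
        return f"visual-issue-{slugify(detail)}.png"
    markers = {"before": "before", "after": "after", "fail": "FAIL"}
    if kind not in markers:
        raise ValueError(f"unknown screenshot kind: {kind!r}")
    if not criterion:
        raise ValueError("criterion is required for before/after/fail screenshots")
    return f"{slugify(criterion)}-{markers[kind]}-{slugify(detail)}.png"


before = screenshot_name("before", "Click Submit", criterion="AC1 login form")
failure = screenshot_name("fail", "Modal cut off", criterion="AC2")
visual = screenshot_name("visual", "Footer overlaps content")
```

Keeping FAIL in upper case makes failing evidence easy to spot when scanning the screenshot directory.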

6. Generate Report

markdown
# QA Report

**Task:** [task ID or PR number]
**URL:** [target URL]
**Date:** [timestamp]
**Screenshots:** [directory path]

## Summary
- Passed: X
- Failed: Y
- Skipped: Z
- Visual Issues: N (not in acceptance criteria but worth noting)

## Acceptance Criteria Results

### 1. [Criterion]
**Status:** PASS / FAIL / SKIP
**Steps:**
1. [What you did + screenshot ref]
2. [What you saw + screenshot ref]
**Notes:** [What you verified visually]

## Additional Visual Issues

### [Issue description]
**Screenshot:** [ref]
**Severity:** Minor / Major
**Details:** [What looks wrong and where]
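
If results are collected as structured data during the session, the template above could be filled in mechanically. The sketch below assumes hypothetical CriterionResult and VisualIssue records; only the section names and field labels come from the template, and the sample values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CriterionResult:
    title: str
    status: str            # "PASS", "FAIL", or "SKIP"
    steps: List[str]
    notes: str = ""


@dataclass
class VisualIssue:
    description: str
    screenshot: str
    severity: str          # "Minor" or "Major"
    details: str


def render_report(task: str, url: str, date: str, screenshot_dir: str,
                  results: List[CriterionResult], issues: List[VisualIssue]) -> str:
    """Render the report template as a Markdown string."""
    def count(status: str) -> int:
        return sum(1 for r in results if r.status == status)

    lines = [
        "# QA Report", "",
        f"**Task:** {task}", f"**URL:** {url}",
        f"**Date:** {date}", f"**Screenshots:** {screenshot_dir}", "",
        "## Summary",
        f"- Passed: {count('PASS')}",
        f"- Failed: {count('FAIL')}",
        f"- Skipped: {count('SKIP')}",
        f"- Visual Issues: {len(issues)} (not in acceptance criteria but worth noting)",
        "", "## Acceptance Criteria Results",
    ]
    for number, result in enumerate(results, start=1):
        lines += ["", f"### {number}. {result.title}",
                  f"**Status:** {result.status}", "**Steps:**"]
        lines += [f"{i}. {step}" for i, step in enumerate(result.steps, start=1)]
        lines.append(f"**Notes:** {result.notes}")
    lines += ["", "## Additional Visual Issues"]
    for issue in issues:
        lines += ["", f"### {issue.description}",
                  f"**Screenshot:** {issue.screenshot}",
                  f"**Severity:** {issue.severity}",
                  f"**Details:** {issue.details}"]
    return "\n".join(lines) + "\n"


# Illustrative data only; not results from a real session.
report = render_report(
    task="task-3.1", url="http://localhost:5173",
    date="2024-01-01T00:00:00Z", screenshot_dir="./qa-screenshots/",
    results=[
        CriterionResult("Login form is visible", "PASS",
                        ["Opened /login (ac1-before-open.png)"], "Form rendered"),
        CriterionResult("Error banner on bad login", "FAIL",
                        ["Submitted bad password (ac2-FAIL-no-banner.png)"]),
    ],
    issues=[VisualIssue("Footer overlaps content", "visual-issue-footer.png",
                        "Minor", "Footer covers the last form field")],
)
```

Deriving the summary counts from the recorded statuses keeps the totals consistent with the per-criterion sections.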

Playwright Tools

For interaction (getting refs, clicking, typing):

  • browser_snapshot — get element refs so you can interact. NOT for verification.
  • browser_click, browser_type, browser_fill_form, browser_select_option — interact with the page
  • browser_navigate — go to URLs
  • browser_wait_for — wait for page updates

For verification (looking at the screen):

  • browser_take_screenshot — this is how you see. Use it constantly.

Example Session

User: "Verify task 3.1 against localhost:5173"

  1. Read docs/tasks/task-3.1.md, extract acceptance criteria
  2. browser_navigate to http://localhost:5173
  3. browser_take_screenshot — look at the landing state, note anything off
  4. For each criterion:
    • Screenshot before
    • browser_snapshot to get refs, then interact
    • Wait for result
    • Screenshot after
    • Look at the screenshot — does it pass? Anything else look wrong?
    • Record result
  5. If an auth screen appears: auto-select the Google account, or pause for a credential-based login
  6. Generate report with all screenshots, criteria results, and any additional visual issues found