Run Tests

Execute the test plan from a PR and fix failures.

Input

PR number: $ARGUMENTS

Steps

1. Get PR Details

bash

gh pr view "$ARGUMENTS" --json body --jq '.body'

2. Check Out the PR Branch

bash

gh pr checkout $ARGUMENTS

3. Extract Test Plan

Find the "Test Plan" section in the PR body and extract every unchecked checkbox item, that is, every line that begins with - [ ].
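The extraction can be sketched in shell as follows. This is a minimal sketch, not a definitive parser: extract_test_plan is a hypothetical helper name, and it assumes the section starts at a Markdown heading containing "Test Plan" and ends at the next heading.

```shell
# Hypothetical helper: read a PR body on stdin and print the text of each
# unchecked "- [ ]" item found under the "Test Plan" heading.
# Assumption: the section is delimited by Markdown headings ("#", "##", ...).
extract_test_plan() {
  awk '
    tolower($0) ~ /^#+[ \t]*test plan/ { in_plan = 1; next }  # enter the section
    /^#+[ \t]/                         { in_plan = 0 }        # another heading ends it
    in_plan && /^[ \t]*[-*] \[ \]/ {
      sub(/^[ \t]*[-*] \[ \][ \t]*/, "")                      # drop the checkbox prefix
      print
    }
  '
}

# Possible usage (requires the GitHub CLI and a real PR number):
#   gh pr view "$ARGUMENTS" --json body --jq '.body' | extract_test_plan
```

Items that are already checked (- [x]) are left out on purpose, since only unchecked items still need to be run.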

4. Detect Project Type

Determine the project type by checking package.json, requirements.txt, or other config files. Classify as: web app with UI, API-only, CLI tool, or library.
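One way to approximate this classification in shell is sketched below. The function name detect_project_type, the output labels, and the dependency names it looks for are all illustrative assumptions; a real project may need different or additional signals.

```shell
# Hypothetical heuristic: classify a project directory as web-ui, api, cli,
# library, or unknown. The dependency lists are assumptions, not a complete set.
detect_project_type() {
  dir="${1:-.}"
  if [ -f "$dir/package.json" ]; then
    if grep -Eq '"(react|vue|svelte|next)"[[:space:]]*:' "$dir/package.json"; then
      echo "web-ui"      # a front-end framework is listed as a dependency
    elif grep -Eq '"(express|fastify|koa)"[[:space:]]*:' "$dir/package.json"; then
      echo "api"         # an HTTP server framework without a UI framework
    elif grep -Eq '"bin"[[:space:]]*:' "$dir/package.json"; then
      echo "cli"         # the package exposes an executable
    else
      echo "library"
    fi
  elif [ -f "$dir/requirements.txt" ]; then
    if grep -Eiq '^(flask|fastapi|django)([^a-z0-9_-]|$)' "$dir/requirements.txt"; then
      echo "api"         # assumption: Python web frameworks are treated as API servers
    elif grep -Eiq '^(click|typer)([^a-z0-9_-]|$)' "$dir/requirements.txt"; then
      echo "cli"
    else
      echo "library"
    fi
  else
    echo "unknown"       # no recognized config file; inspect the repository manually
  fi
}
```

The result is only a starting point. A Django or Flask project that renders HTML pages, for example, should still be treated as a web app with UI.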

5. Execute Each Test

Run each test item for real; do not simulate or assume results. How to run an item depends on what it covers:

•Unit/integration tests: Run the project's test suite (pytest, npm test, vitest, etc.)
•API endpoints: Start servers, make real HTTP requests with curl, verify responses
•CLI tools: Run actual commands, check output and exit codes
•Web apps with UI: Use Playwright in headless mode to verify browser behavior:
  •Install Playwright if needed: npm install -D playwright && npx playwright install chromium
  •Start the dev server in the background
  •Navigate pages, click buttons, fill forms, verify DOM state
  •Check rendering, interactions, navigation, error states
  •Stop the server and clean up
•Database: Query and verify schema/data

Browser-based UI tests must NOT be skipped. Use Playwright headless to verify them.
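For command-based checks, the "really run it" rule can be sketched as a small wrapper that decides pass or fail from the exit code. run_check is a hypothetical helper name, and the commands and URL in the usage comments are placeholders rather than part of any particular project.

```shell
# Hypothetical helper: run one test command for real and print a result line
# in "STATUS | description | reason" form, based on the command's exit code.
run_check() {
  description="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS | %s\n' "$description"
  else
    status=$?   # exit code of the command that just failed
    printf 'FAIL | %s | exit code %s\n' "$description" "$status"
  fi
}

# Possible usage (commands and URL are placeholders):
#   run_check "Unit tests pass" npm test
#   run_check "Health endpoint responds" curl --fail --silent http://localhost:3000/health
```

Output is discarded here to keep the result lines clean. When a check fails, re-running the command without the wrapper shows the full error output.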

6. Report Results

Output in this format:

TEST_RESULTS_START
1. PASS | <description>
2. FAIL | <description> | <reason>
3. SKIP | <description> | <reason>
TEST_RESULTS_END
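A minimal sketch of producing this block, assuming the individual result lines already exist in "STATUS | description | reason" form, one per line. format_results is a hypothetical helper name.

```shell
# Hypothetical helper: number the result lines read from stdin and wrap them
# in the TEST_RESULTS_START / TEST_RESULTS_END markers.
format_results() {
  echo "TEST_RESULTS_START"
  awk '{ printf "%d. %s\n", NR, $0 }'   # prefix each line with its 1-based index
  echo "TEST_RESULTS_END"
}

# Possible usage:
#   printf 'PASS | Unit tests pass\nFAIL | Login works | 500 from /login\n' | format_results
```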

7. Fix Failures (if any)

If tests failed:

•Analyze and fix the implementation code (not test expectations)
•Commit: git add -A && git commit -m "Fix failing tests"
•Push: git push
•Re-run failing tests
•Retry up to 2 times
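The retry policy above can be sketched as a loop with one initial run followed by at most two fix-and-rerun cycles. run_with_retries and MAX_RETRIES are hypothetical names, and fix_cmd stands in for the manual "analyze and fix" step plus the commit and push; it is an assumption of this sketch, not an automated repair.

```shell
# Sketch of the retry policy: one initial run plus up to MAX_RETRIES re-runs.
MAX_RETRIES=2

# $1: shell command that runs the failing tests
# $2: shell command standing in for the fix step (hypothetical hook)
run_with_retries() {
  test_cmd="$1"
  fix_cmd="$2"
  attempt=0
  while ! sh -c "$test_cmd"; do
    if [ "$attempt" -ge "$MAX_RETRIES" ]; then
      echo "still failing after $MAX_RETRIES retries"
      return 1
    fi
    attempt=$((attempt + 1))
    sh -c "$fix_cmd"          # apply the fix, then loop back and re-run the tests
  done
  echo "passed after $attempt retry attempt(s)"
}

# Possible usage (the fix itself is still done by hand before committing):
#   run_with_retries "npm test" 'git add -A && git commit -m "Fix failing tests" && git push'
```

Capping the loop keeps a persistent failure from producing an endless series of commits. Anything still failing after the last retry is reported as FAIL.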

8. Update PR Body

Update the PR checkboxes with pass/fail status using gh pr edit.
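One possible way to do this is to rewrite the body locally and push it back. The sketch below assumes that a passing item is shown by ticking its checkbox and that a failing item is left unchecked; mark_passed and the file name updated_body.md are hypothetical.

```shell
# Hypothetical helper: read a PR body on stdin and tick the checkbox whose
# text exactly matches $1. All other lines pass through unchanged.
# Assumption: items are written as "- [ ] <text>" with no leading indentation,
# and <text> contains no backslashes (awk -v would interpret them as escapes).
mark_passed() {
  awk -v item="$1" '
    $0 == ("- [ ] " item) { print "- [x] " item; next }
    { print }
  '
}

# Possible usage (requires the GitHub CLI and a real PR number):
#   gh pr view "$ARGUMENTS" --json body --jq '.body' \
#     | mark_passed "Unit tests pass" > updated_body.md
#   gh pr edit "$ARGUMENTS" --body-file updated_body.md
```

Matching on the exact line avoids regular-expression escaping problems when an item's text contains characters such as parentheses or dots.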

Output

Report how many tests passed, failed, and were skipped.
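The final counts can be derived from the results block itself. The sketch below assumes the numbered "N. STATUS | ..." format shown in the Report Results step; summarize_results is a hypothetical helper name.

```shell
# Hypothetical helper: count PASS, FAIL, and SKIP lines in a results block
# read from stdin and print a one-line summary.
summarize_results() {
  awk '
    /^[0-9]+\. PASS \|/ { pass++ }
    /^[0-9]+\. FAIL \|/ { fail++ }
    /^[0-9]+\. SKIP \|/ { skip++ }
    END { printf "Passed: %d, Failed: %d, Skipped: %d\n", pass, fail, skip }
  '
}
```

The marker lines (TEST_RESULTS_START and TEST_RESULTS_END) do not match any pattern, so the whole block can be piped in unchanged.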