Run Tests
Execute the test plan from a PR and fix failures.
Input
PR number: $ARGUMENTS
Steps
1. Get PR Details
bash
gh pr view $ARGUMENTS --json body,headRefName --jq '.body'
2. Check Out the PR Branch
bash
gh pr checkout $ARGUMENTS
3. Extract Test Plan
Find the "Test Plan" section in the PR body and extract all - [ ] checkbox items.
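As a rough sketch of this step, the checkbox items can be pulled out with awk. This assumes the section is introduced by a markdown heading containing "Test Plan" and ends at the next heading; `extract_test_plan` is a hypothetical helper name, not part of gh.

```shell
# Print the unchecked checkbox items found under the "Test Plan" heading.
# Reads the PR body (markdown) on stdin. Assumes the section runs until the next heading.
extract_test_plan() {
  awk '
    /^#+[ \t]/ {
      in_plan = (tolower($0) ~ /test plan/)   # entering or leaving the section
      next
    }
    in_plan && /^[ \t]*- \[ \] / {
      sub(/^[ \t]*- \[ \] /, "")              # strip the checkbox marker
      print
    }
  '
}

# Possible usage (requires gh and network access):
#   gh pr view "$ARGUMENTS" --json body --jq '.body' | extract_test_plan
```

Items that are already ticked (- [x]) are deliberately left out, since only open items still need to be run.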
4. Detect Project Type
Determine the project type by checking package.json, requirements.txt, or other config files. Classify as: web app with UI, API-only, CLI tool, or library.
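One way to approximate this classification is a file-based heuristic. The sketch below is only illustrative: the dependency names it looks for (react, express, fastapi, click, and so on) and the helper name `detect_project_type` are assumptions, and a real project may need additional checks.

```shell
# Guess the project type from config files in a directory (default: current dir).
# Prints one of: web-ui, api, cli, library, unknown.
# The dependency lists below are illustrative assumptions, not an exhaustive rule set.
detect_project_type() {
  dir="${1:-.}"
  if [ -f "$dir/package.json" ]; then
    if grep -Eq '"(react|vue|svelte|next|@angular/core)"' "$dir/package.json"; then
      echo "web-ui"      # a front-end framework suggests a web app with UI
    elif grep -Eq '"(express|fastify|koa)"' "$dir/package.json"; then
      echo "api"         # a server framework without a UI framework
    elif grep -q '"bin"' "$dir/package.json"; then
      echo "cli"         # a "bin" entry exposes an executable
    else
      echo "library"
    fi
  elif [ -f "$dir/requirements.txt" ]; then
    if grep -Eiq '^(fastapi|flask|django)' "$dir/requirements.txt"; then
      echo "api"
    elif grep -Eiq '^(click|typer)' "$dir/requirements.txt"; then
      echo "cli"
    else
      echo "library"
    fi
  else
    echo "unknown"
  fi
}
```

When the heuristic prints unknown, inspecting other config files by hand is still needed before choosing how to run the tests.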
5. Execute Each Test
Actually run each test item; do not simulate results:
- Unit/integration tests: Run the project's test suite (pytest, npm test, vitest, etc.)
- API endpoints: Start servers, make real HTTP requests with curl, verify responses
- CLI tools: Run actual commands, check output and exit codes
- Web apps with UI: Use Playwright in headless mode to verify browser behavior:
  - Install Playwright if needed:
    npm install -D playwright && npx playwright install chromium
  - Start the dev server in background
  - Navigate pages, click buttons, fill forms, verify DOM state
  - Check rendering, interactions, navigation, error states
  - Stop the server and clean up
- Database: Query and verify schema/data
Browser-based UI tests must NOT be skipped. Use Playwright headless to verify them.
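To make "actually run" concrete, each item can be wrapped in a small helper that executes a real command and derives the result from its exit code. This is a minimal sketch; `run_item` is a hypothetical name, and the example URL in the comments is a placeholder rather than anything defined by the project.

```shell
# Run one test-plan item for real and print a result line based on the exit code.
# Usage: run_item "<description>" <command> [args...]
run_item() {
  desc="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS | $desc"
  else
    status=$?                      # exit status of the command that just failed
    echo "FAIL | $desc | exit code $status"
  fi
}

# Possible usage (commands and URL are illustrative):
#   run_item "Unit tests pass" npm test
#   run_item "Health endpoint responds" curl --fail --silent http://localhost:3000/health
```

Command output is discarded here to keep the result line short; when diagnosing a failure, re-running the command without the redirection shows the full log.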
6. Report Results
Output in this format:
code
TEST_RESULTS_START
1. PASS | <description>
2. FAIL | <description> | <reason>
3. SKIP | <description> | <reason>
TEST_RESULTS_END
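If result lines are collected one per line, a short filter can number them and add the two markers. This sketch assumes each entry sits on its own line between TEST_RESULTS_START and TEST_RESULTS_END; `format_results` is a hypothetical helper name.

```shell
# Wrap result lines (one per line on stdin) in the report markers and number them.
format_results() {
  echo "TEST_RESULTS_START"
  awk '{ printf "%d. %s\n", NR, $0 }'   # NR gives the 1-based item number
  echo "TEST_RESULTS_END"
}

# Possible usage:
#   printf 'PASS | Unit tests pass\nSKIP | Load test | no staging env\n' | format_results
```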
7. Fix Failures (if any)
If tests failed:
- Analyze and fix the implementation code (not test expectations)
- Commit:
  git add -A && git commit -m "Fix failing tests"
- Push:
  git push
- Re-run failing tests
- Retry up to 2 times
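The fix-and-retry cycle above could be sketched as a bounded loop. Here `retry_with_fix` and its `fix_hook` argument are hypothetical: the hook stands in for the "analyze, fix, commit, push" work, which cannot be scripted generically, and the sketch reads "retry up to 2 times" as at most two fix attempts.

```shell
# Re-run a failing check after each fix attempt, up to max_retries attempts.
# fix_hook is a placeholder for the analyze/fix/commit/push step; it receives the attempt number.
# Usage: retry_with_fix <max_retries> <fix_hook> <check command> [args...]
retry_with_fix() {
  max_retries="$1"
  fix_hook="$2"
  shift 2
  attempt=0
  while [ "$attempt" -lt "$max_retries" ]; do
    attempt=$((attempt + 1))
    "$fix_hook" "$attempt"
    if "$@"; then
      echo "passed after $attempt fix attempt(s)"
      return 0
    fi
  done
  echo "still failing after $max_retries fix attempt(s)"
  return 1
}
```

Bounding the loop keeps a stubborn failure from consuming the whole run; an item that still fails after the last attempt stays reported as FAIL.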
8. Update PR Body
Update the PR checkboxes with pass/fail status using gh pr edit.
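A possible way to do this is to rewrite the body text locally and pass the result back to gh. The sketch below only ticks passing items and leaves failing ones unchecked, which is an assumption about how status should be shown; `mark_passed` is a hypothetical helper name.

```shell
# Tick the checkbox whose text exactly matches a passing item.
# Reads the PR body on stdin and writes the updated body to stdout.
# Usage: mark_passed "<exact item text>"
mark_passed() {
  item="$1" awk '
    {
      target = "- [ ] " ENVIRON["item"]
      pos = index($0, target)
      # Require the line to end with the item text, so longer items sharing a prefix stay untouched.
      if (pos > 0 && substr($0, pos) == target) {
        $0 = substr($0, 1, pos - 1) "- [x] " ENVIRON["item"]
      }
      print
    }
  '
}

# Possible usage (requires gh and network access; body.md is a scratch file):
#   gh pr view "$ARGUMENTS" --json body --jq '.body' | mark_passed "Unit tests pass" > body.md
#   gh pr edit "$ARGUMENTS" --body-file body.md
```

The item text is passed through the environment rather than spliced into the awk program, so characters such as slashes or brackets in a description are treated literally.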
Output
Report how many tests passed, failed, and were skipped.
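The final counts can be derived from the results block itself. This sketch assumes the numbered one-entry-per-line layout between the TEST_RESULTS_START and TEST_RESULTS_END markers; `summarize_results` is a hypothetical helper name.

```shell
# Count PASS, FAIL, and SKIP entries in a results block read from stdin.
# Each entry is expected to look like "<n>. STATUS | <description> ...".
summarize_results() {
  awk '
    $2 == "PASS" { pass++ }
    $2 == "FAIL" { fail++ }
    $2 == "SKIP" { skip++ }
    END { printf "passed=%d failed=%d skipped=%d\n", pass, fail, skip }
  '
}
```

The marker lines have no second field, so they pass through the filter without affecting the totals.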