HarshJudge E2E Testing

Name: harshjudge
Rating: 62
Author: HuskyDanny

AI-native E2E testing with MCP tools and visual evidence capture.

Core Principles

•Evidence First: Screenshot before and after every action
•Fail Fast: Stop on error, report with context
•Complete Runs: Always call completeRun, even on failure
•Step Isolation: Each step executes in its own spawned agent for token efficiency
•Knowledge Accumulation: Learnings go to prd.md, not scenarios

Step-Based Execution

HarshJudge uses a step-based agent pattern for token-efficient test execution:

code

Main Agent                    Step Agents (spawned per step)
    │
    ├─ startRun(scenarioSlug)
    │      ↓
    │  Returns: runId, steps[]
    │
    ├─► Spawn Agent: Step 01 ──────────────────────► Execute actions
    │      │                                              │
    │      │ ◄─────────────────────────────────── Return: { status, evidencePaths }
    │      │
    │   completeStep(runId, "01", status)
    │      │
    ├─► Spawn Agent: Step 02 ──────────────────────► Execute actions
    │      │                                              │
    │      │ ◄─────────────────────────────────── Return: { status, evidencePaths }
    │      │
    │   completeStep(runId, "02", status)
    │      │
    │   ... (repeat for each step)
    │
    └─ completeRun(runId, finalStatus)

Benefits:

•Each step agent has isolated context (no token accumulation)
•Large outputs (screenshots, logs) saved to files, not returned
•Main agent only receives concise summaries
•Automatic token optimization without manual management

Workflows

Intent	Reference	Key Tools
Initialize project	references/setup.md	`initProject`
Create scenario	references/create.md	`createScenario`
Run scenario	references/run.md	`startRun`, `completeStep`, `completeRun`
Fix failed test	references/iterate.md	`getStatus`, `createScenario`
Check status	references/status.md	`getStatus`

Project Structure

code

.harshJudge/
  config.yaml              # Project configuration
  prd.md                   # Product requirements (from assets/prd.md template)
  scenarios/{slug}/
    meta.yaml              # Scenario definition + run statistics
    steps/                 # Individual step files
      01-step-slug.md      # Step 01 details
      02-step-slug.md      # Step 02 details
      ...
    runs/{runId}/          # Run history
      result.json          # Run result with per-step data
      step-01/evidence/    # Step 01 evidence
      step-02/evidence/    # Step 02 evidence
      ...
  snapshots/               # Inspection tool outputs (token-saving pattern)

Quick Reference

HarshJudge MCP Tools

Tool	Purpose
`initProject`	Initialize project (spawns dashboard)
`createScenario`	Create/update scenario with step files
`toggleStar`	Toggle/set scenario starred status
`startRun`	Start test run, returns step list
`recordEvidence`	Capture evidence for a step
`completeStep`	Complete a step, get next step ID
`completeRun`	Finalize run with status
`getStatus`	Check project or scenario status
`openDashboard` / `closeDashboard`	Manage dashboard server

Playwright MCP Tools

Tool	Purpose
`browser_navigate`	Navigate to URL
`browser_snapshot`	Get accessibility tree (use before click/type)
`browser_click`	Click element using ref
`browser_type`	Type into input using ref
`browser_take_screenshot`	Capture screenshot for evidence
`browser_console_messages`	Get console logs
`browser_network_requests`	Get network activity
`browser_wait_for`	Wait for text/condition

Step Agent Prompt Template

When spawning an agent for each step:

code

Execute step {stepId} of scenario {scenarioSlug}:

## Step Content
{content from steps/{stepId}-{slug}.md}

## Project Context
Base URL: {from config.yaml}
Auth: {from prd.md if needed}

## Previous Step
Status: {pass|fail|first step}

## Your Task
1. Execute the actions using Playwright MCP tools
2. Use browser_snapshot before clicking to get element refs
3. Capture before/after screenshots using browser_take_screenshot
4. Record evidence using recordEvidence with step={stepNumber}

Return ONLY a JSON object:
{
  "status": "pass" | "fail",
  "evidencePaths": ["path1.png", "path2.png"],
  "error": null | "error message"
}

DO NOT return full evidence content. DO NOT explain your work.

Error Handling

On ANY error:

•STOP - Do not proceed
•Report - Tool, params, error, resolution
•Check prd.md - Is this a known pattern?
•Do NOT retry - Unless user instructs