Playwright MCP Testing Guide
This skill provides a systematic checklist for visually testing and inspecting the SSL Attention Visualization app using Playwright MCP.
Workflow Instructions
IMPORTANT: Follow these instructions exactly:
- •Use the Checklist System: Work through the structured checklists systematically. Do NOT skip items or test ad-hoc.
- •Track Progress: Use your task/todo management tools to track which checklist items have been completed. Mark items as you complete them.
- •Sequential Testing: Work through the phases in order (Phase 1 → Phase 2 → Phase 3 → etc.). Within each phase, complete all checklist items before moving to the next phase.
- •Document Everything: For each checklist item:
  - •Take a browser_snapshot or browser_take_screenshot as evidence
  - •Note whether the item passed or failed
  - •If failed, stop and follow the Bug Handling Workflow below
- •Report Format: When reporting results, use this format:
    ✅ [Item description] - PASSED
    ❌ [Item description] - FAILED: [brief description of issue]
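A small helper can keep report lines consistent with the format above. This is only a minimal sketch: `formatResult` is a hypothetical name introduced for illustration, not part of the app or of Playwright MCP.

```javascript
// Hypothetical helper: builds one report line in the PASSED/FAILED format above.
function formatResult(item, passed, issue = "") {
  if (passed) {
    return `✅ ${item} - PASSED`;
  }
  // A failed item should always carry a brief description of the issue.
  return `❌ ${item} - FAILED: ${issue || "no description provided"}`;
}

console.log(formatResult("Footer is visible with correct text", true));
console.log(formatResult("Dashboard heading is visible", false, "page rendered blank"));
```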
Prerequisites
- •Start the development servers:
    ./dev.sh
- •Ensure Playwright MCP is configured in Claude Code
- •App URLs:
  - •Frontend: http://localhost:5173
  - •Backend API: http://localhost:8000
  - •API Docs: http://localhost:8000/docs
Bug Handling Workflow
CRITICAL: When any bug or visual error is detected during testing, follow this workflow exactly. Do NOT continue testing until the bug is fixed and verified.
- •STOP testing immediately - Do not continue to the next checklist item
- •Document the bug - Note the page, steps to reproduce, and expected vs actual behavior
- •Fix the bug - Make the necessary code changes
- •Restart the servers:
    # Kill existing servers
    pkill -f "uvicorn"; pkill -f "vite"
    # Restart
    ./dev.sh
- •Verify the fix with Playwright MCP:
  - •Navigate back to the same page/state where the bug occurred
  - •Confirm the bug is resolved
  - •Take a screenshot or snapshot as evidence
- •Resume testing from where you left off
This ensures bugs are caught and fixed immediately rather than accumulating a backlog of issues.
Getting Started with Playwright MCP
Navigation Commands
# Navigate to the app
mcp__playwright__browser_navigate → url: "http://localhost:5173"
# Take a snapshot (preferred over screenshot for accessibility)
mcp__playwright__browser_snapshot
# Take a screenshot for visual inspection
mcp__playwright__browser_take_screenshot
# Click an element (use ref from snapshot)
mcp__playwright__browser_click → element: "description", ref: "e123"
# Run custom JavaScript for complex interactions
mcp__playwright__browser_run_code → code: "async (page) => { ... }"
Useful Patterns
// Check for element visibility
await page.locator('text=Some Text').isVisible();
// Wait for element
await page.waitForSelector('selector');
// Get element count
await page.locator('.grid-item').count();
// Check network requests
await page.waitForResponse(response => response.url().includes('/api/'));
Testing Workflow
Phase 1: Navigation & Layout
Initial Load
- • Page loads at http://localhost:5173 without console errors
- • "SSL Attention" heading is visible in navigation
- • Navigation bar renders with all links: Gallery, Compare, Dashboard
- • Footer is visible with correct text
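The "without console errors" item can be checked from code as well as with the `mcp__playwright__browser_console_messages` tool. The following is a sketch, assuming it is adapted into the `async (page) => { ... }` form that `browser_run_code` expects; `collectLoadErrors` is a hypothetical name.

```javascript
// Sketch: loads a URL and returns any console errors or uncaught page
// errors seen during the load. The function name is illustrative only.
async function collectLoadErrors(page, url) {
  const errors = [];
  // Playwright emits "console" for console.* calls and "pageerror" for
  // uncaught exceptions thrown inside the page.
  page.on("console", (msg) => {
    if (msg.type() === "error") errors.push(msg.text());
  });
  page.on("pageerror", (err) => errors.push(err.message));
  await page.goto(url, { waitUntil: "load" });
  return errors;
}
```

An empty array means the item passed. The listeners are not removed afterwards, so repeated calls on the same page would register duplicates.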
Navigation Links
- • "Gallery" nav link is clickable and highlights when active
- • "Compare" nav link navigates to /compare
- • "Dashboard" nav link navigates to /dashboard
- • "SSL Attention" logo/title returns to home
Phase 2: Gallery Page (Home)
Layout
- • Gallery page loads at "/"
- • Page title or heading indicates gallery content
- • Image grid/list container is visible
Image Display
- • Images load and display correctly
- • Image thumbnails are appropriately sized
- • Loading states show while images fetch
- • No broken image placeholders
Interaction
- • Hovering over images shows visual feedback (if applicable)
- • Clicking an image navigates to /image/:imageId detail page
- • Pagination or infinite scroll works (if applicable)
API Integration
- • Network requests to backend succeed (check for /api/ calls)
- • Error states display gracefully if API fails
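One way to confirm that backend calls succeed is to reload the page and inspect the first `/api/` response. This is a sketch under assumptions: the `/api/` substring comes from the checklist, while the function name and the 10-second timeout are illustrative choices.

```javascript
// Sketch: reload the page, wait for the first response whose URL contains
// "/api/", and report its status. The timeout value is an assumption.
async function checkFirstApiResponse(page, timeoutMs = 10000) {
  // Start waiting before triggering the reload so the response is not missed.
  const [response] = await Promise.all([
    page.waitForResponse((r) => r.url().includes("/api/"), { timeout: timeoutMs }),
    page.reload(),
  ]);
  return { url: response.url(), status: response.status(), ok: response.ok() };
}
```

A result with `ok: false` (or a timeout) would count as a failed checklist item.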
Phase 3: Image Detail Page
Navigation
- • Can reach page by clicking image in gallery
- • URL shows correct pattern: /image/:imageId
- • Back navigation returns to gallery
Layout
- • Three-column layout: annotations (left), visualization (center), controls (right)
- • Left column (2/10) shows Annotations panel with styles and bbox list
- • Center column (5/10) displays the main image with attention overlay
- • Right column (3/10) contains control panel and metrics
Annotations Panel (Left Column)
- • Architectural styles are listed as badges
- • Number of bounding boxes is shown
- • Scrollable list of bboxes with dimensions (% width x % height)
- • Clicking bbox in list selects it (when bboxes are shown)
- • Hint text "Click a bounding box to see feature similarity heatmap" appears when bboxes are shown
Attention Viewer (Center Column)
- • Main image displays with attention heatmap overlay
- • Overlay toggle button appears on hover (top-right of image)
- • Clicking toggle switches between attention overlay and original image
- • Info badges display: current model, layer number
- • Colormap legend is visible below the image
Control Panel (Right Column)
- • Model selector dropdown shows all 6 models (dinov2, dinov3, mae, clip, siglip2, resnet50)
- • Layer slider range adjusts per model (e.g., 0-11 for ViTs, 0-3 for ResNet-50)
- • Percentile threshold selector shows options: 90%, 85%, 80%, 70%, 60%, 50%
- • "Show Bounding Boxes" toggle is present (defaults to ON)
- • Changing model updates the attention visualization
- • Changing layer updates the attention visualization
- • Changing percentile updates the attention overlay
- • Tooltip help icons (?) appear next to each control label
- • Hovering tooltip icon shows educational explanation
- • Attention Method dropdown appears for models with multiple methods (e.g., DINOv2: CLS Attention, Attention Rollout)
- • Attention Method dropdown is hidden for single-method models (e.g., SigLIP shows only Mean Attention)
- • Changing attention method updates the attention visualization and metrics
Similarity Heatmap Controls (Right Column - below Control Panel)
- • "Similarity Heatmap" section header is visible
- • Heatmap Style dropdown shows options: Smooth Gradient, Squares, Circles
- • Opacity slider allows adjusting heatmap transparency (20%-90%)
- • Changing heatmap style updates the similarity visualization when bbox selected
- • Tooltip help icons (?) appear next to Heatmap Style and Opacity controls
Layer Animation Slider
- • Play/Pause button is visible
- • Navigation buttons: |< (first), < (prev), > (next), >| (last)
- • Play button auto-cycles through the selected model's layers (e.g., 0-11 for ViTs)
- • Animation STOPS at last layer (does NOT loop back to 0)
- • Clicking Play when at last layer resets to layer 0 and starts playing
- • Pause button stops the animation
- • "Early layers" and "Late layers" labels are visible
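The stop-at-end and reset behavior above can be written down as a small reference model to compare against what the slider actually does. This is a sketch of the expected behavior as described in the checklist, not the app's implementation; `expectedPlaySequence` is a hypothetical name.

```javascript
// Reference model: the sequence of layers the slider should show after Play
// is pressed, given the current layer and the model's last layer index.
function expectedPlaySequence(startLayer, lastLayer) {
  // Pressing Play at the last layer resets to layer 0 first.
  const from = startLayer >= lastLayer ? 0 : startLayer;
  const sequence = [];
  for (let layer = from; layer <= lastLayer; layer++) {
    sequence.push(layer);
  }
  // The sequence ends at lastLayer: the animation stops and does not loop.
  return sequence;
}

console.log(expectedPlaySequence(9, 11));  // layers 9, 10, 11
console.log(expectedPlaySequence(11, 11)); // full run from 0 to 11
```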
IoU Metrics Display
- • IoU Score card is displayed with value
- • Max IoU progress bar shows IoU relative to theoretical maximum
- • Progress bar is color-coded: green ≥75%, yellow ≥50%, orange ≥25%, red <25% of max
- • "X% of theoretical max" label appears below progress bar
- • Coverage percentage card is shown
- • Tooltip help icons (?) appear on IoU Score and Coverage cards
- • Metrics update when model/layer/percentile changes
- • Selecting a bbox updates metrics to show per-bbox IoU/Coverage
- • Green context indicator shows "Showing metrics for: [bbox name]" when bbox selected
- • Deselecting bbox reverts to union-of-all-bboxes metrics instantly
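The color thresholds for the Max IoU progress bar can be expressed as a small function for checking the displayed color against the displayed values. This is a sketch based only on the thresholds listed above; the function name and the `null` return for a non-positive maximum are assumptions.

```javascript
// Maps an IoU value to the progress-bar color from the checklist, based on
// its share of the theoretical maximum IoU.
function maxIouBarColor(iou, maxIou) {
  if (!(maxIou > 0)) return null; // no meaningful ratio without a positive max
  const percentOfMax = (iou / maxIou) * 100;
  if (percentOfMax >= 75) return "green";
  if (percentOfMax >= 50) return "yellow";
  if (percentOfMax >= 25) return "orange";
  return "red";
}

console.log(maxIouBarColor(0.5, 1)); // 50% of max, so "yellow"
```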
Bounding Box Interaction
- • Bounding boxes are shown by default (toggle defaults ON)
- • Bbox overlays render on the image
- • Clicking a bbox in the annotations list selects it
- • Selected bbox is highlighted (green) in both image and annotations list
- • Feature similarity heatmap loads when bbox is selected
- • Per-bbox metrics update in the Metrics card when bbox is selected
- • Clicking outside or re-clicking deselects the bbox
Navigation Links
- • "Compare Models" link navigates to /compare with current image pre-selected
Phase 4: Compare Page
Layout
- • Compare page loads at "/compare"
- • Page layout supports side-by-side comparison
- • Selection controls are visible
Image Selection
- • Can select first image for comparison
- • Can select second image for comparison
- • Selected images display clearly
Comparison Features
- • Both images render side-by-side
- • Attention patterns can be compared visually
- • Sync/lock view controls work (if applicable)
- • Difference visualization renders (if applicable)
- • SigLIP heatmaps render correctly (not 404 errors)
- • ResNet-50 heatmaps render correctly (not 404 errors)
Phase 5: Dashboard Page
Initial Load (CRITICAL - check for blank screen!)
- • Dashboard page loads at "/dashboard"
- • Page is NOT blank - verify actual content renders
- • "Dashboard" heading is visible
- • Loading states show while data fetches (not indefinite spinner)
Model Leaderboard (Left Sidebar)
- • Leaderboard card is visible with heading
- • Ranked list shows all 6 models (dinov2, dinov3, mae, clip, siglip2, resnet50)
- • Top 3 models have medal badges (gold #1, silver #2, bronze #3)
- • Each row shows: model name, best layer, IoU score
- • Clicking a model row selects it
Layer Progression Chart (Main Area)
- • Line chart is visible (not blank/missing)
- • X-axis shows layers (L0-L11)
- • Y-axis shows IoU range (0-1)
- • Multiple colored lines render (one per model)
- • Legend identifies each model
IoU by Architectural Style (Bottom Left)
- • Bar chart is visible (not blank/missing)
- • Bars display IoU values per style
- • Style names are readable
- • "No style data available" message shows if no data
Feature Type Breakdown (Bottom Right)
- • "Feature Type Breakdown" card is visible
- • Search box for filtering features is present
- • Sort controls (IoU, Count, Name) are visible and functional
- • Feature list shows feature name, IoU score (color-coded), bbox count
- • IoU scores are color-coded: green >= 0.6, yellow >= 0.4, orange >= 0.2, red < 0.2
- • "Show more" button appears when more features available
- • Clicking sort button changes list ordering
Quick Actions Card
- • "Browse Images" link navigates to gallery
- • "Compare Models" link navigates to compare page
- • Pre-computation notice is visible (yellow alert)
Percentile Threshold Control
- • Dropdown at top shows percentile options
- • Changing percentile updates leaderboard
- • Changing percentile updates charts
Phase 6: Desktop Layout Verification
Standard Desktop (1280px width)
- • Full multi-column layouts display correctly
- • Gallery shows proper column grid
- • Image detail shows three-column layout (annotations + visualization + controls)
- • Dashboard shows leaderboard + charts side-by-side
- • Maximum content width is respected
- • Adequate whitespace and spacing
Wide Desktop (1920px width)
- • Layout remains centered and readable
- • No excessive stretching of content
- • Charts and visualizations scale appropriately
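Both desktop widths can be exercised in one pass by resizing the viewport and checking for horizontal overflow. This is a sketch, assuming a viewport height of 900px (the checklist only specifies widths) and a hypothetical function name; an overflow check does not replace visual inspection of a screenshot.

```javascript
// Sketch: resize the viewport to each desktop width from the checklist and
// report whether the document overflows horizontally at that width.
async function checkDesktopWidths(page, widths = [1280, 1920], height = 900) {
  const results = [];
  for (const width of widths) {
    await page.setViewportSize({ width, height });
    // Runs in the browser: compares content width with the visible width.
    const overflows = await page.evaluate(
      () => document.documentElement.scrollWidth > document.documentElement.clientWidth
    );
    results.push({ width, overflows });
  }
  return results;
}
```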
Phase 7: Error Handling
Network Errors
- • Stopping backend shows appropriate error state (not blank page)
- • Error messages are user-friendly
- • Retry or refresh guidance is provided
Invalid Routes
- • Navigating to /invalid-route shows 404 or redirects to home
- • Invalid image ID (/image/nonexistent) handled gracefully
Broken Links (CRITICAL - catch links to undefined routes!)
- • All navigation links in the app lead to valid, rendering pages
- • No internal links result in blank screens
- • Check console for React Router warnings about unmatched routes
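A blank-screen check across routes can be scripted as a first pass. This is a sketch under assumptions: the default route list is taken from the pages named in this guide, the function name is hypothetical, and "no visible body text" is only a rough proxy for a blank page.

```javascript
// Sketch: visit each route and flag pages whose <body> has no visible text.
async function findBlankRoutes(page, baseUrl, routes = ["/", "/compare", "/dashboard"]) {
  const blank = [];
  for (const route of routes) {
    await page.goto(baseUrl + route, { waitUntil: "networkidle" });
    const text = await page.locator("body").innerText();
    if (text.trim().length === 0) blank.push(route);
  }
  return blank;
}
```

An empty result suggests every listed route rendered something; each page should still be confirmed with a snapshot.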
Edge Cases
- • Empty states display when no data (not blank)
- • Missing API data shows fallback UI (not crash)
- • Very long text doesn't break layout
- • Special characters render correctly
Phase 8: Performance
- • Initial page load feels responsive (< 3 seconds)
- • Navigation between pages is instant
- • Images lazy-load appropriately
- • No visible layout shifts during load
- • No memory leaks during extended use
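The 3-second load target can be given a rough number by timing a navigation. This is a sketch only: `measureLoad` is a hypothetical name, and wall-clock timing through Playwright includes automation overhead, so the figure is indicative rather than a benchmark.

```javascript
// Sketch: time a navigation and compare it with the 3-second budget from
// the checklist. Date.now() gives wall-clock milliseconds.
async function measureLoad(page, url, budgetMs = 3000) {
  const start = Date.now();
  await page.goto(url, { waitUntil: "load" });
  const elapsedMs = Date.now() - start;
  return { elapsedMs, withinBudget: elapsedMs < budgetMs };
}
```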
Quick Smoke Test Checklist
For rapid testing, verify these critical paths:
- • App loads at localhost:5173
- • Gallery page shows images (grid is populated)
- • Can click image to view detail page
- • Image detail shows attention visualization overlay
- • Control panel controls work (model, layer, percentile dropdowns)
- • Tooltip help icons (?) appear next to control labels
- • Attention Method dropdown appears for DINOv2 (CLS Attention / Attention Rollout)
- • Layer Play button stops at last layer (doesn't loop)
- • Heatmap Style dropdown (Smooth/Squares/Circles) visible in Similarity section
- • Per-bbox metrics update when bbox selected (green context indicator)
- • Max IoU progress bar shows IoU relative to theoretical max
- • Compare page loads and allows image selection
- • Dashboard page renders content (NOT blank screen!)
- • Dashboard shows leaderboard, charts, and Feature Type Breakdown
- • Feature Type Breakdown shows searchable feature list with IoU scores
- • Navigation between all pages works
- • No console errors throughout
- • No blank pages anywhere in the app
Reporting Issues
When documenting bugs, include:
- •Page/Route where issue occurred
- •Steps to reproduce
- •Expected behavior
- •Actual behavior
- •Screenshot or snapshot (use mcp__playwright__browser_take_screenshot)
- •Console messages (use mcp__playwright__browser_console_messages)
Notes
- •Playwright's mouse simulation may not perfectly replicate human interaction
- •Use browser_run_code for complex JavaScript interactions
- •Always take a browser_snapshot before clicking to get accurate element refs
- •The browser_snapshot tool is preferred over screenshots for accessibility testing
Checklist Summary
When instructed to perform Playwright testing, follow this workflow:
1. Start dev servers (./dev.sh)
2. Navigate to http://localhost:5173
3. Work through phases sequentially:

   PHASE 1: Navigation & Layout
   └── Initial Load (4 items)
   └── Navigation Links (4 items)

   PHASE 2: Gallery Page
   └── Layout (3 items)
   └── Image Display (4 items)
   └── Interaction (3 items)
   └── API Integration (2 items)

   PHASE 3: Image Detail Page
   └── Navigation (3 items)
   └── Layout (4 items) - three-column layout
   └── Annotations Panel (5 items) - left column
   └── Attention Viewer (5 items) - center column
   └── Control Panel (12 items) - includes tooltips, attention method
   └── Similarity Heatmap Controls (5 items) - style dropdown, opacity
   └── Layer Animation Slider (7 items) - includes stop-at-end behavior
   └── IoU Metrics Display (10 items) - includes max IoU bar, per-bbox metrics
   └── Bounding Box Interaction (7 items) - includes per-bbox metrics
   └── Navigation Links (1 item)

   PHASE 4: Compare Page
   └── Layout (3 items)
   └── Image Selection (3 items)
   └── Comparison Features (6 items) - includes SigLIP/ResNet-50 verification

   PHASE 5: Dashboard Page - CRITICAL (check for blank screen!)
   └── Initial Load (4 items)
   └── Model Leaderboard (5 items)
   └── Layer Progression Chart (5 items)
   └── IoU by Architectural Style (4 items)
   └── Feature Type Breakdown (7 items) - new component
   └── Quick Actions Card (3 items)
   └── Percentile Threshold Control (3 items)

   PHASE 6: Desktop Layout Verification
   └── Standard Desktop 1280px (6 items)
   └── Wide Desktop 1920px (3 items)

   PHASE 7: Error Handling
   └── Network Errors (3 items)
   └── Invalid Routes (2 items)
   └── Broken Links - CRITICAL (3 items)
   └── Edge Cases (4 items)

   PHASE 8: Performance (5 items)

4. For each item: test → document result (✅/❌) → if failed, STOP and fix
5. Provide final summary report with all results
Total checklist items: ~150