playwright-mcp

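Professional browser automation with the Playwright MCP tools. Supports web scraping, form filling, automated testing, screenshot capture, PDF generation, and page interaction. Use it when automating browser operations, extracting web data, filling forms, testing web applications, taking screenshots, or interacting with web pages programmatically.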

SKILL.md
---
name: playwright-mcp
description: Expert browser automation using Playwright MCP tools. Enables web scraping, form filling, automated testing, screenshot capture, PDF generation, and page interaction. Use when automating browsers, extracting web data, filling forms, testing web apps, taking screenshots, or interacting with web pages programmatically.
---

Playwright Browser Automation Expert

Expert guidance for browser automation using Playwright MCP tools. This skill provides comprehensive capabilities for web scraping, form automation, testing, screenshot capture, and programmatic browser control.

Core Capabilities

  1. Navigation & Page Control - Navigate URLs, manage tabs, resize windows
  2. Element Interaction - Click, type, hover, drag, select options
  3. Form Automation - Fill forms, upload files, submit data
  4. Content Extraction - Accessibility snapshots, screenshots, console logs
  5. Testing & Verification - Verify elements, text, values
  6. Advanced Features - PDF generation, tracing, custom JavaScript

Quick Reference - Available MCP Tools

Navigation Tools

| Tool | Purpose |
| --- | --- |
| browser_navigate | Navigate to a URL |
| browser_navigate_back | Go back to previous page |
| browser_tabs | List, create, close, or select tabs |
| browser_close | Close the current page |
| browser_resize | Resize browser window |

Interaction Tools

| Tool | Purpose |
| --- | --- |
| browser_click | Click on elements |
| browser_type | Type text into fields |
| browser_fill_form | Fill multiple form fields at once |
| browser_select_option | Select dropdown options |
| browser_hover | Hover over elements |
| browser_drag | Drag and drop between elements |
| browser_press_key | Press keyboard keys |
| browser_file_upload | Upload files |
| browser_handle_dialog | Handle alert/confirm/prompt dialogs |

Information Retrieval Tools

| Tool | Purpose |
| --- | --- |
| browser_snapshot | Get accessibility tree snapshot (preferred over screenshot) |
| browser_take_screenshot | Capture visual screenshot |
| browser_console_messages | Get console log messages |
| browser_network_requests | Get all network requests |

Advanced Tools

| Tool | Purpose |
| --- | --- |
| browser_evaluate | Execute JavaScript on page |
| browser_run_code | Run Playwright code snippets |
| browser_wait_for | Wait for text/element/time |
| browser_install | Install browser if missing |

Instructions

Step 1: Understanding the Page (ALWAYS DO FIRST)

Before interacting with any page, ALWAYS capture an accessibility snapshot:

```
Use browser_snapshot to understand the page structure.
This returns element references (ref) needed for interactions.
```

The snapshot provides:

  • Element references (ref) for targeting
  • Element types (button, textbox, link, etc.)
  • Accessible names and descriptions
  • Page structure hierarchy
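
To illustrate how refs can be read out of a snapshot, here is a minimal sketch. It assumes the snapshot is a YAML-like outline in which each line carries a role, an optional quoted accessible name, and a `[ref=...]` marker; the exact format depends on the Playwright MCP version, so check a real snapshot first. `parseSnapshotRefs` and the sample text are hypothetical.

```javascript
// Hypothetical helper: collect { role, name, ref } entries from snapshot text.
// Assumes lines shaped like:  - button "Submit" [ref=e3]
function parseSnapshotRefs(snapshotText) {
  const linePattern = /^\s*-\s+([\w-]+)(?:\s+"([^"]*)")?.*\[ref=([^\]]+)\]/;
  const entries = [];
  for (const line of snapshotText.split('\n')) {
    const match = linePattern.exec(line);
    if (match) {
      entries.push({ role: match[1], name: match[2] ?? null, ref: match[3] });
    }
  }
  return entries;
}

// Sample input, made up for illustration.
const sampleSnapshot = [
  '- generic [ref=e1]:',
  '  - textbox "Search" [ref=e2]',
  '  - button "Submit" [ref=e3]',
].join('\n');

// Look up the ref to pass to browser_click for the Submit button.
const submitEntry = parseSnapshotRefs(sampleSnapshot).find(
  (entry) => entry.role === 'button' && entry.name === 'Submit'
);
console.log(submitEntry.ref); // prints "e3"
```

Whatever string the snapshot reports as the ref is what the `ref` parameter expects; the ref values shown elsewhere in this document are placeholders.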

Step 2: Navigation

Navigate to URLs and manage browser state:

```
browser_navigate: url="https://example.com"
browser_navigate_back: (no parameters)
browser_tabs: action="list" | "new" | "close" | "select", index=N  (index applies to close/select)
```
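
On the wire, each of these tool invocations is an MCP `tools/call` request. The sketch below builds that JSON-RPC payload for the navigation calls above. `buildToolCall` is a hypothetical helper; in practice the MCP client constructs these messages for you.

```javascript
// Hypothetical helper: wrap a tool name and its arguments in an MCP
// "tools/call" JSON-RPC 2.0 request object.
let nextRequestId = 1;
function buildToolCall(name, args = {}) {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

const navigateRequest = buildToolCall('browser_navigate', { url: 'https://example.com' });
const selectTabRequest = buildToolCall('browser_tabs', { action: 'select', index: 1 });

console.log(JSON.stringify(navigateRequest, null, 2));
```

Seeing the payload makes it clear that the parameter names in this document map directly onto keys of the `arguments` object.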

Step 3: Element Interaction

Tools that act on a specific element take two key parameters:

  • element: Human-readable description (used for logging and permission prompts)
  • ref: Exact element reference copied from the snapshot

Clicking:

```
browser_click:
  element="Submit button"
  ref="button[Submit]"
  button="left" | "right" | "middle"  (optional)
  doubleClick=true/false (optional)
```

Typing:

```
browser_type:
  element="Search input"
  ref="textbox[Search]"
  text="search query"
  submit=true  (optional - press Enter after)
  slowly=true  (optional - type character by character)
```

Form Filling (Multiple Fields):

```
browser_fill_form:
  fields=[
    {"name": "Username", "type": "textbox", "ref": "textbox[Username]", "value": "john"},
    {"name": "Password", "type": "textbox", "ref": "textbox[Password]", "value": "secret"},
    {"name": "Remember", "type": "checkbox", "ref": "checkbox[Remember]", "value": "true"}
  ]
```

Step 4: Waiting and Synchronization

Wait for page state before proceeding:

```
browser_wait_for:
  text="Welcome"           # Wait for text to appear
  textGone="Loading..."    # Wait for text to disappear
  time=5                   # Wait N seconds
```
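
When finer control is needed, similar waits can be written as a Playwright snippet for browser_run_code. This is a sketch rather than a description of how browser_wait_for works internally. It assumes the snippet receives a Playwright `page` object, as in the browser_run_code example in this document, and it reuses the placeholder texts from above. `waitUntilReady` is a hypothetical name.

```javascript
// Sketch of a browser_run_code snippet: wait for "Welcome" to appear and
// for "Loading..." to disappear, then report the page title.
const waitUntilReady = async (page) => {
  // Resolves once an element containing "Welcome" is visible.
  await page.getByText('Welcome').waitFor({ state: 'visible' });
  // Resolves once "Loading..." is hidden or removed from the DOM.
  await page.getByText('Loading...').waitFor({ state: 'hidden' });
  return page.title();
};
```

`getByText` matches substrings by default, so a text that appears in more than one element may need a narrower locator.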

Step 5: Data Extraction

Accessibility Snapshot (Preferred):

```
browser_snapshot:
  filename="page-snapshot.md"  (optional - save to file)
```

Screenshot:

```
browser_take_screenshot:
  type="png" | "jpeg"
  fullPage=true/false
  element="Header section"  (optional - screenshot specific element)
  ref="header[Main]"        (optional - with element)
  filename="screenshot.png" (optional)
```

Console Messages:

```
browser_console_messages:
  level="error" | "warning" | "info" | "debug"
```

Network Requests:

```
browser_network_requests:
  includeStatic=true/false
```

Common Workflows

Web Scraping Workflow

  1. Navigate to target URL
  2. Take accessibility snapshot to understand structure
  3. Extract data from snapshot or use evaluate for complex extraction
  4. Handle pagination if needed
  5. Close browser when done

```
Example sequence:
1. browser_navigate(url="https://example.com/products")
2. browser_snapshot()
3. browser_evaluate(function="() => { return [...document.querySelectorAll('.product')].map(p => p.textContent) }")
4. browser_click(element="Next page", ref="link[Next]")
5. Repeat 2-4 until there is no next page
6. browser_close()
```
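
For the extraction step, returning structured records tends to be easier to work with than raw text. Below is a minimal sketch of a function that could be passed (as a string) to browser_evaluate. The `.product` selector comes from the example above, while `.name` and `.price` are hypothetical class names to be replaced with whatever the real page uses.

```javascript
// Sketch of a browser_evaluate function: one record per .product element.
// .name and .price are hypothetical selectors; missing parts become null.
const extractProducts = () =>
  [...document.querySelectorAll('.product')].map((el) => ({
    name: el.querySelector('.name')?.textContent.trim() ?? null,
    price: el.querySelector('.price')?.textContent.trim() ?? null,
  }));
```

Because the function returns plain objects with string or null values, the result serializes cleanly back to the caller.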

Form Automation Workflow

  1. Navigate to form page
  2. Take snapshot to identify form fields
  3. Fill all fields using browser_fill_form
  4. Handle any file uploads
  5. Submit and verify success

```
Example sequence:
1. browser_navigate(url="https://example.com/signup")
2. browser_snapshot()
3. browser_fill_form(fields=[...])
4. browser_file_upload(paths=["C:/docs/resume.pdf"])
5. browser_click(element="Submit", ref="button[Submit]")
6. browser_wait_for(text="Success")
```

Testing Workflow

  1. Navigate to application
  2. Perform actions
  3. Verify expected state
  4. Capture evidence (screenshots, snapshots)

```
Example sequence:
1. browser_navigate(url="https://app.example.com")
2. browser_type(element="Username", ref="textbox[Username]", text="testuser")
3. browser_type(element="Password", ref="textbox[Password]", text="password", submit=true)
4. browser_wait_for(text="Dashboard")
5. browser_take_screenshot(filename="login-success.png")
```

Best Practices

1. Always Snapshot First

Never try to interact with elements without first taking a snapshot. The snapshot provides the exact ref values needed.

2. Use Descriptive Element Names

The element parameter should clearly describe what you're interacting with for better logging and debugging.

3. Wait Appropriately

After navigation or actions that trigger page changes, use browser_wait_for to ensure the page is ready.

4. Prefer Accessibility Snapshots Over Screenshots

Snapshots are:

  • Faster and lighter
  • More accurate for element identification
  • Better for LLM processing
  • Usable without vision capabilities

5. Handle Errors Gracefully

  • Respond to alert, confirm, and prompt dialogs with browser_handle_dialog
  • Verify page state after actions
  • Use console messages for debugging

6. Use Semantic Selectors

Reference elements by their accessible names rather than fragile CSS selectors when possible.


Tool Parameter Reference

browser_click

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| element | string | Yes | Human-readable element description |
| ref | string | Yes | Exact element reference from snapshot |
| button | string | No | "left", "right", or "middle" |
| doubleClick | boolean | No | Perform double-click |
| modifiers | array | No | ["Alt", "Control", "Meta", "Shift"] |

browser_type

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| element | string | Yes | Human-readable element description |
| ref | string | Yes | Exact element reference from snapshot |
| text | string | Yes | Text to type |
| submit | boolean | No | Press Enter after typing |
| slowly | boolean | No | Type one character at a time |

browser_fill_form

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| fields | array | Yes | Array of field objects |

Field object structure:

```json
{
  "name": "Field name",
  "type": "textbox|checkbox|radio|combobox|slider",
  "ref": "element reference",
  "value": "value to set"
}
```
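
A quick pre-flight check of the field objects can catch mistakes before the tool call is made. The sketch below is one way to do it. `checkFields` is a hypothetical helper, and the rule that `value` must be a string is an assumption drawn from the examples in this document (which quote even the checkbox value), not a confirmed requirement of the tool.

```javascript
// Field types listed in the structure above.
const FIELD_TYPES = ['textbox', 'checkbox', 'radio', 'combobox', 'slider'];

// Hypothetical pre-flight check: returns a list of problems, empty if none.
function checkFields(fields) {
  const problems = [];
  fields.forEach((field, i) => {
    for (const key of ['name', 'type', 'ref']) {
      if (typeof field[key] !== 'string' || field[key] === '') {
        problems.push(`fields[${i}].${key} must be a non-empty string`);
      }
    }
    // Assumption: values are passed as strings, e.g. "true" for a checkbox.
    if (typeof field.value !== 'string') {
      problems.push(`fields[${i}].value must be a string`);
    }
    if (field.type && !FIELD_TYPES.includes(field.type)) {
      problems.push(`fields[${i}].type "${field.type}" is not one of ${FIELD_TYPES.join(', ')}`);
    }
  });
  return problems;
}

const loginFields = [
  { name: 'Username', type: 'textbox', ref: 'textbox[Username]', value: 'john' },
  { name: 'Remember', type: 'toggle', ref: 'checkbox[Remember]', value: true },
];
console.log(checkFields(loginFields));
```

Here the first field passes and the second is reported twice: once for the boolean value and once for the unknown type.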

browser_navigate

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL to navigate to |

browser_tabs

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| action | string | Yes | "list", "new", "close", or "select" |
| index | number | No | Tab index for close/select |

browser_wait_for

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | No | Text to wait for |
| textGone | string | No | Text to wait to disappear |
| time | number | No | Seconds to wait |

browser_take_screenshot

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | No | "png" or "jpeg" (default: png) |
| fullPage | boolean | No | Capture full scrollable page |
| element | string | No | Element description to screenshot |
| ref | string | No | Element reference (with element) |
| filename | string | No | Save filename |

browser_evaluate

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| function | string | Yes | JavaScript function to execute |
| element | string | No | Element description (if targeting element) |
| ref | string | No | Element reference (if targeting element) |

browser_run_code

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| code | string | Yes | Playwright code snippet |

Example:

```javascript
async (page) => {
  // Click the button whose accessible name is "Submit", then return the page title.
  await page.getByRole('button', { name: 'Submit' }).click();
  return await page.title();
}
```

When to Use This Skill

  • Automating web browser interactions
  • Scraping data from websites
  • Filling out web forms automatically
  • Taking screenshots of web pages
  • Testing web applications
  • Extracting content from dynamic pages
  • Automating repetitive browser tasks
  • Generating PDFs from web pages
  • Debugging web applications

Keywords

playwright, browser automation, web scraping, form filling, screenshot, pdf, testing, selenium alternative, puppeteer alternative, web testing, accessibility snapshot, headless browser, browser control, click, type, navigate, web interaction, dom, element interaction