AgentSkillsCN

Browser

浏览器

SKILL.md

Browser Automation Skill

This skill handles browser automation tasks using subagent architecture to keep the main context window clean.

Core Principle

NEVER call Claude-in-Chrome MCP tools directly in main context.

All browser automation is delegated to the browser-automation Task subagent which returns concise summaries.

Detection Triggers

Activate this skill when detecting:

Web Testing

  • "test the website", "check the page", "verify the UI"
  • "screenshot", "capture the screen", "take a picture"
  • "does the site work", "is the page loading"

Form Automation

  • "fill out the form", "submit this form", "enter data"
  • "sign up", "register", "create account" (with explicit user permission)
  • "login", "authenticate" (without entering passwords directly)

Data Extraction

  • "scrape this page", "extract data from", "get the text"
  • "what's on the page", "read the content", "parse the data"
  • "find the price", "get the title", "extract the links"

Workflow Automation

  • "automate this workflow", "record these steps", "create a GIF"
  • "click through", "navigate to", "go to page"
  • "test the flow", "verify the process"

Debugging & Monitoring

  • "check console logs", "monitor network requests", "debug the page"
  • "why is this failing", "what's the error", "inspect the page"

Subagent Delegation Pattern

When a browser automation task is detected:

code
Task("Browser: {brief_description}

**Task:** {user_request}
**URL:** {target_url if provided}
**Actions:** {list of expected actions}

Using Claude-in-Chrome MCP tools:
1. Start with tabs_context_mcp to get current tab state
2. Create new tab or use existing based on context
3. Navigate, interact, extract data as needed
4. Capture screenshots/GIFs for multi-step workflows
5. Monitor console/network if debugging

Return ONLY a concise summary:
- Actions taken (e.g., "Navigated to X, filled form, clicked submit")
- Key findings (e.g., "Form submitted successfully", "Error: XYZ")
- Screenshots/visual evidence if captured
- Any errors or unexpected behavior

Do NOT include:
- Raw HTML or page source
- Verbose console logs (use pattern filtering)
- Detailed network request payloads
- Implementation details

subagent_type: browser-automation
")

Examples

Example 1: Test a Website

User: "Test if the login page at example.com works"

Delegation:

code
Task("Browser: Test login page functionality

**Task:** Verify login page at example.com loads and form is interactive
**URL:** https://example.com/login
**Actions:**
- Navigate to URL
- Screenshot the page
- Verify form elements exist (username, password, submit)
- Check console for errors

subagent_type: browser-automation
")

Example 2: Extract Data

User: "Get all product prices from shop.example.com/products"

Delegation:

code
Task("Browser: Extract product prices

**Task:** Scrape product prices from shop.example.com/products
**URL:** https://shop.example.com/products
**Actions:**
- Navigate to products page
- Find all price elements
- Extract text content
- Return structured list

subagent_type: browser-automation
")

Example 3: Workflow Automation

User: "Record a GIF of searching for 'laptops' on the site"

Delegation:

code
Task("Browser: Record search workflow

**Task:** Create GIF recording of search process
**URL:** {provided or current}
**Actions:**
- Start GIF recording
- Navigate to search
- Enter 'laptops' in search box
- Submit search
- Wait for results
- Stop recording and export GIF

subagent_type: browser-automation
")

Security Reminders

The browser-automation agent is configured to:

  • Never enter sensitive financial data
  • Never enter passwords directly
  • Request permission for purchases/downloads
  • Avoid triggering browser dialogs

Response Format

After subagent returns, summarize for user:

code
{Brief action summary}

{Key findings or results}

{Screenshot/GIF if captured}

{Next steps or recommendations if applicable}

Keep it concise and actionable.