AgentSkillsCN

agent-browser

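Browser automation for AI agents through the `npx agent-browser` CLI. Use it for website testing, UI change verification, console error troubleshooting, screenshot capture, and automated execution of user flows. Supports trigger scenarios such as "browser test", "UI verification", "web automation", "screenshot", and "form test".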

SKILL.md
---
name: agent-browser
description: Browser automation for AI agents using `npx agent-browser` CLI. Use for testing websites, verifying UI changes, checking console errors, taking screenshots, and automating user flows. Triggers on "browser test", "UI verification", "web automation", "screenshot", "form test".
user-invocable: false
context: fork
agent: Explore
allowed-tools:
  - Bash
  - Read
---

Agent Browser Skill

Browser automation CLI optimized for AI agents. All commands use npx agent-browser.

Core Workflow

Always follow: snapshot → identify refs → act → snapshot again

bash
npx agent-browser open http://localhost:4321    # Navigate
npx agent-browser snapshot -i                    # Get interactive elements with refs
npx agent-browser click @e2                      # Act using ref from snapshot
npx agent-browser snapshot -i                    # Get updated state

Important: Use snapshot -i (interactive elements only) by default. It cuts the output substantially by keeping only buttons, links, inputs, and other actionable elements.
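
The loop above can be wrapped in a small shell helper so that every action is followed by a fresh snapshot. This is a minimal sketch, not part of the CLI: `ab` and `act` are hypothetical helper names, and the early return assumes the CLI exits with a non-zero status when an action fails.

```bash
# Hypothetical shorthand for the CLI invocation used throughout this skill.
ab() { npx agent-browser "$@"; }

# act <subcommand> [args...]
# Run one action, then take a new interactive snapshot so the refs are current.
# Assumption: the CLI returns a non-zero exit status when the action fails.
act() {
  ab "$@" || return 1   # stop here if the action itself failed
  ab snapshot -i        # refresh refs, since the page may have changed
}
```

With these helpers loaded, `act click @e2` performs the click and prints the updated snapshot in one step.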

Quick Reference

Navigation

| Command | Description |
| --- | --- |
| `open <url>` | Navigate to URL |
| `back` | Go back |
| `reload` | Reload page |
| `close` | Close browser |

Page Analysis

| Command | Description |
| --- | --- |
| `snapshot -i` | Interactive elements only (preferred) |
| `snapshot` | Full accessibility tree |
| `snapshot -i -c` | Interactive + compact (minimal) |
| `errors` | Check for JS errors |
| `console` | View console logs |

Interaction

| Command | Description |
| --- | --- |
| `click @e1` | Click element |
| `fill @e2 "text"` | Clear and type |
| `type @e2 "text"` | Type without clearing |
| `press Enter` | Press key |
| `hover @e3` | Hover element |
| `scroll down 500` | Scroll page |

Capture

| Command | Description |
| --- | --- |
| `screenshot` | Screenshot viewport |
| `screenshot --full` | Full page |
| `screenshot /path/file.png` | Save to path |
| `pdf /path/file.pdf` | Save as PDF |

Get Information

| Command | Description |
| --- | --- |
| `get text @e1` | Get element text |
| `get url` | Get current URL |
| `get title` | Get page title |
| `get value @e2` | Get input value |

Common Workflows

Test Local Dev Server

bash
npx agent-browser open http://localhost:4321                          # Load the dev server
npx agent-browser snapshot -i                                          # List interactive elements
npx agent-browser errors                                               # Check for JS errors
npx agent-browser screenshot .claude/tmp/screenshots/homepage.png      # Save a screenshot
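
The same check can be bundled into one function that writes each result to a file, so the output can be read back afterwards. This is a sketch under assumptions: `ab` and `smoke_check` are hypothetical names, the file names `snapshot.txt` and `errors.txt` are chosen for illustration, and `mkdir -p` is included because it is not confirmed here that the CLI creates missing directories.

```bash
# Hypothetical shorthand for the CLI invocation used throughout this skill.
ab() { npx agent-browser "$@"; }

# smoke_check <url> [output-dir]
# Open a page, then store the snapshot, the JS error report, and a screenshot.
smoke_check() {
  local url="$1"
  local out_dir="${2:-.claude/tmp/screenshots}"
  mkdir -p "$out_dir" || return 1            # make sure the target directory exists
  ab open "$url" || return 1
  ab snapshot -i > "$out_dir/snapshot.txt"   # interactive elements with refs
  ab errors > "$out_dir/errors.txt"          # raw output of the errors command
  ab screenshot "$out_dir/homepage.png"
}
```

For example, `smoke_check http://localhost:4321` leaves all three files under `.claude/tmp/screenshots/`.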

Test Form Submission

bash
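npx agent-browser snapshot -i                      # Find the refs of the form fields
npx agent-browser fill @e3 "test@example.com"      # Example ref: an email input
npx agent-browser fill @e4 "message"               # Example ref: a message field
npx agent-browser click @e5                        # Example ref: the submit button
npx agent-browser snapshot -i                      # Inspect the result of the submission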
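
Before submitting, it can help to confirm that a field actually holds what was typed. The sketch below combines `fill` with `get value`; `ab` and `fill_and_verify` are hypothetical helper names, and the comparison assumes that `get value` prints only the field's value on standard output.

```bash
# Hypothetical shorthand for the CLI invocation used throughout this skill.
ab() { npx agent-browser "$@"; }

# fill_and_verify <ref> <text>
# Fill a field, read it back, and fail if the stored value differs.
# Assumption: `get value` prints the bare value and nothing else.
fill_and_verify() {
  local ref="$1" expected="$2" actual
  ab fill "$ref" "$expected" || return 1
  actual="$(ab get value "$ref")" || return 1
  if [ "$actual" != "$expected" ]; then
    echo "mismatch for $ref: expected '$expected', got '$actual'" >&2
    return 1
  fi
}
```

A call such as `fill_and_verify @e3 "test@example.com"` returns non-zero when the input rejects or reformats the text, which is a cue to take a new snapshot and investigate.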

Test Navigation Flow

bash
npx agent-browser snapshot -i      # Find the ref of the link to follow
npx agent-browser click @e5        # Example ref: the link
npx agent-browser snapshot -i      # Inspect the destination page
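
A navigation test is easier to trust when the destination is checked explicitly. The sketch below clicks a ref and compares the end of the current URL with an expected path, because `get url` reports the current URL while a snapshot lists link targets such as `/about`. `ab` and `click_expect_path` are hypothetical names, and the check assumes `get url` prints only the URL; on slow pages a `wait` may be needed between the click and the check.

```bash
# Hypothetical shorthand for the CLI invocation used throughout this skill.
ab() { npx agent-browser "$@"; }

# click_expect_path <ref> <path>
# Click an element, then verify that the current URL ends with <path>
# (with or without a trailing slash) before printing a new snapshot.
click_expect_path() {
  local ref="$1" path="$2" url
  ab click "$ref" || return 1
  url="$(ab get url)" || return 1
  case "$url" in
    *"$path"|*"$path"/) ab snapshot -i ;;
    *) echo "expected URL ending in $path, got $url" >&2; return 1 ;;
  esac
}
```

For instance, `click_expect_path @e3 /about` prints the new snapshot only if the click landed on the About page.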

Snapshot Output Format

code
- document:
  - button "Toggle theme" [ref=e1]
  - main:
    - heading "Title" [ref=e2] [level=1]
    - link "About" [ref=e3]:
      - /url: /about
    - textbox "Email" [ref=e4]

In this output:

  • [ref=eN] - Element reference for targeting
  • /url: - Link href
  • [level=N] - Heading level
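
Because refs appear as `[ref=eN]` markers in plain text, they can be pulled out with standard tools. The following sketch assumes the snapshot output looks like the sample above; `list_refs` and `find_ref` are hypothetical helper names, not CLI commands.

```bash
# list_refs: read snapshot text on stdin and print every ref as @eN, one per line.
list_refs() {
  grep -o '\[ref=e[0-9][0-9]*\]' | sed 's/^\[ref=\(e[0-9]*\)\]$/@\1/'
}

# find_ref <role> <name>: read snapshot text on stdin and print the ref of the
# first element with the given role and accessible name, e.g. link "About".
# Prints nothing when there is no match, so callers should check for empty output.
find_ref() {
  grep -F -- "- $1 \"$2\" [ref=" | head -n 1 | sed 's/.*\[ref=\(e[0-9]*\)\].*/@\1/'
}
```

Piping the sample snapshot into `find_ref link About` prints `@e3`, and `list_refs` prints `@e1` through `@e4`. In practice the input would come from `npx agent-browser snapshot -i`.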

Snapshot Options

| Flag | Purpose |
| --- | --- |
| `-i` | Interactive elements only (recommended) |
| `-c` | Compact output |
| `-d 3` | Limit tree depth |
| `-s "#main"` | Scope to CSS selector |

Alternative Selectors

When refs aren't available:

bash
npx agent-browser click "#submit"           # CSS selector
npx agent-browser click "text=Sign In"      # Text content

Sessions (Multiple Browsers)

bash
npx agent-browser --session test1 open site-a.com
npx agent-browser --session test2 open site-b.com
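
When several sessions are open at once, keeping one snapshot file per session avoids mixing up their refs. The sketch below is one way to do that; `ab` and `snapshot_sessions` are hypothetical names, the `name=url` argument format is an illustration rather than CLI syntax, and it assumes `--session` can be combined with `snapshot -i` in the same way it is combined with `open` above.

```bash
# Hypothetical shorthand for the CLI invocation used throughout this skill.
ab() { npx agent-browser "$@"; }

# snapshot_sessions <output-dir> <name=url>...
# Open each URL in its own named session and save that session's snapshot
# to <output-dir>/<name>.txt.
snapshot_sessions() {
  local out_dir="$1" pair name url
  shift
  mkdir -p "$out_dir" || return 1
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first "="
    url="${pair#*=}"     # text after the first "="
    ab --session "$name" open "$url" || return 1
    ab --session "$name" snapshot -i > "$out_dir/$name.txt"
  done
}
```

For example, `snapshot_sessions .claude/tmp test1=site-a.com test2=site-b.com` writes `test1.txt` and `test2.txt`, each holding refs that are valid only within its own session.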

Troubleshooting

| Issue | Solution |
| --- | --- |
| Browser not installed | `npx agent-browser install` |
| Element not found | Run `snapshot` again - refs change on update |
| Page still loading | `npx agent-browser wait 2000` or `wait --load networkidle` |
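
The last two fixes can be combined so that a snapshot is only taken once the page has settled. This is a hedged sketch: `ab` and `open_ready` are hypothetical names, and the fallback to a fixed 2000 ms delay assumes `wait --load networkidle` exits with a non-zero status when it cannot complete.

```bash
# Hypothetical shorthand for the CLI invocation used throughout this skill.
ab() { npx agent-browser "$@"; }

# open_ready <url>
# Open a page, wait for network activity to settle, then snapshot.
# Assumption: a failed or timed-out wait returns a non-zero exit status,
# in which case a fixed delay is used instead.
open_ready() {
  ab open "$1" || return 1
  ab wait --load networkidle || ab wait 2000
  ab snapshot -i
}
```

Using `open_ready http://localhost:4321` in place of a bare `open` reduces the chance of acting on refs from a half-loaded page.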

Notes

  • Save screenshots to .claude/tmp/screenshots/
  • Refs (@e1, @e2) identify elements in the most recent snapshot; they can change when the page updates, so take a new snapshot before reusing them
  • Prefer refs over CSS selectors when possible