agent-browser

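Browser automation via the agent-browser CLI. Supports screenshots, web scraping, accessibility audits, form automation, and live viewport streaming.

Use when:
- Taking screenshots or extracting page content
- Automating web forms and UI interactions
- Running accessibility audits
- Web scraping for AI agents
- Testing web applications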

SKILL.md
--- frontmatter
name: agent-browser
description: |
  Browser automation via agent-browser CLI. Screenshots, scraping, accessibility audits,
  form automation, and live viewport streaming.

  Use when:
  - Taking screenshots or extracting page content
  - Automating web forms and UI interactions
  - Accessibility audits
  - Web scraping for AI agents
  - Testing web applications

Agent Browser

Use the agent-browser CLI for headless browser automation.

Prerequisites

```bash
# Check if installed
which agent-browser

# Install via npm
npm install -g agent-browser

# Or use npx (no install)
npx agent-browser --help
```
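
The check-then-fall-back logic above can be folded into one small helper. This is a minimal sketch assuming a POSIX-style shell; the function name `resolve_agent_browser` is hypothetical and not part of the CLI.

```shell
# Hypothetical helper: print the command to use for agent-browser.
# Prefers a globally installed binary and falls back to npx.
resolve_agent_browser() {
  if command -v agent-browser >/dev/null 2>&1; then
    echo "agent-browser"
  elif command -v npx >/dev/null 2>&1; then
    echo "npx agent-browser"
  else
    echo "agent-browser not found; install with: npm install -g agent-browser" >&2
    return 1
  fi
}
```

A caller could then run `AB=$(resolve_agent_browser) && $AB --help`, leaving `$AB` unquoted so the `npx agent-browser` form splits into two words.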

Core Commands

```bash
# Open a URL
agent-browser open https://example.com

# Take screenshot
agent-browser screenshot --output screenshot.png

# Get page snapshot (accessibility tree + content)
agent-browser snapshot

# Click element
agent-browser click "button.submit"

# Fill form field
agent-browser fill "input[name=email]" "user@example.com"

# Extract text content
agent-browser text "main"

# Get page HTML
agent-browser html
```

Common Workflows

Screenshot a Page

```bash
agent-browser open https://example.com
agent-browser screenshot --output page.png
```

Fill and Submit Form

```bash
agent-browser open https://example.com/login
agent-browser fill "#email" "user@example.com"
agent-browser fill "#password" "secret"
agent-browser click "button[type=submit]"
agent-browser wait 2000
agent-browser screenshot --output result.png
```
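
When the steps above run from a script, it can help to stop at the first failing command and to pass credentials as arguments instead of hard-coding them. The sketch below assumes the commands behave as listed in this document; `login_and_capture` and the `AGENT_BROWSER` override variable are hypothetical names introduced for illustration.

```shell
# Hypothetical wrapper around the login steps above.
# AGENT_BROWSER may be overridden, e.g. AGENT_BROWSER=echo for a dry run.
login_and_capture() {
  ab="${AGENT_BROWSER:-agent-browser}"
  url="$1"; email="$2"; password="$3"; out="${4:-result.png}"

  # Chain with && so a failed step (for example a selector that does not
  # match) aborts the run instead of producing a misleading screenshot.
  $ab open "$url" &&
  $ab fill "#email" "$email" &&
  $ab fill "#password" "$password" &&
  $ab click "button[type=submit]" &&
  $ab wait 2000 &&
  $ab screenshot --output "$out"
}
```

For example, `AGENT_BROWSER=echo login_and_capture https://example.com/login user@example.com secret` prints the six commands without launching a browser.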

Extract Page Content

```bash
agent-browser open https://example.com
agent-browser snapshot  # Returns accessibility tree + text
```
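
To keep the snapshot for later processing, its output can be redirected to a file. This is a rough sketch that assumes `snapshot` writes its result to standard output, which this document implies but does not state; `save_snapshot` and `AGENT_BROWSER` are hypothetical names.

```shell
# Hypothetical helper: open a URL and save the snapshot output to a file.
# AGENT_BROWSER may be overridden, e.g. with a stub command for testing.
save_snapshot() {
  ab="${AGENT_BROWSER:-agent-browser}"
  url="$1"; out="${2:-snapshot.txt}"

  # Only write the file if the page opened successfully.
  $ab open "$url" >/dev/null && $ab snapshot > "$out"
}
```

Usage: `save_snapshot https://example.com page-snapshot.txt`.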

Accessibility Audit

```bash
agent-browser open https://example.com
agent-browser a11y  # Returns accessibility violations
```

Live Preview

Stream the browser viewport via WebSocket:

```bash
agent-browser open https://example.com --stream
# Opens ws://localhost:9222 for live viewport
```

Quick Reference

| Command | Purpose |
|---|---|
| `open URL` | Navigate to URL |
| `screenshot` | Capture viewport |
| `snapshot` | Get accessibility tree + content |
| `click SELECTOR` | Click element |
| `fill SELECTOR VALUE` | Fill input field |
| `text SELECTOR` | Extract text content |
| `html` | Get page HTML |
| `a11y` | Accessibility audit |
| `wait MS` | Wait for milliseconds |

For full reference: https://github.com/vercel-labs/agent-browser