Agent Browser
Setup
Install the CLI and download Chromium:
bash
npm install -g agent-browser
agent-browser install
Set a custom browser binary if needed:
bash
AGENT_BROWSER_EXECUTABLE_PATH=/path/to/chromium agent-browser open https://example.com
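When the override is set from a script, it can help to confirm that the path points at an executable file before exporting it. The following is a minimal sketch for a POSIX shell; `use_browser_binary` is a hypothetical helper name and is not part of agent-browser.

```shell
# Hypothetical helper: export AGENT_BROWSER_EXECUTABLE_PATH only when the
# given path is an executable file; otherwise report the problem and return 1.
use_browser_binary() {
  if [ -x "$1" ] && [ ! -d "$1" ]; then
    export AGENT_BROWSER_EXECUTABLE_PATH="$1"
  else
    echo "not an executable file: $1" >&2
    return 1
  fi
}
```

For example, `use_browser_binary /path/to/chromium && agent-browser open https://example.com` would only start the browser if the check passes.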
Quick Start
bash
agent-browser open https://example.com
agent-browser snapshot -i --json
agent-browser click @e2
agent-browser fill @e3 "test@example.com"
agent-browser get text @e1
agent-browser screenshot page.png
agent-browser close
Snapshot + Ref Workflow
- Open a page with `agent-browser open <url>`.
- Capture a focused tree with `agent-browser snapshot -i -c -d 5 --json`.
- Choose refs (`@e1`, `@e2`, ...) from the snapshot and act with `click`, `fill`, `press`, or `hover`.
- Re-run `snapshot` after navigation or UI changes.
- Close the session with `agent-browser close`.
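The steps above could be scripted roughly as follows. This is a sketch, not a definitive implementation: `ab` and `click_and_resnapshot` are hypothetical helper names, and `AGENT_BROWSER_BIN` is an illustrative override that the CLI itself does not read (it exists here only so the flow can be dry-run with `echo`). The ref is passed in for brevity; in practice it is chosen after reading the first snapshot.

```shell
# Hypothetical wrapper: runs agent-browser, or whatever AGENT_BROWSER_BIN
# names (for example "echo" for a dry run that only prints the arguments).
ab() {
  "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
}

# Open a page, snapshot it, click one ref, re-snapshot, and close.
# $1 = URL, $2 = ref to click (for example @e2).
click_and_resnapshot() {
  ab open "$1" || return 1
  ab snapshot -i -c -d 5 --json
  ab click "$2"
  # The UI may have changed after the click, so capture a fresh tree.
  ab snapshot -i -c -d 5 --json
  ab close
}
```

Setting `AGENT_BROWSER_BIN=echo` before calling `click_and_resnapshot https://example.com @e2` prints the five commands without launching a browser.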
Common Commands
- Navigate: `open`, `wait --url`, `get url`, `get title`
- Interact: `click`, `dblclick`, `fill`, `type`, `press`, `hover`, `scroll`
- Extract: `snapshot`, `get text`, `get html`, `get value`, `get attr`
- Output: `screenshot --full`, `pdf <path>`
- State: `cookies`, `storage local`, `storage session`
- Settings: `set viewport`, `set headers <json>`, `set device`, `set geo`, `set offline`
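As an illustration of combining a few of these commands, the sketch below opens a page, prints its title and URL, and saves it as a PDF. It uses only `open`, `get title`, `get url`, `pdf <path>`, and `close` from the list above; `ab`, `save_page_pdf`, and the `AGENT_BROWSER_BIN` override are hypothetical names introduced for the example.

```shell
# Hypothetical wrapper: AGENT_BROWSER_BIN is an illustrative override
# (not read by the CLI) that allows a dry run with "echo".
ab() {
  "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
}

# Open a URL, print its title and current URL, save a PDF, then close.
# $1 = URL, $2 = output PDF path.
save_page_pdf() {
  ab open "$1" || return 1
  ab get title
  ab get url
  ab pdf "$2"
  ab close
}
```

A call such as `save_page_pdf https://example.com page.pdf` would run the five commands in order, stopping early only if `open` fails.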
Session + Debug
- Use isolated sessions with `--session <name>` or `AGENT_BROWSER_SESSION`.
- Show a visible window with `--headed` for debugging.
- Attach to an existing browser via CDP with `--cdp 9222`.
- Prefer `--json` for machine-readable output.
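One way to keep several sessions apart in a script is to set `AGENT_BROWSER_SESSION` per command. The helper below is a sketch under that assumption; `in_session` and the `AGENT_BROWSER_BIN` override are hypothetical names, not part of the CLI.

```shell
# Hypothetical helper: run one agent-browser command in a named session.
# $1 = session name; the remaining arguments are passed through unchanged.
# The body runs in a subshell so the "session" variable does not leak.
in_session() (
  session="$1"
  shift
  AGENT_BROWSER_SESSION="$session" "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
)
```

For example, `in_session checkout open https://example.com` and `in_session admin open https://example.com` would each target their own named session.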