Pinchtab
Fast, lightweight browser control for AI agents via HTTP + accessibility tree.
Setup
Ensure Pinchtab is running:
# Headless (default for automation) BRIDGE_HEADLESS=true pinchtab & # With UI (debugging) pinchtab &
Default port: 9867. Override with BRIDGE_PORT=9868.
Auth: set BRIDGE_TOKEN=<secret> and pass Authorization: Bearer <secret>.
Base URL for all examples: http://localhost:9867
Core Workflow
The typical agent loop:
- •Navigate to a URL
- •Snapshot the accessibility tree (get refs)
- •Act on refs (click, type, press)
- •Snapshot again to see results
Refs (e.g. e0, e5, e12) are cached per tab after each snapshot — no need to re-snapshot before every action unless the page changed significantly.
API Reference
Navigate
curl -X POST http://localhost:9867/navigate \
-H 'Content-Type: application/json' \
-d '{"url": "https://example.com"}'
# With options: custom timeout, block images, open in new tab
curl -X POST http://localhost:9867/navigate \
-H 'Content-Type: application/json' \
-d '{"url": "https://example.com", "timeout": 60, "blockImages": true, "newTab": true}'
Snapshot (accessibility tree)
# Full tree curl http://localhost:9867/snapshot # Interactive elements only (buttons, links, inputs) — much smaller curl "http://localhost:9867/snapshot?filter=interactive" # Limit depth curl "http://localhost:9867/snapshot?depth=5" # Smart diff — only changes since last snapshot (massive token savings) curl "http://localhost:9867/snapshot?diff=true" # Text format — indented tree, ~40-60% fewer tokens than JSON curl "http://localhost:9867/snapshot?format=text" # Compact format — one-line-per-node, 56-64% fewer tokens than JSON (recommended) curl "http://localhost:9867/snapshot?format=compact" # YAML format curl "http://localhost:9867/snapshot?format=yaml" # Scope to CSS selector (e.g. main content only) curl "http://localhost:9867/snapshot?selector=main" # Truncate to ~N tokens curl "http://localhost:9867/snapshot?maxTokens=2000" # Combine for maximum efficiency curl "http://localhost:9867/snapshot?format=compact&selector=main&maxTokens=2000&filter=interactive" # Disable animations before capture curl "http://localhost:9867/snapshot?noAnimations=true" # Write to file curl "http://localhost:9867/snapshot?output=file&path=/tmp/snapshot.json"
Returns flat JSON array of nodes with ref, role, name, depth, value, nodeId.
Token optimization: Use ?format=compact for best token efficiency. Add ?filter=interactive for action-oriented tasks (~75% fewer nodes). Use ?selector=main to scope to relevant content. Use ?maxTokens=2000 to cap output. Use ?diff=true on multi-step workflows to see only changes. Combine all params freely.
Act on elements
# Click by ref
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "click", "ref": "e5"}'
# Type into focused element (click first, then type)
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "click", "ref": "e12"}'
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "type", "ref": "e12", "text": "hello world"}'
# Press a key
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "press", "key": "Enter"}'
# Focus an element
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "focus", "ref": "e3"}'
# Fill (set value directly, no keystrokes)
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "fill", "selector": "#email", "text": "user@example.com"}'
# Hover (trigger dropdowns/tooltips)
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "hover", "ref": "e8"}'
# Select dropdown option (by value or visible text)
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "select", "ref": "e10", "value": "option2"}'
# Scroll to element
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "scroll", "ref": "e20"}'
# Scroll by pixels (infinite scroll pages)
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "scroll", "scrollY": 800}'
# Click and wait for navigation (link clicks)
curl -X POST http://localhost:9867/action \
-H 'Content-Type: application/json' \
-d '{"kind": "click", "ref": "e5", "waitNav": true}'
Extract text
# Readability mode (default) — strips nav/footer/ads, keeps article/main content curl http://localhost:9867/text # Raw innerText (old behavior) curl "http://localhost:9867/text?mode=raw"
Returns {url, title, text}. Cheapest option (~1K tokens for most pages).
Screenshot
# Raw JPEG bytes curl "http://localhost:9867/screenshot?raw=true" -o screenshot.jpg # With quality setting (default 80) curl "http://localhost:9867/screenshot?raw=true&quality=50" -o screenshot.jpg
Evaluate JavaScript
curl -X POST http://localhost:9867/evaluate \
-H 'Content-Type: application/json' \
-d '{"expression": "document.title"}'
Tab management
# List tabs
curl http://localhost:9867/tabs
# Open new tab
curl -X POST http://localhost:9867/tab \
-H 'Content-Type: application/json' \
-d '{"action": "new", "url": "https://example.com"}'
# Close tab
curl -X POST http://localhost:9867/tab \
-H 'Content-Type: application/json' \
-d '{"action": "close", "tabId": "TARGET_ID"}'
Multi-tab: pass ?tabId=TARGET_ID to snapshot/screenshot/text, or "tabId" in POST body.
Tab locking (multi-agent)
# Lock a tab (default 30s timeout, max 5min)
curl -X POST http://localhost:9867/tab/lock \
-H 'Content-Type: application/json' \
-d '{"tabId": "TARGET_ID", "owner": "agent-1", "timeoutSec": 60}'
# Unlock
curl -X POST http://localhost:9867/tab/unlock \
-H 'Content-Type: application/json' \
-d '{"tabId": "TARGET_ID", "owner": "agent-1"}'
Locked tabs show owner and lockedUntil in /tabs. Returns 409 on conflict.
Batch actions
# Execute multiple actions in sequence
curl -X POST http://localhost:9867/actions \
-H 'Content-Type: application/json' \
-d '[{"kind":"click","ref":"e3"},{"kind":"type","ref":"e3","text":"hello"},{"kind":"press","key":"Enter"}]'
Cookies
# Get cookies for current page
curl http://localhost:9867/cookies
# Set cookies
curl -X POST http://localhost:9867/cookies \
-H 'Content-Type: application/json' \
-d '{"url":"https://example.com","cookies":[{"name":"session","value":"abc123"}]}'
Stealth
# Check stealth status and score
curl http://localhost:9867/stealth/status
# Rotate browser fingerprint
curl -X POST http://localhost:9867/fingerprint/rotate \
-H 'Content-Type: application/json' \
-d '{"os":"windows"}'
# os: "windows", "mac", or omit for random
Health check
curl http://localhost:9867/health
Token Cost Guide
| Method | Typical tokens | When to use |
|---|---|---|
/text | ~800 | Reading page content |
/snapshot?filter=interactive | ~3,600 | Finding buttons/links to click |
/snapshot?diff=true | varies | Multi-step workflows (only changes) |
/snapshot?format=compact | ~56-64% less | One-line-per-node, best token efficiency |
/snapshot?format=text | ~40-60% less | Indented tree, cheaper than JSON |
/snapshot | ~10,500 | Full page understanding |
/screenshot | ~2K (vision) | Visual verification |
Strategy: Start with /snapshot?filter=interactive. Use ?diff=true on subsequent snapshots in multi-step tasks. Use /text when you only need the readable content. Use ?format=text to cut token costs further. Use full /snapshot only for complete page understanding.
Environment Variables
| Var | Default | Description |
|---|---|---|
BRIDGE_PORT | 9867 | HTTP port |
BRIDGE_HEADLESS | false | Run Chrome headless |
BRIDGE_TOKEN | (none) | Bearer auth token |
BRIDGE_PROFILE | ~/.pinchtab/chrome-profile | Chrome profile dir |
BRIDGE_STATE_DIR | ~/.pinchtab | State/session storage |
BRIDGE_NO_RESTORE | false | Skip tab restore on startup |
BRIDGE_STEALTH | light | Stealth level: light or full |
BRIDGE_BLOCK_IMAGES | false | Block image loading |
BRIDGE_BLOCK_MEDIA | false | Block all media (images + fonts + CSS + video) |
BRIDGE_NO_ANIMATIONS | false | Disable CSS animations/transitions |
CHROME_BINARY | (auto) | Path to Chrome/Chromium binary |
CHROME_FLAGS | (none) | Extra Chrome flags (space-separated) |
BRIDGE_CONFIG | ~/.pinchtab/config.json | Path to config JSON file |
BRIDGE_TIMEOUT | 15 | Action timeout (seconds) |
BRIDGE_NAV_TIMEOUT | 30 | Navigation timeout (seconds) |
CDP_URL | (none) | Connect to existing Chrome DevTools |
Tips
- •Always pass
tabIdexplicitly when working with multiple tabs — active tab tracking can be unreliable - •Refs are stable between snapshot and actions — no need to re-snapshot before clicking
- •After navigation or major page changes, take a new snapshot to get fresh refs
- •Use
filter=interactiveby default, fall back to full snapshot when needed - •Pinchtab persists sessions — tabs survive restarts (disable with
BRIDGE_NO_RESTORE=true) - •Chrome profile is persistent — cookies/logins carry over between runs
- •Chrome uses its native User-Agent by default —
BRIDGE_CHROME_VERSIONonly affects fingerprint rotation - •Use
BRIDGE_BLOCK_IMAGES=trueor"blockImages": trueon navigate for read-heavy tasks — reduces bandwidth and memory