Browser Automation with Chrome DevTools Protocol
Control Chrome browser programmatically using simple command-line scripts. All scripts auto-start Chrome in headless mode if not running.
Global Options (All Commands)
All commands support these options for session persistence and interactive use:
- •
--no-headless- Run Chrome with a visible window (required for interactive sites, login, etc.) - •
--user-data=PATH- Use a specific Chrome profile directory to preserve cookies, logins, and session state between runs
Example: Persistent session for logged-in sites
# First time: login interactively with visible browser
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://instacart.com --no-headless --user-data=~/.chrome-instacart
# After logging in manually, future commands use same profile (cookies preserved)
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://instacart.com/store/market-32 --user-data=~/.chrome-instacart
${CLAUDE_PLUGIN_ROOT}/bin/screenshot https://instacart.com/account --user-data=~/.chrome-instacart /tmp/account.png
Available Commands
All scripts are in ${CLAUDE_PLUGIN_ROOT}/bin/:
screenshot - Capture web pages as images
${CLAUDE_PLUGIN_ROOT}/bin/screenshot URL [OUTPUT] [OPTIONS]
Options:
- •
--full-page- Capture entire scrollable page - •
--selector=CSS- Capture specific element - •
--format=png|jpeg|webp- Output format (default: png) - •
--quality=N- JPEG/WebP quality 0-100 - •
--width=N --height=N- Set viewport size - •
--max-dimension=N- Max output dimension (default: 8000, auto-scales large pages)
Examples:
# Basic screenshot
${CLAUDE_PLUGIN_ROOT}/bin/screenshot https://example.com /tmp/page.png
# Full page as JPEG
${CLAUDE_PLUGIN_ROOT}/bin/screenshot https://example.com /tmp/full.jpg --full-page --format=jpeg
# Capture specific element
${CLAUDE_PLUGIN_ROOT}/bin/screenshot https://example.com /tmp/header.png --selector="header"
pdf - Generate PDFs from web pages
${CLAUDE_PLUGIN_ROOT}/bin/pdf URL [OUTPUT] [OPTIONS]
Options:
- •
--paper=letter|a4|legal|a3|a5|tabloid- Paper size (default: letter) - •
--landscape- Landscape orientation - •
--margin=INCHES- All margins (default: 0.4) - •
--scale=FACTOR- Scale 0.1-2.0 (default: 1.0) - •
--no-background- Skip background colors/images
Examples:
# Basic PDF
${CLAUDE_PLUGIN_ROOT}/bin/pdf https://example.com /tmp/doc.pdf
# A4 landscape
${CLAUDE_PLUGIN_ROOT}/bin/pdf https://example.com /tmp/report.pdf --paper=a4 --landscape
# Tight margins
${CLAUDE_PLUGIN_ROOT}/bin/pdf https://example.com /tmp/compact.pdf --margin=0.25
extract - Extract content from web pages
${CLAUDE_PLUGIN_ROOT}/bin/extract URL [OPTIONS]
Options:
- •
--format=markdown|text|html- Output format (default: markdown) - •
--selector=CSS- Extract specific element only - •
--links- Also list all links - •
--images- Also list all images - •
--metadata- Also show page metadata
Examples:
# Get page as markdown
${CLAUDE_PLUGIN_ROOT}/bin/extract https://example.com
# Get plain text from article
${CLAUDE_PLUGIN_ROOT}/bin/extract https://example.com --format=text --selector="article"
# Get all links and metadata
${CLAUDE_PLUGIN_ROOT}/bin/extract https://example.com --links --metadata
navigate - Navigate and interact with pages
${CLAUDE_PLUGIN_ROOT}/bin/navigate URL [OPTIONS]
Options:
- •
--wait-for=SELECTOR- Wait for element to appear - •
--click=SELECTOR- Click an element - •
--type=SELECTOR=TEXT- Type text into input field - •
--eval=JAVASCRIPT- Execute JavaScript and print result - •
--timeout=SECONDS- Timeout (default: 30)
Examples:
# Navigate and wait for content
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://example.com --wait-for="#content"
# Fill form and submit
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://google.com --type="input[name=q]=hello" --click="input[type=submit]"
# Get page title via JavaScript
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://example.com --eval="document.title"
form - Fill out and submit web forms
${CLAUDE_PLUGIN_ROOT}/bin/form URL [OPTIONS]
Options:
- •
--fill=SELECTOR=VALUE- Fill input field (can repeat) - •
--select=SELECTOR=VALUE- Select dropdown option (can repeat) - •
--fill-json='{"sel":"val"}'- Fill multiple fields from JSON - •
--submit=SELECTOR- Click submit button after filling - •
--wait-for=SELECTOR- Wait for element before filling - •
--wait-after=SELECTOR- Wait for element after submit - •
--screenshot=PATH- Take screenshot after completion
Examples:
# Login form
${CLAUDE_PLUGIN_ROOT}/bin/form https://example.com/login \
--fill='#username=john' \
--fill='#password=secret' \
--submit='button[type=submit]'
# Form with dropdowns
${CLAUDE_PLUGIN_ROOT}/bin/form https://example.com/register \
--fill='#name=John Doe' \
--fill='#email=john@example.com' \
--select='#country=US' \
--submit='#register-btn'
# Using JSON
${CLAUDE_PLUGIN_ROOT}/bin/form https://example.com/contact \
--fill-json='{"#name":"John","#email":"john@test.com"}' \
--submit='button.send'
record - Record screencast frames
${CLAUDE_PLUGIN_ROOT}/bin/record URL OUTPUT_DIR [OPTIONS]
Options:
- •
--duration=SECONDS- Recording duration (default: 5) - •
--count=N- Exact number of frames to capture - •
--format=jpeg|png- Frame format (default: jpeg) - •
--quality=N- JPEG quality 0-100 (default: 80) - •
--max-width=N- Maximum frame width - •
--max-height=N- Maximum frame height - •
--fps=N- Approximate frames per second (default: 10)
Examples:
# Record 5 seconds
${CLAUDE_PLUGIN_ROOT}/bin/record https://example.com /tmp/frames
# Record 30 PNG frames
${CLAUDE_PLUGIN_ROOT}/bin/record https://example.com /tmp/frames --count=30 --format=png
# Convert to video with ffmpeg
ffmpeg -framerate 10 -i /tmp/frames/frame-%04d.jpg -c:v libx264 output.mp4
interact - Interact with current page (no navigation)
${CLAUDE_PLUGIN_ROOT}/bin/interact [OPTIONS]
This is the key command for multi-step workflows. Unlike navigate, this command connects to the already-open browser tab and performs actions WITHOUT reloading the page. Essential for:
- •Multi-step processes (search → click result → add to cart)
- •Single-page applications (SPAs)
- •Sites where navigation resets state
Options:
- •
--click=SELECTOR- Click element matching CSS selector - •
--type=SELECTOR=TEXT- Type text into input field - •
--eval=JAVASCRIPT- Execute JavaScript and print result - •
--wait-for=SELECTOR- Wait for element to appear - •
--select=SELECTOR=VALUE- Select dropdown option - •
--focus=SELECTOR- Focus an element - •
--text=SELECTOR- Get text content of element - •
--timeout=SECONDS- Timeout (default: 30)
Examples:
# Execute JavaScript on current page
${CLAUDE_PLUGIN_ROOT}/bin/interact --eval="document.title" --user-data=~/.chrome-mysite
# Click a button on current page
${CLAUDE_PLUGIN_ROOT}/bin/interact --click="#submit-btn" --user-data=~/.chrome-mysite
# Type into an input without navigating away
${CLAUDE_PLUGIN_ROOT}/bin/interact --type="#search=query" --user-data=~/.chrome-mysite
# Chain actions: focus, type, then click
${CLAUDE_PLUGIN_ROOT}/bin/interact --focus="#search" --type="#search=pork" --click="#search-btn" --user-data=~/.chrome-mysite
# Get text from an element
${CLAUDE_PLUGIN_ROOT}/bin/interact --text="h1.title" --user-data=~/.chrome-mysite
chrome-status - Check browser status
${CLAUDE_PLUGIN_ROOT}/bin/chrome-status
Shows whether Chrome is running, version info, and open tabs.
Common Workflows
Screenshot a page
${CLAUDE_PLUGIN_ROOT}/bin/screenshot https://example.com /tmp/screenshot.png
Convert page to PDF
${CLAUDE_PLUGIN_ROOT}/bin/pdf https://example.com /tmp/document.pdf --paper=a4
Scrape page content
${CLAUDE_PLUGIN_ROOT}/bin/extract https://example.com --format=markdown
Fill and submit a form
${CLAUDE_PLUGIN_ROOT}/bin/form https://example.com/login \
--fill='#username=user' \
--fill='#password=pass' \
--submit='button[type=submit]' \
--wait-after='.dashboard'
Record a screencast
${CLAUDE_PLUGIN_ROOT}/bin/record https://example.com /tmp/frames --duration=10
ffmpeg -framerate 10 -i /tmp/frames/frame-%04d.jpg -c:v libx264 video.mp4
Notes
- •Chrome auto-starts in headless mode when needed (use
--no-headlessfor visible browser) - •Chrome continues running between commands for speed
- •Use
--user-data=PATHto preserve login/cookies between sessions - •Use
pkill -f 'chrome.*--remote-debugging-port'to stop Chrome manually - •Default port is 9222; set
CHROME_DRIVER_PORTto change - •All scripts support
--helpfor full usage info - •Large screenshots are auto-scaled to fit within 8000px (API limit)
Session Management Tips
For sites requiring login (Instacart, Amazon, banking, etc.):
- •Create a persistent profile: Use
--user-data=~/.chrome-sitenameconsistently - •First-time login: Use
--no-headlessto see the browser and log in manually - •Subsequent automation: Same
--user-datapath preserves your session - •Multiple accounts: Use different
--user-datapaths for different accounts/sites
Multi-Step Workflow Guide: Shopping Sites (Instacart, Amazon, etc.)
This section walks through automating complex multi-step processes on sites that require login and multiple interactions. The key insight is using two commands together:
- •
navigate- Go to a URL (use for initial navigation) - •
interact- Work with the current page without reloading (use for everything after)
Understanding the Two-Command Pattern
Why navigate alone isn't enough:
- •
navigateALWAYS loads/reloads the specified URL - •Each
navigatecall is independent - it doesn't "continue" from the previous state - •Multi-step workflows (search → select → add to cart) break because each step resets
Why interact is essential:
- •
interactconnects to the EXISTING browser tab - •It performs actions on whatever page is currently displayed
- •The page state is preserved between
interactcalls - •Perfect for SPAs and multi-step processes
Complete Example: Adding Items to Instacart Cart
Here's a real-world example that works:
Step 1: Initial Setup - Create Profile and Login
First time only - open browser visibly so user can log in:
# Open Instacart with visible browser and persistent profile
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://www.instacart.com \
--no-headless \
--user-data=~/.chrome-instacart
# USER ACTION: Log in manually in the browser window
# Your cookies/session will be saved to ~/.chrome-instacart
Step 2: Navigate to Store
After login, go to the desired store:
# First, find the correct store URL by checking the homepage
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://www.instacart.com/store \
--no-headless \
--user-data=~/.chrome-instacart
# Use JavaScript to find links containing your store name
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--eval="Array.from(document.querySelectorAll('a')).filter(a => a.innerText.includes('Market 32')).map(a => a.href).join('\n')" \
--no-headless \
--user-data=~/.chrome-instacart
# Navigate to the store (use the URL found above)
${CLAUDE_PLUGIN_ROOT}/bin/navigate https://www.instacart.com/store/price-chopper-ny/storefront \
--no-headless \
--user-data=~/.chrome-instacart
Step 3: Search for Product (Multi-Step with interact)
Now use interact to search WITHOUT navigating away:
# Click the search input
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--click="#search-bar-input" \
--no-headless \
--user-data=~/.chrome-instacart
# Type the search query
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--type="#search-bar-input=pork shoulder" \
--no-headless \
--user-data=~/.chrome-instacart
# Submit the search using JavaScript (trigger form submit event)
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--eval="var input = document.querySelector('#search-bar-input'); var form = input.closest('form'); form.dispatchEvent(new Event('submit', {bubbles: true, cancelable: true})); 'submitted'" \
--no-headless \
--user-data=~/.chrome-instacart
Step 4: View Search Results
Check what products are available:
# Wait a moment for results to load, then check the page content
sleep 2
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--eval="document.body.innerText.substring(0, 2000)" \
--no-headless \
--user-data=~/.chrome-instacart
Step 5: Add Product to Cart
Click the Add button for the desired product:
# Find and click the first "Add" button
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--eval="var addBtns = Array.from(document.querySelectorAll('button')).filter(b => b.innerText.trim() === 'Add'); if (addBtns.length > 0) { addBtns[0].click(); 'Added to cart'; } else { 'No Add button found'; }" \
--no-headless \
--user-data=~/.chrome-instacart
Step 6: Verify Cart Updated
Confirm the item was added:
# Check cart count or page state
${CLAUDE_PLUGIN_ROOT}/bin/interact \
--eval="document.body.innerText.includes('2') ? 'Cart has 2 items' : document.body.innerText.substring(0, 500)" \
--no-headless \
--user-data=~/.chrome-instacart
Key Techniques for Multi-Step Automation
1. Finding Elements with JavaScript
When CSS selectors are complex or dynamic, use --eval to find elements:
# Find all buttons with specific text
--eval="Array.from(document.querySelectorAll('button')).filter(b => b.innerText.includes('Add')).length"
# Find links by text content
--eval="Array.from(document.querySelectorAll('a')).filter(a => a.innerText.includes('Checkout'))[0].href"
# Get input field IDs/names
--eval="JSON.stringify(Array.from(document.querySelectorAll('input')).map(i => ({id: i.id, name: i.name, placeholder: i.placeholder})))"
2. Clicking Elements via JavaScript
When --click doesn't work (dynamic elements, overlays, etc.):
# Click by finding element with JS
--eval="document.querySelector('.add-to-cart-btn').click(); 'clicked'"
# Click first matching element
--eval="Array.from(document.querySelectorAll('button')).filter(b => b.innerText === 'Add')[0].click(); 'clicked'"
3. Submitting Forms
Different approaches for form submission:
# Method 1: Dispatch submit event (most reliable for SPAs)
--eval="document.querySelector('form').dispatchEvent(new Event('submit', {bubbles: true, cancelable: true})); 'submitted'"
# Method 2: Click submit button
--click="button[type=submit]"
# Method 3: Call submit() directly
--eval="document.querySelector('form').submit(); 'submitted'"
4. Waiting for Dynamic Content
Use --wait-for or JavaScript polling:
# Wait for element to appear
--wait-for=".search-results"
# Or use JavaScript with timeout
--eval="new Promise(r => { let i = setInterval(() => { if (document.querySelector('.results')) { clearInterval(i); r('found'); }}, 100); setTimeout(() => { clearInterval(i); r('timeout'); }, 5000); })"
5. Reading Page State
Extract information to make decisions:
# Get current URL
--eval="window.location.href"
# Check if logged in
--eval="document.body.innerText.includes('Sign Out') ? 'logged in' : 'not logged in'"
# Get cart count
--eval="document.querySelector('.cart-count')?.innerText || '0'"
# Get product prices
--eval="JSON.stringify(Array.from(document.querySelectorAll('.price')).map(p => p.innerText))"
Troubleshooting Multi-Step Workflows
Problem: Page resets between commands
Solution: Use interact instead of navigate for all steps after initial navigation.
Problem: Elements not found
Solution:
- •Use
--evalto inspect the page structure - •Wait for dynamic content with
--wait-fororsleep - •Check if element is in an iframe
Problem: Click doesn't work
Solution:
- •Try JavaScript click:
--eval="document.querySelector('...').click()" - •Scroll element into view first
- •Check for overlays blocking the element
Problem: Form doesn't submit
Solution:
- •Use form submit event:
form.dispatchEvent(new Event('submit', ...)) - •Check if it's a SPA with custom form handling
- •Look for and click the actual submit button
Problem: Login session lost
Solution:
- •Always use same
--user-datapath - •Check if cookies expired
- •Some sites require re-login periodically
Best Practices
- •Always use
--user-datafor sites requiring login - •Use
--no-headlessduring development to see what's happening - •Start with
navigateto load the initial page - •Switch to
interactfor all subsequent actions on that page - •Use
--evalliberally to inspect page state and find elements - •Add
sleepbetween actions if the site needs time to update - •Check results after each step to verify the action worked