browser-use (local) playbook
Default constraints in this environment
- •Prefer browser-use (CLI/Python) over OpenClaw
browsertool here; OpenClawbrowsermay fail if no supported system browser is present. - •Use persistent sessions to do multi-step flows:
--session <name>.
Quick CLI workflow (non-agent)
- •Open
bash
browser-use --session demo open https://example.com
- •Inspect (sometimes
statereturns 0 elements on heavy/JS sites)
bash
browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'
- •Screenshot (always works; best debugging primitive)
bash
browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png
- •HTML for link discovery (works even when
stateis empty)
bash
browser-use --session demo --json get html > /tmp/page_html.json
python3 - <<'PY'
import json,re
html=json.load(open('/tmp/page_html.json')).get('data',{}).get('html','')
urls=set(re.findall(r"https?://[^\s\"'<>]+", html))
for u in sorted([u for u in urls if any(k in u for k in ['demo','login','console','qr','qrcode'])])[:200]:
print(u)
PY
- •Lightweight DOM queries via JS (useful when
stateis empty)
bash
browser-use --session demo --json eval "location.href" browser-use --session demo --json eval "document.title"
Agent workflow with OpenAI-compatible LLM (Moonshot/Kimi)
Use Python for Agent runs when the CLI run path requires Browser-Use cloud keys or when you need strict control over LLM parameters.
Minimal working Kimi example
Create .env (or export env vars) with:
- •
OPENAI_API_KEY=... - •
OPENAI_BASE_URL=https://api.moonshot.cn/v1
Then run the bundled script:
bash
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py
Kimi/Moonshot quirks observed in practice (fixes):
- •
temperaturemust be1forkimi-k2.5. - •
frequency_penaltymust be0forkimi-k2.5. - •Moonshot can reject strict JSON Schema used for structured output. Enable:
- •
remove_defaults_from_schema=True - •
remove_min_items_from_schema=True
- •
If you get a 400 error mentioning response_format.json_schema ... keyword 'default' is not allowed or min_items unsupported, those two flags are the first thing to set.
QR code extraction (login/demo pages)
Preferred order
- •Screenshot the page and crop candidate regions (fast, robust).
- •If HTML contains
data:image/png;base64,..., extract and decode it.
Crop candidates
Use scripts/crop_candidates.py to generate multiple likely QR crops from a screenshot.
bash
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate python skills/browser-use-local/scripts/crop_candidates.py \ --in /home/node/.openclaw/workspace/login.png \ --outdir /home/node/.openclaw/workspace/qr_crops
Extract base64-embedded images from HTML
bash
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate browser-use --session demo --json get html > /tmp/page_html.json python skills/browser-use-local/scripts/extract_data_images.py \ --in /tmp/page_html.json \ --outdir /home/node/.openclaw/workspace/data_imgs
Troubleshooting
- •
stateshowselements: 0: useget html+ regex discovery, plus screenshots; useevalto query DOM. - •Page readiness timeout warnings: usually harmless; rely on screenshot + HTML.
- •CLI flags order: global flags go before the subcommand:
- •✅
browser-use --browser chromium --json open https://... - •❌
browser-use open https://... --browser chromium
- •✅