Skill: Browser
Use this skill when you need live browser automation during a Factory session—opening sites, clicking through flows, gathering structured data, or capturing screenshots.
Inputs
- Target URL or task description (natural language)
- Optional structured extraction schema (JSON field → type)
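To illustrate the optional schema input, the sketch below checks an extracted record against a field → type mapping. The type names (`string`, `number`, `boolean`) and the helper `matches_schema` are assumptions made for illustration. The skill only states that the schema maps JSON fields to types and does not list which type names it accepts.

```python
import json

# Hypothetical mapping from schema type names to Python types (an assumption;
# the skill does not document the accepted type names).
TYPE_MAP = {"string": str, "number": (int, float), "boolean": bool}


def matches_schema(record: dict, schema: dict) -> bool:
    """Return True if every schema field is present with the declared type."""
    for field, type_name in schema.items():
        expected = TYPE_MAP.get(type_name)
        if expected is None or field not in record:
            return False
        value = record[field]
        # bool is a subclass of int in Python, so reject it for "number".
        if type_name == "number" and isinstance(value, bool):
            return False
        if not isinstance(value, expected):
            return False
    return True


# Example schema in the JSON field -> type shape described above.
schema = json.loads('{"title": "string", "price": "number"}')
print(matches_schema({"title": "Widget", "price": 9.5}, schema))
print(matches_schema({"title": "Widget"}, schema))
```

A check like this could run on the result of an extraction before it is reported back, so that a missing or mistyped field is caught early.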
Behavior
- Ensure Chrome is running via Stagehand with the local profile stored in .chrome-profile.
- Support these commands:
  - navigate <url>: open a page and capture a screenshot.
  - act "<instruction>": perform natural-language actions.
  - extract "<instruction>" '{"field":"type"}': return structured data.
  - observe "<goal>": list suggested steps the agent can take.
  - screenshot: capture the current viewport.
  - close: shut down the session when finished.
- Save screenshots to agent/browser_screenshots and report the file path in the response.
- When tasks finish, summarize what happened plus any follow-up steps for the user.
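The command shapes listed above can be made concrete with a small parser. This is a minimal sketch, not the skill's actual implementation: it assumes commands arrive as shell-style strings and tokenizes them with Python's `shlex`. The `parse_command` helper and the `COMMAND_ARITY` table are hypothetical names introduced here.

```python
import json
import shlex

# Argument counts taken from the command list; the table itself is illustrative.
COMMAND_ARITY = {
    "navigate": 1,
    "act": 1,
    "extract": 2,
    "observe": 1,
    "screenshot": 0,
    "close": 0,
}


def parse_command(line: str) -> dict:
    """Split a command line into its name, arguments, and optional schema."""
    parts = shlex.split(line)
    if not parts:
        raise ValueError("empty command")
    name, args = parts[0], parts[1:]
    if name not in COMMAND_ARITY:
        raise ValueError(f"unknown command: {name}")
    if len(args) != COMMAND_ARITY[name]:
        raise ValueError(
            f"{name} expects {COMMAND_ARITY[name]} argument(s), got {len(args)}"
        )
    parsed = {"command": name, "args": args}
    if name == "extract":
        # The second argument is the JSON schema; decode it eagerly so a
        # malformed schema fails before any browser work starts.
        parsed["args"] = args[:1]
        parsed["schema"] = json.loads(args[1])
    return parsed


print(parse_command("navigate https://example.com"))
print(parse_command('extract "list the product titles" \'{"title":"string"}\''))
```

Single-quoting the schema, as in the `extract` line, keeps the inner double quotes intact through shell-style tokenization.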
Verification
- If a navigation/action fails, include the error message and prompt the user for next steps.
- Before ending the session, ensure close has been run so Chrome processes don't linger.
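One way to satisfy both verification points is to wrap the session so that close always runs, even when a step fails, while the error message is kept for the report. The sketch below uses a stand-in `FakeSession` class. It is hypothetical and does not reflect the Stagehand API or the skill's own code. Only the try/finally pattern is the point.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a browser session; hypothetical, for illustration only."""

    def __init__(self):
        self.closed = False

    def navigate(self, url):
        # Simulate a failing navigation.
        raise RuntimeError(f"navigation failed: {url}")

    def close(self):
        self.closed = True


@contextmanager
def browser_session(session):
    """Yield the session and always call close() on exit."""
    try:
        yield session
    finally:
        session.close()


def run_navigation(session, url):
    """Attempt a navigation and return a report that carries any error message."""
    try:
        with browser_session(session) as s:
            s.navigate(url)
        return {"ok": True, "error": None}
    except RuntimeError as exc:
        return {"ok": False, "error": str(exc)}


session = FakeSession()
report = run_navigation(session, "https://example.com")
print(report, session.closed)
```

The report's error field is what would be surfaced to the user along with a prompt for next steps.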
Notes
- This skill expects ANTHROPIC_API_KEY (for Stagehand) and, if used via droid exec, FACTORY_API_KEY to already be configured.
- The working directory should remain inside the cloned skill folder so relative paths resolve correctly.
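A preflight check for the environment variables named above might look like the following sketch. The `missing_keys` helper and its `via_droid_exec` flag are hypothetical. The skill does not describe how it detects a droid exec invocation, so the caller passes that in explicitly.

```python
import os


def missing_keys(env, via_droid_exec: bool) -> list:
    """Return the names of required variables that are unset or empty."""
    required = ["ANTHROPIC_API_KEY"]
    if via_droid_exec:
        # Only needed when the skill is run through droid exec.
        required.append("FACTORY_API_KEY")
    return [key for key in required if not env.get(key)]


# Check the real environment; an empty list means everything is configured.
print(missing_keys(os.environ, via_droid_exec=False))
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary.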