
puppeteer-fetch

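Fallback web fetcher that uses a headless browser with stealth mode. Use this tool only after web_fetch has failed with a 403 error, a captcha block, an access-denied response, empty content, or an "enable JavaScript" style error. Never use it preemptively; always try web_fetch first.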

SKILL.md
--- frontmatter
name: puppeteer-fetch
description: Fallback web fetcher using headless browser with stealth mode. Use ONLY after web_fetch fails with 403, captcha, access denied, empty content, or "enable JavaScript" errors. Never use preemptively — always try web_fetch first.

Puppeteer Fetch

Fallback tool for cases where web_fetch fails because of bot detection or JavaScript requirements.

Decision Flow

code
1. User requests URL content
2. Try web_fetch first (ALWAYS)
3. If web_fetch succeeds → done
4. If web_fetch fails with blocking indicators → use puppeteer-fetch:fetch_page
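
The flow above can be sketched as a small wrapper. This is a minimal illustration, not the skill's actual implementation: `FetchResult`, `fetch_with_fallback`, and the two injected callables are hypothetical names standing in for web_fetch and puppeteer-fetch:fetch_page. Returning non-blocking failures unchanged is also an assumption, since the flow only covers blocking failures.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FetchResult:
    """Hypothetical result shape shared by both fetchers."""
    ok: bool               # True when usable content came back
    content: str = ""      # page content
    blocked: bool = False  # True when the failure looks like bot blocking


def fetch_with_fallback(
    url: str,
    web_fetch: Callable[[str], FetchResult],
    fetch_page: Callable[[str], FetchResult],
) -> FetchResult:
    """Try web_fetch first; use the headless browser only on a blocking failure."""
    first = web_fetch(url)      # step 2: always try web_fetch first
    if first.ok:
        return first            # step 3: success, done
    if first.blocked:
        return fetch_page(url)  # step 4: blocking indicator, fall back
    return first                # assumption: other failures are not retried
```

Passing the fetchers in as arguments keeps the ordering rule in one place: the browser-based fetcher cannot run unless web_fetch has already been called and reported a blocking failure.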

Trigger Conditions (ONLY after web_fetch failure)

Use puppeteer-fetch when web_fetch returns:

  • 403 Forbidden
  • Captcha or challenge page content
  • "Please enable JavaScript" or similar
  • Cloudflare/bot protection page
  • Empty or truncated content from JS-heavy sites
  • Access denied or blocked messages
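
One way to check a web_fetch response against the conditions above is a simple heuristic. The sketch below is illustrative only: the marker phrases, the 200-character threshold, and the `blocking_reason` function are assumptions, not part of the skill.

```python
from typing import Optional

# Hypothetical marker phrases; adjust them to the block pages you actually see.
BLOCK_MARKERS = (
    "captcha",
    "enable javascript",
    "access denied",
    "checking your browser",
)
MIN_CONTENT_CHARS = 200  # assumed cutoff for "empty or truncated" content


def blocking_reason(status: int, body: str) -> Optional[str]:
    """Return a short reason if the response looks blocked, else None."""
    if status == 403:
        return "403 Forbidden"
    text = body.strip().lower()
    for marker in BLOCK_MARKERS:
        if marker in text:
            return f"marker: {marker}"
    if len(text) < MIN_CONTENT_CHARS:
        return "empty or truncated content"
    return None
```

Substring matching like this can misfire on a legitimate article that merely mentions captchas, so a real check would probably weigh the markers against overall page length.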

Never Use When

  • web_fetch hasn't been tried yet
  • web_fetch succeeded (even partially)
  • User just wants a simple search (use web_search)
  • Preemptively "just in case" — always try web_fetch first

Tool Parameters

code
puppeteer-fetch:fetch_page
├── url (required): Full URL to fetch
├── timeout (optional): Timeout in milliseconds, default 30000
└── waitFor (optional): CSS selector to wait for before extracting
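
As a rough illustration of the parameters above, the helper below assembles an argument dictionary for puppeteer-fetch:fetch_page. The function name `build_fetch_page_args`, the http/https check for a "full URL", and the positive-timeout check are assumptions; only the keys `url`, `timeout`, and `waitFor` and the 30000 ms default come from the parameter list.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

DEFAULT_TIMEOUT_MS = 30000  # default stated in the parameter list


def build_fetch_page_args(
    url: str,
    timeout: Optional[int] = None,
    wait_for: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the arguments for a fetch_page call, omitting unset optionals."""
    parsed = urlparse(url)
    # Assumption: "full URL" means an http(s) URL with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected a full http(s) URL, got {url!r}")
    args: Dict[str, Any] = {"url": url}
    if timeout is not None:
        if timeout <= 0:
            raise ValueError("timeout must be a positive number of milliseconds")
        args["timeout"] = timeout   # otherwise the tool default applies
    if wait_for:
        args["waitFor"] = wait_for  # CSS selector to wait for before extracting
    return args
```

For example, `build_fetch_page_args("https://example.com/post", timeout=60000, wait_for="article")` yields a dictionary with all three keys, while a call with only the URL leaves `timeout` unset so the default of `DEFAULT_TIMEOUT_MS` is applied by the tool.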

Output

Returns the page content as markdown (not HTML).