Fetch Web Intel
Purpose
Plan and execute web data collection tasks via the hosted Fetch MCP server registered in servers/fetch. Ideal for capturing article summaries, metadata, or targeted DOM fragments.
Setup Checklist
- Verify that FETCH_MCP_ENDPOINT and FETCH_MCP_API_KEY are set and that the endpoint is listed in mcp.json.
- Confirm the remote server’s allow list includes the target domains; if it does not, update it in the provider portal.
- Decide on user-agent overrides when sites block generic bots.
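
The first checklist item can be automated with a small preflight check. The sketch below is only illustrative: the helper names check_fetch_setup and _contains_string are hypothetical, and because the schema of mcp.json is not described here, the check simply looks for the endpoint URL among the string values in the file. If your mcp.json references the endpoint through a variable placeholder rather than the literal URL, this check would need adjusting.

```python
import json


def _contains_string(node, needle):
    """Recursively search parsed JSON for a string value containing needle."""
    if isinstance(node, str):
        return needle in node
    if isinstance(node, dict):
        return any(_contains_string(value, needle) for value in node.values())
    if isinstance(node, list):
        return any(_contains_string(value, needle) for value in node)
    return False


def check_fetch_setup(env, config_text):
    """Return a list of problems; an empty list means the checks passed.

    env is a mapping such as os.environ; config_text is the raw content
    of mcp.json.
    """
    problems = []
    for name in ("FETCH_MCP_ENDPOINT", "FETCH_MCP_API_KEY"):
        if not env.get(name):
            problems.append(f"{name} is not set")

    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        problems.append(f"mcp.json is not valid JSON: {exc}")
        return problems

    # Assumption: the mcp.json schema is unknown, so we only verify that
    # the endpoint URL appears somewhere among its string values.
    endpoint = env.get("FETCH_MCP_ENDPOINT", "")
    if endpoint and not _contains_string(config, endpoint):
        problems.append("FETCH_MCP_ENDPOINT is not listed in mcp.json")
    return problems
```

A typical call would pass os.environ and the text of mcp.json, then print each returned problem before any fetch is attempted. The allow-list and user-agent items still need a manual check, since they live on the provider side.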
Workflow
- Scope – list URLs plus extraction goals (full-page HTML, CSS selector, sitemap traversal). Batch by domain to leverage connection reuse.
- Execute – call the MCP fetch, scrape, or search tools. Supply timeout hints for heavier pages.
- Normalize – clean the payload (strip scripts, collapse whitespace) before storing or handing to other skills.
- Log – capture HTTP status and rate-limit headers so we can back off or retry intelligently.
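
The Scope, Normalize, and Log steps above can be sketched with the Python standard library alone. Everything here is an assumption made for illustration: the function names are hypothetical, the payload is assumed to be an HTML string, and the status and headers are assumed to be available as an integer and a dict. The Execute step is left out because the argument schemas of the fetch, scrape, and search tools are not specified here.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlsplit


def batch_by_domain(urls):
    """Scope: group URLs by hostname so each batch can reuse a connection."""
    batches = defaultdict(list)
    for url in urls:
        batches[urlsplit(url).hostname or ""].append(url)
    return dict(batches)


class _TextExtractor(HTMLParser):
    """Collects text nodes while skipping <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def normalize_html(payload):
    """Normalize: drop script/style content and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(payload)
    parser.close()
    # Text nodes are joined with a space so adjacent blocks do not fuse.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def retry_delay(status, headers, attempt, base=1.0, cap=60.0):
    """Log: return seconds to wait before a retry, or None if none is warranted.

    Retries only on 429 and 5xx. A numeric Retry-After header wins;
    otherwise fall back to capped exponential backoff.
    """
    if status != 429 and not 500 <= status < 600:
        return None
    retry_after = {k.lower(): v for k, v in headers.items()}.get("retry-after")
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after), cap)
    return min(base * 2 ** attempt, cap)
```

The backoff defaults (1 second base, 60 second cap) are placeholders, not values taken from the provider. A Retry-After header given as an HTTP date rather than a number of seconds falls through to the exponential path in this sketch.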
Notes
- Respect robots: if the server returns blockedByRobots, inform the user instead of retrying.
- Cap parallel requests to 3 per domain to avoid provider throttling.
- Store extracted outputs under docs/intel/ through the filesystem skill if they need persistence beyond the current run.
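
The first two notes can be combined in one small asyncio wrapper, sketched below. It rests on several assumptions: fetch_one is a hypothetical coroutine standing in for whatever actually calls the Fetch MCP tool, and a blocked page is assumed to come back as a dict whose "error" field equals "blockedByRobots". The real response shape may differ, so adapt the check to what the server actually returns.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit

MAX_PER_DOMAIN = 3  # cap from the notes above


async def fetch_all(urls, fetch_one):
    """Fetch urls with at most MAX_PER_DOMAIN requests in flight per domain.

    Returns (results, blocked): results maps URL to response, and blocked
    lists URLs refused by robots rules, which are reported, never retried.
    """
    # One semaphore per hostname, created lazily inside the running loop.
    limits = defaultdict(lambda: asyncio.Semaphore(MAX_PER_DOMAIN))

    async def guarded(url):
        async with limits[urlsplit(url).hostname or ""]:
            return url, await fetch_one(url)

    results, blocked = {}, []
    for url, response in await asyncio.gather(*(guarded(u) for u in urls)):
        # Assumption: a robots refusal is signalled via an "error" field.
        if isinstance(response, dict) and response.get("error") == "blockedByRobots":
            blocked.append(url)
        else:
            results[url] = response
    return results, blocked
```

Run it with asyncio.run(fetch_all(urls, fetch_one)) and surface the blocked list to the user. Using one semaphore per hostname keeps a slow domain from holding up requests to the others.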