Fetch Web Intel

Name: fetch-web-intel
Rating: 76
Author: amit-sw

Purpose

Plan and execute web data collection tasks via the hosted Fetch MCP server registered in servers/fetch. Ideal for capturing article summaries, metadata, or targeted DOM fragments.

Setup Checklist

•Verify that FETCH_MCP_ENDPOINT and FETCH_MCP_API_KEY are both set and that the endpoint is listed in mcp.json.
•Confirm the remote server’s allow list includes the target domains; if it does not, update the list in the provider portal.
•Decide whether to override the user agent for sites that block generic bots.
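
The first checklist item can be automated with a small preflight check. The sketch below is illustrative only: it assumes the two settings are environment variables and, because the layout of mcp.json is not described here, it simply looks for the endpoint as a string value anywhere in the parsed file. The function names are hypothetical.

```python
from typing import Any, Mapping

REQUIRED_VARS = ("FETCH_MCP_ENDPOINT", "FETCH_MCP_API_KEY")


def _contains_value(node: Any, target: str) -> bool:
    """Return True if `target` appears as a string value anywhere in parsed JSON."""
    if isinstance(node, str):
        return node == target
    if isinstance(node, Mapping):
        return any(_contains_value(value, target) for value in node.values())
    if isinstance(node, list):
        return any(_contains_value(item, target) for item in node)
    return False


def check_fetch_setup(env: Mapping[str, str], mcp_config: Any) -> list[str]:
    """Return human-readable setup problems; an empty list means the checks passed.

    Assumption: the settings live in environment variables, and mcp.json holds
    the endpoint URL verbatim somewhere in its structure.
    """
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    endpoint = env.get("FETCH_MCP_ENDPOINT")
    if endpoint and not _contains_value(mcp_config, endpoint):
        problems.append("FETCH_MCP_ENDPOINT is not listed in mcp.json")
    return problems
```

In practice this would be called as check_fetch_setup(os.environ, json.load(open("mcp.json"))) before any fetch is planned. The allow-list and user-agent items still need a manual check, since they depend on the provider portal.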

Workflow

•Scope – list URLs plus extraction goals (full page HTML, CSS selector, sitemap traversal). Batch by domain to leverage connection reuse.
•Execute – call the MCP fetch, scrape, or search tools. Supply timeout hints for heavier pages.
•Normalize – clean the payload (strip scripts, collapse whitespace) before storing or handing to other skills.
•Log – capture HTTP status and rate-limit headers so we can back off or retry intelligently.
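
The Scope, Normalize, and Log steps can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the helper names are hypothetical, the normalizer goes one step beyond stripping scripts and reduces the page to visible text, and the retry helper assumes the Fetch server passes through the upstream status code and a Retry-After header.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlsplit


def batch_by_domain(urls):
    """Group URLs by lowercased host so each batch can reuse one connection."""
    batches = defaultdict(list)
    for url in urls:
        batches[urlsplit(url).netloc.lower()].append(url)
    return dict(batches)


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def normalize_html(html):
    """Drop script/style content and all tags, then collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def retry_delay(status, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before a retry, or None if no retry is warranted.

    Assumption: only 429 and 503 are treated as retryable. A numeric
    Retry-After header wins; otherwise fall back to exponential backoff.
    """
    if status not in (429, 503):
        return None
    lowered = {key.lower(): value for key, value in headers.items()}
    retry_after = lowered.get("retry-after", "").strip()
    if retry_after.isdigit():
        return min(float(retry_after), cap)
    return min(base * 2 ** attempt, cap)
```

If a downstream skill needs markup rather than plain text (for example, a CSS-selector extraction), keep the tags and only drop the script and style elements.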

Notes

•Respect robots directives: if the server returns blockedByRobots, inform the user instead of retrying.
•Cap parallel requests to 3 per domain to avoid provider throttling.
•Store extracted outputs under docs/intel/ through the filesystem skill if they need to persist beyond the current run.
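
The first two notes can be combined in one small async wrapper. The sketch below is a hedged illustration: fetch_one stands in for whatever coroutine actually calls the MCP tool, and the shape of its result (a dict with a blockedByRobots flag) is an assumption, since the response format is not specified here.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit

MAX_PER_DOMAIN = 3  # cap from the notes above


async def fetch_all(urls, fetch_one, limit=MAX_PER_DOMAIN):
    """Fetch every URL with at most `limit` requests in flight per domain.

    `fetch_one` is a hypothetical coroutine taking a URL and returning a dict.
    A truthy "blockedByRobots" key (assumed shape) is reported, never retried.
    """
    # One semaphore per host, created lazily on first use.
    semaphores = defaultdict(lambda: asyncio.Semaphore(limit))

    async def guarded(url):
        async with semaphores[urlsplit(url).netloc.lower()]:
            result = await fetch_one(url)
        if result.get("blockedByRobots"):
            # Surface the block to the caller so the user can be informed.
            return {"url": url, "status": "blocked_by_robots"}
        return {"url": url, "status": "ok", "result": result}

    # gather preserves input order, so results line up with `urls`.
    return await asyncio.gather(*(guarded(url) for url in urls))
```

Keeping one semaphore per host means a slow domain does not hold up requests to other domains, while the per-domain cap still applies.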