web-fetch

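Fetch web page content as Markdown using trafilatura. For cases where a sub-agent or agent needs to read web page content but cannot use the web_fetch tool. Runs from the shell with a single Python one-liner.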

SKILL.md
--- frontmatter
name: web-fetch
description: Fetch web page content as Markdown using trafilatura. Use when sub-agents or agents need to read web page content but don't have access to the web_fetch tool. Works via shell with a Python one-liner.

Web Fetch (trafilatura)

Fetch web page content as Markdown text. A replacement for the web_fetch tool in contexts where that tool is unavailable (e.g., sub-agents).

Usage

bash
python -c "import trafilatura; print(trafilatura.extract(trafilatura.fetch_url('URL'), output_format='markdown') or '')"

Notes

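  • Prints an empty string if the download or the extraction fails (fetch_url and extract both return None on failure, and the trailing or '' covers that case)
  • Automatically removes boilerplate such as navigation, footers, and sidebars
  • Requires the trafilatura package (listed in requirements.txt)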
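
When the one-liner becomes awkward to quote or needs to be reused, the same two calls can be wrapped in a small helper. The following is a minimal sketch, not part of the original skill: the function name `fetch_markdown` and its `fetch` / `extract` parameters are hypothetical, added only so the helper can be exercised with stand-ins when trafilatura or network access is unavailable. By default it uses `trafilatura.fetch_url` and `trafilatura.extract`, exactly as the one-liner does.

```python
from typing import Callable, Optional


def fetch_markdown(
    url: str,
    fetch: Optional[Callable[[str], Optional[str]]] = None,
    extract: Optional[Callable[..., Optional[str]]] = None,
) -> str:
    """Return the page at `url` as Markdown, or '' if fetching or extraction fails.

    `fetch` and `extract` are hypothetical injection points (assumptions of
    this sketch); when omitted, the trafilatura functions are used.
    """
    if fetch is None or extract is None:
        # Imported lazily so the helper can run with stand-ins
        # even where trafilatura is not installed.
        import trafilatura

        fetch = fetch or trafilatura.fetch_url
        extract = extract or trafilatura.extract

    downloaded = fetch(url)  # fetch_url returns None when the download fails
    if downloaded is None:
        return ""

    # extract returns None when no main content can be found
    return extract(downloaded, output_format="markdown") or ""
```

Calling `print(fetch_markdown('URL'))` then behaves like the one-liner, with the two failure cases handled in separate, explicit steps.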