Web Fetch
Overview
Extract clean, readable content from a URL using the Jina Reader API. The script returns raw JSON with the title, content, and metadata, optimized for LLM consumption.
When to Use
- Reading or analyzing webpage content for the user
- Extracting article text from a URL
- Fetching documentation or reference pages
- Converting web pages to clean text for processing
Workflow
- Identify the URL from the user's request
- Validate the URL format
- Run the fetch script
- Present the extracted content to the user
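The validation step is not specified further here. A minimal sketch of one way to do it, assuming that "valid" means an absolute `http` or `https` URL with a host (the helper name `is_valid_url` is hypothetical and not part of the script):

```python
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Return True for absolute http(s) URLs that include a host part.

    ASSUMPTION: this rule is illustrative; the fetch script may validate differently.
    """
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError on some malformed inputs (e.g. bad IPv6 hosts).
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Checking before the script runs avoids spending a process launch on input that would only come back as exit code 1.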
Usage
```bash
# Basic fetch
uv run --script scripts/web_fetch.py --url "https://example.com"

# With custom timeout
uv run --script scripts/web_fetch.py \
  --url "https://example.com/article" \
  --timeout 60
```
Parameters
| Parameter | Default | Description |
|---|---|---|
| `--url` | (required) | URL to fetch and extract content from |
| `--timeout` | 30 | Request timeout in seconds |
Output Contract
| Scenario | stdout | stderr | exit code |
|---|---|---|---|
| Success | Raw JSON from Jina | (empty) | 0 |
| Invalid URL | (empty) | Error message | 1 |
| Timeout | (empty) | Timeout error | 1 |
| HTTP Error | (empty) | HTTP error details | 1 |
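A caller can rely on this contract when wrapping the script. The sketch below shows one possible wrapper. The names `FetchError`, `build_command`, `interpret_result`, and `fetch` are hypothetical, and the wrapper assumes it is run from the directory that contains `scripts/web_fetch.py`:

```python
import json
import subprocess


class FetchError(RuntimeError):
    """Raised when the fetch script exits with a non-zero code."""


def build_command(url: str, timeout: int = 30) -> list[str]:
    # Mirrors the documented CLI: --url is required, --timeout defaults to 30.
    return ["uv", "run", "--script", "scripts/web_fetch.py",
            "--url", url, "--timeout", str(timeout)]


def interpret_result(returncode: int, stdout: str, stderr: str) -> dict:
    # Per the table: exit code 0 puts raw JSON on stdout;
    # exit code 1 leaves stdout empty and puts the error on stderr.
    if returncode != 0:
        raise FetchError(stderr.strip() or f"fetch failed with exit code {returncode}")
    return json.loads(stdout)


def fetch(url: str, timeout: int = 30) -> dict:
    # Runs the script as a child process and applies the contract above.
    proc = subprocess.run(build_command(url, timeout), capture_output=True, text=True)
    return interpret_result(proc.returncode, proc.stdout, proc.stderr)
```

Keeping `interpret_result` separate from the process call means the contract handling can be exercised without network access.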
Success output contains:
- Page title and description
- Clean extracted content (markdown-formatted)
- URL and metadata
- Token usage information
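The exact JSON field names are not listed here. As a rough sketch, assuming the fields sit under a top-level `data` object with keys `title`, `description`, `url`, `content`, and `usage.tokens` (all of which should be checked against a real response), the parts above could be pulled out like this. The helper name `summarize` is hypothetical:

```python
from typing import Any


def summarize(payload: dict[str, Any]) -> dict[str, Any]:
    """Pick the commonly needed fields out of the parsed JSON.

    ASSUMPTION: key names and nesting are illustrative, not confirmed by this document.
    Missing keys come back as None (or "" for content) instead of raising.
    """
    # Fall back to the top level if there is no "data" wrapper.
    data = payload.get("data", payload)
    usage = data.get("usage") or {}
    return {
        "title": data.get("title"),
        "description": data.get("description"),
        "url": data.get("url"),
        "content": data.get("content", ""),
        "tokens": usage.get("tokens"),
    }
```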
Prerequisites
- Uses the Jina Reader API (no API key required)
- Requires `uv` for running PEP 723 scripts
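Whether `uv` is installed can be checked up front. A small sketch using only the standard library (the helper name `uv_available` is hypothetical):

```python
import shutil


def uv_available() -> bool:
    """Return True if a `uv` executable is found on PATH."""
    return shutil.which("uv") is not None
```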
Examples
Fetch a webpage
```bash
uv run --script scripts/web_fetch.py \
  --url "https://docs.python.org/3/whatsnew/3.12.html"
```
Fetch with longer timeout for slow pages
```bash
uv run --script scripts/web_fetch.py \
  --url "https://example.com/large-article" \
  --timeout 60
```