LoggedWebText

Fetch a web page, extract its readable plain text, and log the result to the datastore.

Usage

python LoggedWebText.py --url "<URL>" [--max-chars 50000] [--timeout-ms 15000]
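The flags in the usage line could be parsed roughly as follows. This is a minimal sketch, not the script's actual parser: the function name build_parser is hypothetical, and it assumes the bracketed values in the usage line (50000 and 15000) are the defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the usage line above."""
    parser = argparse.ArgumentParser(prog="LoggedWebText.py")
    # --url is the only flag not shown in brackets, so treat it as required.
    parser.add_argument("--url", required=True, help="page to fetch")
    # Assumed defaults, taken from the values shown in the usage line.
    parser.add_argument("--max-chars", type=int, default=50000,
                        help="maximum number of characters of text to keep")
    parser.add_argument("--timeout-ms", type=int, default=15000,
                        help="fetch timeout in milliseconds")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.url, args.max_chars, args.timeout_ms)
```

argparse exposes the hyphenated flags as args.max_chars and args.timeout_ms.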

Returns JSON with fields:

Extracted text is saved to: {OPENCLAW_OUTPUT_ROOT}/datastore/YYYY/MM/DD/{PageName}.txt

If OPENCLAW_OUTPUT_ROOT is not set, the output root defaults to ~/.openclaw/data.
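The path layout described above can be sketched as a small helper. This is an illustration under stated assumptions: the function name datastore_path is hypothetical, the page name is passed in as-is because this README does not say how PageName is derived, and an empty OPENCLAW_OUTPUT_ROOT is treated the same as an unset one.

```python
import os
from datetime import date
from pathlib import Path

DEFAULT_ROOT = "~/.openclaw/data"  # documented fallback when the variable is unset


def datastore_path(page_name: str, day: date, env=None) -> Path:
    """Build {root}/datastore/YYYY/MM/DD/{page_name}.txt (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: an empty value falls back to the default, like an unset one.
    root = Path(env.get("OPENCLAW_OUTPUT_ROOT") or DEFAULT_ROOT).expanduser()
    # Zero-padded year, month, and day directories.
    return root / "datastore" / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}" / f"{page_name}.txt"


if __name__ == "__main__":
    print(datastore_path("Example", date.today()))
```

Passing env explicitly keeps the helper easy to exercise without touching the real environment.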

Fetch a page: python LoggedWebText.py --url "https://www.example.com"

Limit to 10,000 characters: python LoggedWebText.py --url "https://www.example.com" --max-chars 10000

Custom 30-second timeout: python LoggedWebText.py --url "https://www.example.com" --timeout-ms 30000