Web Scraper (Advanced)
Converts a webpage to Markdown for easier reading by the LLM.
Requirements
- Tool: r.jina.ai (Free Reader API) - No installation needed, just use curl.
Cross-Platform Method (Python)
Works on Windows, Linux, and Mac.
- Run Script:
bash
python workspace/skills/scraper/scripts/scrape.py "https://example.com"
Commands (Bash/Linux/Mac)
Read Page (Markdown)
Fetches the URL and converts it to clean Markdown.
bash
curl -s "https://r.jina.ai/https://example.com"
Read Page (Text Only)
Fetches the URL and returns plain text.
bash
curl -s -H "Accept: text/plain" "https://r.jina.ai/https://example.com"
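The two curl calls above can be mirrored in Python with only the standard library. This is a minimal sketch, not the contents of scrape.py (which is not shown here): the names `build_request` and `read_page`, the `plain_text` flag, and the 30-second timeout are illustrative assumptions. The only facts taken from the commands above are the `https://r.jina.ai/` prefix and the `Accept: text/plain` header.

```python
import urllib.request

# Prefix taken from the curl commands above; the target URL is appended as-is.
READER_PREFIX = "https://r.jina.ai/"


def build_request(target_url: str, plain_text: bool = False) -> urllib.request.Request:
    """Build the reader request (hypothetical helper, not part of scrape.py).

    plain_text=False mirrors `curl -s <reader-url>` (Markdown output).
    plain_text=True adds the `Accept: text/plain` header (text-only output).
    """
    headers = {"Accept": "text/plain"} if plain_text else {}
    return urllib.request.Request(READER_PREFIX + target_url, headers=headers)


def read_page(target_url: str, plain_text: bool = False, timeout: float = 30.0) -> str:
    """Fetch the page through the reader and return the response body as text."""
    request = build_request(target_url, plain_text=plain_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Replace undecodable bytes rather than failing on an odd page.
        return response.read().decode("utf-8", errors="replace")
```

Calling `read_page("https://example.com")` corresponds to the Markdown command, and `read_page("https://example.com", plain_text=True)` to the text-only one. Keeping request construction separate from the network call makes the URL and header logic easy to check without hitting the API.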
Usage
Ghost uses this to "read" documentation, news articles, or blog posts that are otherwise too cluttered with HTML/JS for simple analysis.