scrape

Download raw Chinese chapters from supported novel sites using scraper.py.

SKILL.md

--- frontmatter

name: scrape
description: Download raw Chinese chapters from supported novel sites using scraper.py.
argument-hint: "[url]"
disable-model-invocation: true

Scrape Novel Chapters

Download raw Chinese chapters using the project's scraper script.

Arguments

•$ARGUMENTS should be the URL of a novel's chapter list page
•Example: /scrape https://m.shuhaige.net/novel/12345/

Supported Sites

The scraper (scripts/scraper.py) supports:

•shuhaige.net (m.shuhaige.net)
•novel543.com
•wxdzs.net
•jpxs123.com
•wfxs.tw
•uukanshu.cc
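
A check against the list above could look roughly like the following. This is a minimal sketch, not the logic inside scripts/scraper.py: the names SUPPORTED_HOSTS and is_supported_url are hypothetical, and it assumes that any subdomain of a listed site (such as m.shuhaige.net) counts as supported.

```python
from urllib.parse import urlparse

# Hosts copied from the supported-sites list above.
# Assumption: the real scraper.py may match sites differently.
SUPPORTED_HOSTS = (
    "shuhaige.net",
    "novel543.com",
    "wxdzs.net",
    "jpxs123.com",
    "wfxs.tw",
    "uukanshu.cc",
)


def is_supported_url(url: str) -> bool:
    """Return True if the URL is http(s) and its host is a listed site or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Match the bare domain or a dotted subdomain, so "evilshuhaige.net" is rejected.
    return any(host == site or host.endswith("." + site) for site in SUPPORTED_HOSTS)
```

Matching on the parsed hostname rather than on a substring of the whole URL avoids accepting look-alike domains or URLs that merely mention a supported site in their path.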

Steps

•Validate that the URL matches a supported site
•Run the scraper:
```bash
python scripts/scraper.py "$ARGUMENTS"
```
•Report results: the number of chapters downloaded, the output location, and any errors
•Suggest next steps: remind the user to move the files into raw/<novel-name>/ if the scraper wrote them elsewhere
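
The reporting step could be sketched as below. The helper summarize_output is hypothetical and not part of scraper.py. It assumes the chapter .txt files sit directly in the output folder and that errors.log holds one error per line; neither layout detail is confirmed by this document.

```python
from pathlib import Path


def summarize_output(output_dir: str) -> dict:
    """Count saved chapter files and logged errors in a scraper output folder."""
    folder = Path(output_dir)
    # Assumption: every .txt file directly in the folder is one chapter.
    chapters = sorted(folder.glob("*.txt"))
    log_path = folder / "errors.log"
    errors = []
    if log_path.exists():
        # Assumption: one error per non-empty line.
        errors = [
            line
            for line in log_path.read_text(encoding="utf-8").splitlines()
            if line.strip()
        ]
    return {
        "chapters": len(chapters),
        "output_dir": str(folder.resolve()),
        "errors": errors,
    }
```

The returned dictionary carries the three items the report needs: the chapter count, the output location, and the list of logged errors.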

Notes

•The scraper adds a 1.5-second delay between requests to avoid being blocked by the site
•Chapters are saved as UTF-8 .txt files
•Errors are logged to errors.log in the output folder
•If a download fails partway through, the scraper can be re-run; it overwrites existing files
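
The behaviour described in these notes can be illustrated with a small loop. This is a sketch under stated assumptions, not the code in scripts/scraper.py: download_chapters, its parameters, the fetch callback, and the tab-separated log format are all hypothetical.

```python
import time
from pathlib import Path
from typing import Callable, Iterable, Tuple


def download_chapters(
    chapters: Iterable[Tuple[str, str]],  # (file name stem, chapter URL) pairs
    fetch: Callable[[str], str],          # hypothetical: returns chapter text for a URL
    output_dir: str,
    delay: float = 1.5,
) -> int:
    """Save each chapter as a UTF-8 .txt file and return how many were saved."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = 0
    for index, (name, url) in enumerate(chapters):
        if index > 0:
            time.sleep(delay)  # pause between requests
        try:
            text = fetch(url)
        except Exception as exc:
            # Record the failure and move on to the next chapter.
            with open(out / "errors.log", "a", encoding="utf-8") as log:
                log.write(f"{url}\t{exc}\n")
            continue
        # write_text replaces any existing file, so a re-run overwrites earlier output.
        (out / f"{name}.txt").write_text(text, encoding="utf-8")
        saved += 1
    return saved
```

Because each file is written unconditionally, running the loop again after a partial failure regenerates every chapter rather than resuming where it stopped, which matches the overwrite behaviour noted above.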