Scrape Novel Chapters
Download raw Chinese chapters using the project's scraper script.
Arguments
- $ARGUMENTS should be the URL of a novel's chapter list page
- Example: /scrape https://m.shuhaige.net/novel/12345/
Supported Sites
The scraper (scripts/scraper.py) supports:
- shuhaige.net (m.shuhaige.net)
- novel543.com
- wxdzs.net
- jpxs123.com
- wfxs.tw
- uukanshu.cc
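The supported-site check can be sketched as a hostname comparison against the domains listed above. This is only an illustration, not the logic inside scripts/scraper.py: the names SUPPORTED_DOMAINS and is_supported_url are hypothetical, and accepting any subdomain (as with m.shuhaige.net) is an assumption.

```python
from urllib.parse import urlparse

# Domains copied from the list above. Accepting subdomains such as
# m.shuhaige.net is an assumption, not confirmed scraper behavior.
SUPPORTED_DOMAINS = (
    "shuhaige.net",
    "novel543.com",
    "wxdzs.net",
    "jpxs123.com",
    "wfxs.tw",
    "uukanshu.cc",
)


def is_supported_url(url: str) -> bool:
    """Return True if the URL's host is a listed domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Match the exact domain, or a dot-separated subdomain of it, so that
    # look-alike hosts such as "evilshuhaige.net" are rejected.
    return any(host == d or host.endswith("." + d) for d in SUPPORTED_DOMAINS)
```

Comparing the parsed hostname rather than searching the raw URL string avoids false matches when a supported domain merely appears in the path or query.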
Steps
- Validate the URL matches a supported site
- Run the scraper:
```bash
python scripts/scraper.py "$ARGUMENTS"
```
- Report results: number of chapters downloaded, output location, and any errors
- Suggest next steps: remind the user to organize files into raw/<novel-name>/ if the scraper writes its output elsewhere
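The last step, organizing files into raw/<novel-name>/, could look roughly like the sketch below. The helper name organize_chapters and its parameters are hypothetical, and it assumes the scraper writes chapter .txt files directly into a single output folder, which this document does not confirm.

```python
import shutil
from pathlib import Path


def organize_chapters(output_dir: str, novel_name: str, raw_root: str = "raw") -> int:
    """Move every .txt file from output_dir into raw_root/novel_name.

    Returns the number of files moved. Non-.txt files (for example a log
    file) are left where they are.
    """
    source = Path(output_dir)
    target = Path(raw_root) / novel_name
    target.mkdir(parents=True, exist_ok=True)

    # Nothing to do if the scraper already wrote into the target folder.
    if source.resolve() == target.resolve():
        return 0

    moved = 0
    for chapter in sorted(source.glob("*.txt")):
        shutil.move(str(chapter), str(target / chapter.name))
        moved += 1
    return moved
```

For example, organize_chapters("output", "my-novel") would move the chapter files from a hypothetical output folder into raw/my-novel/.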
Notes
- The scraper adds a 1.5-second delay between requests to avoid being blocked by the site
- Chapters are saved as UTF-8 .txt files
- Errors are logged to errors.log in the output folder
- If a download fails partway through, the scraper can be re-run; it overwrites existing files
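Given the notes above, the "report results" step could be sketched as a small summary over the output folder. The function name summarize_output is hypothetical. The sketch assumes that chapter files sit directly in the output folder and that errors.log is UTF-8 with one entry per line; neither detail is specified here.

```python
from pathlib import Path


def summarize_output(output_dir: str) -> dict:
    """Count chapter .txt files and collect non-blank lines from errors.log."""
    folder = Path(output_dir)
    chapters = sorted(p.name for p in folder.glob("*.txt"))

    log_path = folder / "errors.log"
    errors = []
    if log_path.exists():
        # The log's line format is an assumption; blank lines are skipped.
        text = log_path.read_text(encoding="utf-8", errors="replace")
        errors = [line for line in text.splitlines() if line.strip()]

    return {
        "chapter_count": len(chapters),
        "output_dir": str(folder),
        "errors": errors,
    }
```

Because a re-run overwrites existing files, calling this again after re-running the scraper reflects the latest state of the folder rather than an accumulated total.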