Scrape Website for Static Hosting
When the user wants to scrape/parse/clone a website, follow these steps:
- Download with wget:
```bash
wget --recursive --page-requisites --adjust-extension --convert-links --span-hosts --domains="$DOMAIN" "$ARGUMENTS"
```
- Fix relative paths in HTML files for assets (CSS, JS, images)
- Handle AJAX and infinite scroll: disable them or implement a static alternative
- Fix broken functionality:
  - Remove server-dependent features
  - Fix image lazy loading
  - Show pagination instead of infinite scroll
- Create fix.js and fix.css if needed to patch remaining issues
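
As a rough aid for the path and lazy-loading steps above, the following sketch lists places in the downloaded HTML that may still need patching. The function names are hypothetical, the directory argument is assumed to be wherever wget wrote its output, and the `data-src` attribute is only one common lazy-loading convention, so a given site may use something else. It relies on `grep -r --include`, which GNU and BSD grep provide but POSIX does not guarantee.

```bash
# Hypothetical audit helpers; pass the directory wget downloaded into.

# List src/href attributes that still start with a single "/" (root-relative),
# which tend to break when the copy is served from a subdirectory.
# Protocol-relative URLs ("//host/...") are deliberately not matched.
find_root_relative_refs() {
  grep -rnoE --include='*.html' '(src|href)="/[^/"][^"]*"' "$1"
}

# List <img> tags that carry a data-src attribute (assumed lazy-loading
# convention), as candidates for a fix.js that promotes data-src to src.
find_lazy_images() {
  grep -rnoE --include='*.html' '<img[^>]*data-src="[^"]*"' "$1"
}
```

Each function prints `file:line:match` entries, so the output can be reviewed by hand before deciding whether to edit the HTML directly or patch it through fix.js.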
Arguments: $ARGUMENTS (the URL to scrape). $DOMAIN is the host name of that URL.
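
The wget command uses `$DOMAIN`, but only the URL is passed in, so the domain has to be derived from it. One possible way to do that is sketched below with plain shell parameter expansion. The helper name `url_to_domain` is hypothetical, and the sketch assumes an ordinary host name (it would mangle a bracketed IPv6 literal).

```bash
# Hypothetical helper: reduce a URL to its bare host name for --domains.
url_to_domain() {
  host="${1#*://}"      # drop the scheme, e.g. https://
  host="${host%%/*}"    # drop the path
  host="${host%%\?*}"   # drop a query string attached directly to the host
  host="${host%%#*}"    # drop a fragment
  host="${host##*@}"    # drop user:password@
  host="${host%%:*}"    # drop the port
  printf '%s\n' "$host"
}

# Usage (assumes $ARGUMENTS holds the URL to scrape):
# DOMAIN=$(url_to_domain "$ARGUMENTS")
# wget --recursive --page-requisites --adjust-extension --convert-links \
#      --span-hosts --domains="$DOMAIN" "$ARGUMENTS"
```

Because `--span-hosts` is combined with `--domains`, assets hosted on other domains (a CDN, for example) are skipped unless those domains are appended to the comma-separated `--domains` list.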