goscrape

Web scraper that can create an offline readable version of a website

Results 20 goscrape issues

URL:

Log:
```
2021-06-29T06:10:25.020Z INFO External URL {"URL": "data:image/gif;base64,R0lGODlhAQABAAD/ACwAAAAAAQABAAACADs%3D"}
2021-06-29T06:10:25.025Z INFO Downloading {"URL": "data:image/gif;base64,R0lGODlhAQABAAD/ACwAAAAAAQABAAACADs%3D"}
2021-06-29T06:10:25.031Z ERROR Scraping failed {"error": "Get \"data:image/gif;base64,R0lGODlhAQABAAD/ACwAAAAAAQABAAACADs%3D\": unsupported protocol scheme \"data\""}
```
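The log shows a `data:` URI being handed to the HTTP client, which rejects it with "unsupported protocol scheme". One possible guard, shown here as a minimal sketch and not as goscrape's actual code, is to check the scheme before queuing a download. The helper name `isDownloadable` is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// isDownloadable reports whether raw is an absolute URL that an HTTP client
// can fetch. Hypothetical helper for illustration; not part of goscrape.
// Relative references have an empty scheme and are rejected here, so they
// would need to be resolved against the page URL first.
func isDownloadable(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return u.Scheme == "http" || u.Scheme == "https"
}

func main() {
	for _, raw := range []string{
		"data:image/gif;base64,R0lGODlhAQABAAD/ACwAAAAAAQABAAACADs%3D",
		"https://www.example.com/logo.png",
	} {
		fmt.Println(isDownloadable(raw))
	}
}
```

Inline `data:` URIs already carry their content in the page itself, so skipping them should not lose anything from the offline copy.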

Example site: origami.guide. This tool is unable to download all images from this site and similar sites.

bug

`example.org/cdn-cgi/styles/fonts/opensans-600.svg#open_sanssemibold`

For example:
```
https://www.example.com/category/blog-post/
https://www.example.com/category/blog-post
```
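If the goal is to treat the two URLs above as the same page, one option is to normalize each URL before comparing or queuing it. The following is a rough sketch under that assumption. `normalizeURL` is a hypothetical helper, not a goscrape function, and some servers do serve different content with and without the trailing slash.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeURL removes a single trailing slash from the path so that
// ".../blog-post/" and ".../blog-post" produce the same key.
// Hypothetical helper for illustration only.
func normalizeURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimSuffix(u.Path, "/")
	u.RawPath = "" // let String() re-encode from the trimmed Path
	return u.String(), nil
}

func main() {
	a, _ := normalizeURL("https://www.example.com/category/blog-post/")
	b, _ := normalizeURL("https://www.example.com/category/blog-post")
	fmt.Println(a == b)
}
```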

enhancement

This will fix some problems, such as images referenced in CSS.
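The change itself is not shown here. As an illustration of what handling images referenced in CSS can involve, the sketch below pulls `url(...)` references out of a stylesheet with a regular expression. The names `cssURLPattern` and `cssURLs` are hypothetical, and a real implementation might use a proper CSS tokenizer instead.

```go
package main

import (
	"fmt"
	"regexp"
)

// cssURLPattern matches url(...) references with optional single or double
// quotes. It is a simplification: it does not handle escaped characters or
// URLs that contain whitespace or parentheses.
var cssURLPattern = regexp.MustCompile(`url\(\s*['"]?([^'")\s]+)['"]?\s*\)`)

// cssURLs returns every url(...) target found in css, in order of appearance.
// Hypothetical helper for illustration only.
func cssURLs(css string) []string {
	var refs []string
	for _, m := range cssURLPattern.FindAllStringSubmatch(css, -1) {
		refs = append(refs, m[1])
	}
	return refs
}

func main() {
	css := `body{background:url("img/bg.png")} .a{background:url(img/a.gif)}`
	fmt.Println(cssURLs(css))
}
```

The extracted references would still need to be resolved against the stylesheet's own URL before downloading.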

* update goreleaser action to latest (v5)
* add docker build to goreleaser config
* add docker login step to release workflow to authenticate to GHCR
* add Dockerfile

Started a simple scrape; after about 55 minutes it crashed, OOM-killed by the Linux kernel while using 3.5 GB on a 4 GB machine. My guess is that this is because the entire "queue" of what to download,...

bug