
WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 429 status code.

Open Voyager3D opened this issue 2 years ago • 2 comments

I'm no coder and I've not scraped websites before, but I'm assuming this error code means the website is denying me for scraping it too much?

I was able to output a file from this website after it scanned 150 pages. That worked perfectly, but somewhere after 150 pages the site no longer seems to like it, and I get this error: WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 429 status code.

Not sure if I'm on the ball with that one or not, but any advice would be appreciated!

Cheers!

Voyager3D, Jan 07 '24 14:01

Hi, I'm having the same issue with several websites. Is it possible to add a sleep option between two calls? I don't see any other possibilities. Thanks a lot!

Cougart, Jan 16 '24 18:01

429 is the "Too Many Requests" status code, so you have probably been throttled by the server.

Meaning: to prevent clients from making too many requests, the server blocks requests coming from a given IP, either temporarily or permanently, once that IP exceeds a certain number of requests. I'm not saying this is 100% your case, but it's the most probable scenario here.

SimonGodefroid, Feb 12 '24 06:02
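
For anyone landing here with the same 429 warning, the "sleep between calls" idea asked about above could look roughly like the sketch below. This is not gpt-crawler's or Crawlee's own code: `parseRetryAfterMs`, `backoffDelayMs`, `requestWithBackoff`, and the `Attempt` type are hypothetical names made up for illustration, and the 1 second base delay and 60 second cap are arbitrary assumptions. The idea is to wait for as long as the server's `Retry-After` header asks, and to fall back to an exponentially growing pause when the header is missing.

```typescript
// Hypothetical helpers for illustration only; not part of gpt-crawler or Crawlee.

// Minimal shape of one request attempt: the status code and the raw
// Retry-After header value (null if the server did not send one).
type Attempt = { status: number; retryAfter: string | null };

// Convert a Retry-After value (either seconds or an HTTP date) to milliseconds.
// Returns null when the header is missing or cannot be parsed.
function parseRetryAfterMs(header: string | null, nowMs: number = Date.now()): number | null {
  if (header === null) return null;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
// The defaults (1 s base, 60 s cap) are assumptions, not recommended values.
function backoffDelayMs(attempt: number, baseMs: number = 1000, capMs: number = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Call doRequest, and while it keeps answering 429, sleep and try again,
// giving up after maxRetries retries and returning the last response.
async function requestWithBackoff(
  doRequest: () => Promise<Attempt>,
  maxRetries: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((r) => setTimeout(r, ms)),
): Promise<Attempt> {
  for (let attempt = 0; ; attempt++) {
    const res = await doRequest();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    const waitMs = parseRetryAfterMs(res.retryAfter) ?? backoffDelayMs(attempt);
    await sleep(waitMs);
  }
}
```

Here `doRequest` stands for whatever actually fetches the page. Whether gpt-crawler's config exposes a delay or rate setting is not confirmed in this thread. Crawlee, the library behind `PlaywrightCrawler`, documents crawler options such as `maxRequestsPerMinute` and `maxConcurrency`, so lowering the request rate there may be a simpler route than retrying after the block has already happened.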