Add option to deactivate auto-retire when proxy is blocked

Open rubydev opened this issue 4 years ago • 2 comments

Describe the feature In the Playwright/Puppeteer crawler, when a response has a status code such as 403, the crawler automatically throws Error: Request blocked - received 403 status code. Please add an option to disable this behavior (throwOnBlockedRequest).

Motivation I would like to handle blocked requests myself. For example, I would like to count the number of blocked requests in order to calculate statistics and the blocking ratio.

Constraints No Constraints.

Thanks!

rubydev avatar Nov 23 '21 10:11 rubydev

I think you can do that in the postNavigationHooks. For example, it is possible to solve a blocking challenge, wait for the redirect, and reassign the response in the crawlingContext to the new response from the redirect.

petrpatek avatar Nov 23 '21 13:11 petrpatek
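
A minimal sketch of the hook-based counting idea discussed above. It is written against a hypothetical minimal context type (`HookContext`) instead of Crawlee's real crawling context so that it stays self-contained. The names `ResponseLike`, `HookContext`, `BlockStats`, and `countBlockedHook` are illustrative assumptions, not part of Crawlee. The only thing assumed about the real API is that the hook receives a context whose `response` exposes a `status()` method, as Playwright and Puppeteer responses do.

```typescript
// Hypothetical minimal shape of what a post-navigation hook receives.
interface ResponseLike {
    status(): number;
}

interface HookContext {
    response?: ResponseLike | null;
}

// Illustrative counter for blocked-request statistics (not a Crawlee class).
class BlockStats {
    total = 0;
    blocked = 0;
    private readonly blockedCodes: ReadonlySet<number>;

    constructor(blockedCodes: Iterable<number>) {
        this.blockedCodes = new Set(blockedCodes);
    }

    // Count one response and note whether its status code is in the blocked set.
    record(statusCode: number): void {
        this.total += 1;
        if (this.blockedCodes.has(statusCode)) {
            this.blocked += 1;
        }
    }

    // Fraction of recorded responses that were blocked; 0 when nothing was recorded.
    get ratio(): number {
        return this.total === 0 ? 0 : this.blocked / this.total;
    }
}

// The status codes listed here are an assumption chosen for the example.
const stats = new BlockStats([401, 403, 429]);

// Hook body: record the status code of the response produced by navigation, if any.
function countBlockedHook({ response }: HookContext): void {
    if (response) {
        stats.record(response.status());
    }
}
```

In a real crawler this function would presumably be passed as an entry of `postNavigationHooks`. Whether it runs before the crawler's own blocked-status check is worth verifying against the Crawlee version in use.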

That makes sense, thank you!

rubydev avatar Nov 30 '21 14:11 rubydev

Closing, as the blocked status codes are now configurable:

https://crawlee.dev/api/core/interface/SessionPoolOptions#blockedStatusCodes

B4nan avatar Jul 17 '23 15:07 B4nan
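
For reference, a sketch of how the `blockedStatusCodes` option from the linked `SessionPoolOptions` documentation might be set. The assumption here, worth checking against the installed Crawlee version, is that an empty list means no status code is treated as blocked, so a 403 response would reach the request handler instead of raising the "Request blocked" error. The object is typed with a local stand-in interface (`SessionPoolOptionsSketch`, a hypothetical name) to keep the fragment self-contained.

```typescript
// Local stand-in for the relevant part of Crawlee's SessionPoolOptions.
interface SessionPoolOptionsSketch {
    blockedStatusCodes?: number[];
}

// Assumption: an empty list disables the automatic "Request blocked" error.
const sessionPoolOptions: SessionPoolOptionsSketch = {
    blockedStatusCodes: [],
};

// Intended to be passed to the crawler constructor, for example:
//   new PlaywrightCrawler({ sessionPoolOptions, requestHandler });
```

With the automatic check relaxed this way, handling and counting of blocked responses becomes the responsibility of the request handler or the hooks.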