Unify timeouts throughout our classes
PuppeteerCrawler
- `gotoFunction` has a constant timeout inside, which can only be changed by overriding the function.
- `handlePageFunction` has its own timeout.
CheerioCrawler
- `prepareRequestFunction` does not have a timeout.
- `handlePageFunction` has its own timeout.
BasicCrawler
- `handleRequestFunction` has its own timeout. When using the Puppeteer or Cheerio crawler, the timeout is set to a multiple of their `handlePageFunction` timeout.
- `handleFailedRequestFunction` does not have a timeout.
AutoscaledPool
- has no timeouts.
PuppeteerPool
- has `puppeteerOperationsTimeoutSecs` for Puppeteer-related operations.
It's a mess.
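
One possible direction, sketched only to make the discussion concrete: wrap every user-supplied function in the same timeout helper, so each class exposes a `...TimeoutSecs` option with identical semantics. The `withTimeout` name, its signature, and the error message are assumptions made up for this illustration; none of them are existing SDK APIs.

```javascript
// Hypothetical helper, NOT part of the SDK. It returns a wrapped version of
// `fn` that rejects if `fn` has not settled within `timeoutSecs` seconds.
function withTimeout(fn, timeoutSecs, label) {
    return async (...args) => {
        let timer;
        const timeout = new Promise((_, reject) => {
            timer = setTimeout(
                () => reject(new Error(`${label} timed out after ${timeoutSecs} seconds.`)),
                timeoutSecs * 1000,
            );
        });
        try {
            // Whichever promise settles first wins. Note that the wrapped
            // function is not cancelled; its result is simply ignored.
            return await Promise.race([fn(...args), timeout]);
        } finally {
            // Always clear the timer so it cannot keep the process alive.
            clearTimeout(timer);
        }
    };
}

// Usage sketch: the same wrapper could guard any of the functions listed above.
const handlePageFunction = withTimeout(
    async ({ request }) => `processed ${request.url}`,
    60,
    'handlePageFunction',
);
```

With a single helper like this, functions that currently have no timeout (`prepareRequestFunction`, `handleFailedRequestFunction`) would get one the same way as those that already do.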