
Option to keep previous records when a JS page times out

ArthurFlag opened this issue 3 years ago · 4 comments

Describe the problem

Currently, the crawler regularly fails on one of my docs pages because it doesn't load fast enough. When this happens, the crawler reports a failure and seems to delete the records extracted during the last successful crawl, meaning my page is not indexed at all.

Describe the solution

I'd like to be able to specify what should happen in case of a timeout, for example with an option like deleteRecordsOnFailure: false.
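
To make the request concrete, here's a rough sketch of how such an option could look in an Algolia Crawler configuration. Note that `deleteRecordsOnFailure` is purely hypothetical (it's the option being requested here), and the app ID, API key, start URL, and actions are placeholders:

```js
new Crawler({
  appId: 'YOUR_APP_ID',            // placeholder
  apiKey: 'YOUR_CRAWLER_API_KEY',  // placeholder
  startUrls: ['https://docs.example.com'],
  renderJavaScript: true,

  // Hypothetical option (this feature request): when a page times out,
  // keep the records from the last successful crawl instead of
  // deleting them.
  deleteRecordsOnFailure: false,

  actions: [
    // ... existing DocSearch extraction actions ...
  ],
});
```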

ArthurFlag avatar Feb 06 '23 12:02 ArthurFlag

Any update on this possibility? I have to restart my crawl manually for this page daily 😬

ArthurFlag avatar Feb 24 '23 12:02 ArthurFlag

Hey @ArthurFlageul, let me check to see if there's an option that can help here.

shaneafsar avatar Feb 24 '23 21:02 shaneafsar

Hiya @shaneafsar, any update on this? 😊

ArthurFlag avatar Mar 01 '23 11:03 ArthurFlag

Hey @ArthurFlageul, thanks for the reminder. A couple of comments:

  • Setting maxLostRecordsPercentage to 0 would block the crawl if any page fails. However, you'd then have to unblock the crawler manually, and the same check would also trip when you remove content on purpose.
  • To improve load times, could you disable JS rendering on the crawler via renderJavaScript, if you haven't tried that already? (Both settings are sketched in the config example below.)
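
For reference, here's a minimal sketch of where those two settings live in an Algolia Crawler configuration. This is only an illustration, not a drop-in config: the app ID, API key, start URL, and actions are placeholders, and the exact shape of the safety-check setting should be double-checked against the Crawler configuration docs.

```js
new Crawler({
  appId: 'YOUR_APP_ID',            // placeholder
  apiKey: 'YOUR_CRAWLER_API_KEY',  // placeholder
  startUrls: ['https://docs.example.com'],

  // Fetch pages as plain HTML instead of rendering them in a headless
  // browser, which avoids the JS-rendering timeout on slow pages.
  renderJavaScript: false,

  actions: [
    // ... existing DocSearch extraction actions ...
  ],

  // Refuse to publish the new index if any records would be lost
  // (0 = no tolerated loss). The crawler then stays blocked until it
  // is unblocked manually, even when content was removed on purpose.
  safetyChecks: {
    beforeIndexPublishing: {
      maxLostRecordsPercentage: 0,
    },
  },
});
```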

shaneafsar avatar Mar 01 '23 16:03 shaneafsar

Closing this issue.

randombeeper avatar Jul 10 '24 22:07 randombeeper