box-node-sdk icon indicating copy to clipboard operation
box-node-sdk copied to clipboard

econnreset failure is not handled by SDK retry logic

Open lukesiler opened this issue 5 years ago • 6 comments

  • [ x] I have checked that the SDK documentation doesn't solve my issue.
  • [ x] I have checked that the API documentation doesn't solve my issue.
  • [ x] I have searched the Box Developer Forums and my issue isn't already reported (or if it has been reported, I have attached a link to it, for reference).
  • [ x] I have searched Issues in this repo and my issue isn't already reported.

Description of the Issue

We use a few files and folders API's. Each of them can fail with ECONNRESET when a TCP connection is lost or reset. The built-in retry logic does not handle these communication failures so we have to write our own retry logic which works fine.

Steps to Reproduce

  1. Run the following API calls consistently over several days
  • files.get
  • files.getReadStream
  • folders.getItems

Expected Behavior

Built-in request retry logic should catch and retry ECONNRESET

if (boxErr && boxErr.code && boxErr.code === 'ECONNRESET') { // retry request }

Error Message, Including Stack Trace

{ code: "ECONNRESET" message: "socket hang up" name: "Error" stack: "Error: socket hang up at connResetException (internal/errors.js:570:14) at TLSSocket.socketOnEnd (_http_client.js:440:23) at TLSSocket.emit (events.js:215:7) at endReadableNT (_stream_readable.js:1184:12) at processTicksAndRejections (internal/process/task_queues.js:80:21)" }

Versions Used

Node SDK: 1.32.0

lukesiler avatar Jun 10 '20 15:06 lukesiler

Hi @lukesiler ,

Thanks for submitting this Issue! We will take a look and get back to you ASAP!

You mentioned that you have already implemented your own retry logic? I assume that's in a fork of this SDK? Maybe you could submit a formal PR with that change and it would expedite the fix for this since we could comment on and/or approve it directly.

Thanks,

@PJSimon

PJSimon avatar Jun 10 '20 16:06 PJSimon

hi @PJSimon

My retry logic is in our application code that handles the error returned and makes the box-node-sdk call again which is successful. I did not see an obvious place in box-node-sdk request code flow to introduce this likely because I'm not very familiar with that code.

thanks, @lukesiler

lukesiler avatar Jun 10 '20 16:06 lukesiler

OK, sounds good, @lukesiler ! Thanks for that info. We'll get back to you on this.

PJSimon avatar Jun 10 '20 20:06 PJSimon

@PJSimon - another interesting note here. I've been chasing an intermittent bug in files.getReadStream where it sometimes just stalls out. This turns out to be caused by an ECONNRESET that occasionally happens.

err: { code: "ECONNRESET" message: "socket hang up" name: "Error" stack: "Error: socket hang up at connResetException (internal/errors.js:570:14) at TLSSocket.socketOnEnd (_http_client.js:440:23) at TLSSocket.emit (events.js:215:7) at endReadableNT (_stream_readable.js:1184:12) at processTicksAndRejections (internal/process/task_queues.js:80:21)" }

It would be great to get a fix that transparently retries in these cases.

lukesiler avatar Jun 17 '20 00:06 lukesiler

hi @PJSimon,

I also found that 503 error is not being handled by built-in retry code.

err: { message: "503 - Service Unavailable" name: "Error" stack: "Error: 503 - Service Unavailable at APIRequest._handleResponse (/opt/pryon/contapp/node_modules/box-node-sdk/lib/api-request.js:234:9) at Request.self.callback (/opt/pryon/contapp/node_modules/request/request.js:185:22) at Request.emit (events.js:210:5) at Request.<anonymous> (/opt/pryon/contapp/node_modules/request/request.js:1161:10) at Request.emit (events.js:210:5) at IncomingMessage.<anonymous> (/opt/pryon/contapp/node_modules/request/request.js:1083:12) at Object.onceWrapper (events.js:299:28) at IncomingMessage.emit (events.js:215:7) at endReadableNT (_stream_readable.js:1184:12) at processTicksAndRejections (internal/process/task_queues.js:80:21)" }

It would be great if sdk retry handled both ECONNRESET and 503.

thanks, @lukesiler

lukesiler avatar Jun 28 '20 16:06 lukesiler

Anything update on this?

mattcobb avatar Aug 25 '21 21:08 mattcobb

This issue has been automatically marked as stale because it has not been updated in the last 30 days. It will be closed if no further activity occurs within the next 7 days. Feel free to reach out or mention Box SDK team member for further help and resources if they are needed.

stale[bot] avatar Dec 19 '22 19:12 stale[bot]

This issue has been automatically closed due to maximum period of being stale. Thank you for your contribution to Box Node SDK and feel free to open another PR/issue at any time.

stale[bot] avatar Dec 26 '22 20:12 stale[bot]