
Retry network errors

Open pakrym opened this issue 4 years ago • 14 comments

In the .NET repo we run a lot of instances of autorest in parallel.

Sometimes they fail with:

EXEC : error : read ECONNRESET
      at TLSWrap.onStreamRead (internal/stream_base_commons.js:209:20)

We should retry errors like these as they are often transient.

pakrym avatar Oct 04 '21 15:10 pakrym
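
For illustration only, the retry suggested above could look roughly like the sketch below. It is not AutoRest's implementation: the helper names (`isTransientNetworkError`, `backoffDelayMs`, `withRetry`), the list of error codes, and the delay values are all assumptions chosen for the example.

```typescript
// Error codes treated as transient here. This list is an assumption made for
// illustration; it is not taken from AutoRest.
const TRANSIENT_CODES: string[] = ["ECONNRESET", "ETIMEDOUT", "EPIPE", "EAI_AGAIN"];

// Returns true when the thrown value carries one of the codes listed above.
function isTransientNetworkError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TRANSIENT_CODES.indexOf(code) !== -1;
}

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 200, maxMs: number = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
}

// Runs `operation`, retrying only transient failures, up to `maxAttempts` tries.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
  baseMs: number = 200
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on the last attempt or on errors that do not look transient.
      if (attempt >= maxAttempts || !isTransientNetworkError(err)) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt, baseMs)));
    }
  }
}
```

A caller would wrap whichever network operation is failing, for example `withRetry(() => download(url))`, where `download` is a hypothetical function. Non-transient errors are rethrown immediately so that real failures are not hidden behind retries.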

Sample failure:

https://dev.azure.com/azure-sdk/internal/_build/results?buildId=1125565&view=logs&j=b70e5e73-bbb6-5567-0939-8415943fadb9&t=a880e989-7d1a-5c96-a41f-d540b383cc43

pakrym avatar Oct 04 '21 15:10 pakrym

Could this be the underlying issue? https://github.com/nodejs/node/issues/35824 It seems to have been a bug in Node 14 that was resolved in Node 15+. I'm also not sure what code is the source of this: AutoRest doesn't use this directly, and it seems like it would be caused by a "server" connection, not a client.

timotheeguerin avatar Oct 04 '21 15:10 timotheeguerin

Hm, maybe.

cc @AlexanderSher we might need to update node across the board.

pakrym avatar Oct 04 '21 15:10 pakrym

Node 16 will become the new LTS version Oct 26th https://nodejs.dev/download

timotheeguerin avatar Oct 04 '21 16:10 timotheeguerin

Is it supported by autorest?

pakrym avatar Oct 04 '21 16:10 pakrym

Yeah, the integration tests are running against it.

timotheeguerin avatar Oct 04 '21 16:10 timotheeguerin

We might want to remove https://github.com/Azure/autorest/blob/6f0751d1f83617375d212411c5674d670bb9ffa0/packages/apps/autorest/entrypoints/app.js#L40 then

pakrym avatar Oct 04 '21 19:10 pakrym

Updating to Node 16 didn't help.

  
  AutoRest code generation utility [cli version: 3.4.1; node: v16.13.0]
  (C) 2018 Microsoft Corporation.
  https://aka.ms/autorest
  (node:15376) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
  (Use `node --trace-deprecation ...` to show where the warning was created)
  info    | AutoRest core version selected from configuration: 3.6.6.
EXEC : error : read ECONNRESET [/mnt/vss/_work/1/s/sdk/network/Azure.ResourceManager.Network/src/Azure.ResourceManager.Network.csproj]
      at TLSWrap.onStreamRead (node:internal/stream_base_commons:220:20)

pakrym avatar Nov 20 '21 07:11 pakrym

This makes .NET builds very flaky. I had a pipeline fail with this error twice in a row just now.

pakrym avatar Nov 22 '21 17:11 pakrym

Did you ever repro this locally?

timotheeguerin avatar Nov 22 '21 19:11 timotheeguerin

Nope, wasn't able to. The only time I see it is in a core PR that generates a lot of libraries in parallel.

pakrym avatar Nov 22 '21 19:11 pakrym

I really don't know where this could be happening; when reading files it already catches errors and retries. Any chance you could:

  1. update autorest cli to latest (3.5.1)
  2. separate the output of the parallel runs into different files, so we can get an idea of when it is happening
  3. see if you can add --debug --verbose

timotheeguerin avatar Nov 22 '21 19:11 timotheeguerin

Looking at builds it seems to always fail during the ResourceManager.Network generation.

@AlexanderSher can we pick up the latest Autorest?

pakrym avatar Nov 22 '21 21:11 pakrym

Sorry to reopen this old issue, but it is not fixed, and as we scale across the board we are hitting it very often in all languages that run autorest codegen. We need to get to the bottom of this and ensure we are correctly retrying or fixing any potential race conditions that are causing it.

weshaggard avatar Feb 10 '23 16:02 weshaggard