
fetch fail: cause: AggregateError [ETIMEDOUT]:

Open guotie opened this issue 1 year ago • 31 comments

Bug Description

My code is TypeScript. I run it in two ways:

  1. with Bun
  2. compiled to JS and run with Node v21

When I run it with Bun, no error occurs; when I run it with Node, "fetch failed" occurs extremely frequently.

The request is an Apollo GraphQL request. The proxy is a local HTTP proxy:

https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890 all_proxy=socks5://127.0.0.1:7890

Logs & Screenshots

ApolloError: fetch failed
    at new ApolloError (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/errors/errors.cjs:33:28)
    at /Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/core/core.cjs:2017:78
    at both (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1347:31)
    at /Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1338:72
    at new Promise (<anonymous>)
    at Object.then (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1338:24)
    at Object.error (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1349:49)
    at notifySubscription (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/zen-observable/lib/Observable.js:140:18)
    at onNotify (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/zen-observable/lib/Observable.js:179:3)
    at SubscriptionObserver.error (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/zen-observable/lib/Observable.js:240:7) {
  graphQLErrors: [],
  protocolErrors: [],
  clientErrors: [],
  networkError: TypeError: fetch failed
      at node:internal/deps/undici/undici:12442:11
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at runNextTicks (node:internal/process/task_queues:64:3)
      at listOnTimeout (node:internal/timers:540:9)
      at process.processTimers (node:internal/timers:514:7) {
    cause: AggregateError [ETIMEDOUT]: 
        at internalConnectMultiple (node:net:1116:18)
        at internalConnectMultiple (node:net:1184:5)
        at Timeout.internalConnectMultipleTimeout (node:net:1707:5)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      code: 'ETIMEDOUT',
      [errors]: [Array]
    }
  },
  extraInfo: undefined
}

Environment

Mac M1 Sonoma 14.2

Nodejs v21

Additional context

guotie avatar Feb 18 '24 05:02 guotie

Thanks for reporting!

Can you provide steps to reproduce? We often need a reproducible example, e.g. some code that allows someone else to recreate your problem by just copying and pasting it. If it involves more than a couple of different files, create a new repository on GitHub and add a link to that.

mcollina avatar Feb 18 '24 08:02 mcollina

repo is here: https://github.com/guotie/fetch-failed

I think the problem is timeout.

When the network is OK, it rarely throws fetch failed; when the network is busy, the error occurs more frequently.

guotie avatar Feb 18 '24 12:02 guotie

Facing a similar issue

smisra3 avatar Mar 14 '24 07:03 smisra3

Same issue. Does anyone know the solution?

ahmadxgani avatar Apr 08 '24 13:04 ahmadxgani

If internet connectivity were the issue, why does curl always work and never time out?

ahmadxgani avatar Apr 08 '24 13:04 ahmadxgani

It seems this error only appears in certain environments or on certain devices? It's hard to reproduce, but I think the bug really exists.

Someone has already described the same issue here too: https://github.com/nodejs/undici/issues/2990

realyukii avatar Apr 08 '24 14:04 realyukii

Or with certain hosts? I faced this issue when trying to fetch the Telegram API.

ahmadxgani avatar Apr 08 '24 14:04 ahmadxgani

Here's the endpoint that I am trying to fetch:

curl "https://api.telegram.org/bot7003873933:AAFKl0LwWViMJIA34-qjbTh7nZwcNQr2hFs/getFile?file_id=CAACAgEAAxUAAWYT6IXJGTzY4S96PCbyqyO7fBXXAAIJEgACkweVC56njKMcTovTNAQ"

ahmadxgani avatar Apr 08 '24 14:04 ahmadxgani

Can you provide a Minimum Reproducible Example so we can support you better?

metcoder95 avatar Apr 09 '24 07:04 metcoder95

I've provided the Minimum Reproducible Example on my Gist, as you suggested.

If you have a strong or reliable internet connection, consider simulating slow connectivity to see if the error replicates. After all, ETIMEDOUT errors are more likely to occur under limited bandwidth conditions.

Interestingly, while fetch sometimes throws this error, curl seems to be able to avoid it in this scenario.

realyukii avatar Apr 10 '24 01:04 realyukii

Hmm, this does not seem like an undici error per se but rather a different way of handling connect timeouts.

The errors shown by the example, and the root of the issue, point mostly to the initial TCP connection (including TLS), meaning that undici timed out before the initial connect operation could finish.

You can attempt to extend the overall timeout by creating a custom Agent (see https://undici.nodejs.org/#/docs/api/Client?id=parameter-clientoptions) and test whether that solves the timeout issue, which seems directly related to the network conditions.

You can also wrap it with the RetryAgent to automatically retry on these errors.
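
For readers who cannot install undici, the retry idea can be approximated with a plain loop around fetch. This is a minimal sketch, not undici's RetryAgent: the helper name fetchWithRetry, the retry count, the delay, and the set of retryable codes (taken from the errors shown in this thread) are all assumptions made for illustration.

```javascript
// Hypothetical helper, not part of undici or Node: retry a fetch call when
// the failure carries a connect-level error code on err.cause.
const RETRYABLE_CODES = new Set(['ETIMEDOUT', 'ENETUNREACH']);

async function fetchWithRetry(url, options = {}, settings = {}) {
  // fetchImpl is injectable so the helper can be exercised without a network.
  const { retries = 3, delayMs = 500, fetchImpl = fetch } = settings;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fetchImpl(url, options);
    } catch (err) {
      lastError = err;
      const code = err && err.cause && err.cause.code;
      // Stop on errors outside the retryable set, or when attempts run out.
      if (!RETRYABLE_CODES.has(code) || attempt === retries) break;
      // Linear backoff: wait a little longer before each new attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
    }
  }
  throw lastError;
}
```

Retrying only hides the symptom; if every attempt hits the same connect timeout, the call still fails after the last retry.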

metcoder95 avatar Apr 10 '24 08:04 metcoder95

in the docs it says:

bodyTimeout -  Defaults to 300 seconds.
headersTimeout -  Defaults to 300 seconds.

Why does it close the connection so early if the default is 300 seconds? Does Node's fetch use undici differently internally?

By the way, an unrelated question: why can't I access undici directly (require('undici')) if it is used in Node's fetch implementation? Is there a way to expose it, so I can use the RetryAgent class without installing undici as a dependency?

realyukii avatar Apr 10 '24 09:04 realyukii

The timeouts you mention are applied at the HTTP level, whereas the timeout I'm referring to is linked to the TCP handshake, which by default is 10s (lower than the body and headers timeouts).

Sadly no, you'll need to install undici to make use of the RetryAgent.

metcoder95 avatar Apr 10 '24 19:04 metcoder95

Thanks for your assistance! It's confirmed that increasing the connection timeout resolved the issue in my scenario ^^

realyukii avatar Apr 11 '24 10:04 realyukii

I am able to reliably repro it when fetching too many urls at once. https://github.com/nodejs/node-core-utils/issues/810

KhafraDev avatar May 14 '24 17:05 KhafraDev

I've been working on the same problem for 1 day. Some comments talk about connection, but I don't think that's the case, because I have a very good connection, and it's not a device problem.

Here's the solution:

Check your DNS configuration or clear your DNS cache; make sure your router is serving a real DNS server, or change your DNS settings to target a real DNS server.

The observation is that on my online server the code doesn't throw this AggregateError [ETIMEDOUT] error, but on my local machine it always throws this error only when I'm at home on my local network.

Looking further, the main errors thrown are ETIMEDOUT and ENETUNREACH. This usually occurs when the DNS module is unable to resolve the IP address.

node:internal/deps/undici/undici:11754
    Error.captureStackTrace(err, this);
          ^

TypeError: fetch failed
    at node:internal/deps/undici/undici:11754:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: AggregateError [ETIMEDOUT]: 
      at internalConnectMultiple (node:net:1114:18)
      at internalConnectMultiple (node:net:1177:5)
      at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
      at listOnTimeout (node:internal/timers:575:11)
      at process.processTimers (node:internal/timers:514:7) {
    code: 'ETIMEDOUT',
    [errors]: [
      Error: connect ETIMEDOUT <IP:PORT>
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '3.125.177.232',
        port: 443
      },
      Error: connect ENETUNREACH <IP6:PORT> - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '64:ff9b::37d:b1e8',
        port: 443
      }
    ]
  }
}
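
The log above shows that the useful detail sits two levels down: the TypeError has a cause, and that AggregateError holds one error per attempted address. A minimal sketch of how to print those attempts from a catch block follows; the helper name describeFetchFailure and its output format are assumptions, not something Node or undici provides.

```javascript
// Hypothetical helper: flatten the nested causes of a "fetch failed" error
// into one short line per connection attempt.
function describeFetchFailure(err) {
  const cause = err && err.cause;
  // No cause attached: fall back to the top-level message.
  if (!cause) return [String(err && err.message)];
  // An AggregateError lists one error per attempted address in `errors`.
  const attempts = Array.isArray(cause.errors) ? cause.errors : [cause];
  return attempts.map((attempt) => {
    const target = attempt.address ? ` ${attempt.address} port ${attempt.port}` : '';
    return `${attempt.code || attempt.name}${target}`;
  });
}
```

Called inside a catch around fetch, for example console.error(describeFetchFailure(err).join('\n')), it shows at a glance which address family timed out and which was unreachable.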

If a request succeeded once, it seems that the IP addresses in the error log can be cached ones stored on your machine earlier.

My hypothesis is that when working under a network or router without a real DNS server, Node's dns module is no longer able to resolve the IP address, doesn't use the cached one, and throws errors.

I'll look into this further, but you have the solution. I changed my DNS configuration in /etc/resolv.conf to 8.8.8.8 or 1.1.1.1, and everything's back to normal!

su-angel avatar Jul 25 '24 01:07 su-angel

I have encountered the same error when experimental HTTP/2 support is enabled with massive parallel requests (more than 32 at once, toward about 20 different domains). With HTTP/2 support disabled the error is gone.

SukkaW avatar Oct 10 '24 09:10 SukkaW

Maybe this will also be solved by #3707 by @metcoder95

Uzlopak avatar Oct 10 '24 09:10 Uzlopak

@guotie @Uzlopak I have run into the same issue - only to recall that I have already resolved it, but did not remember the code.

https://github.com/nodejs/undici/issues/2990

On some networks (e.g. mine today, on an LTE tethered connection in a country far away from my database provider), the default network-family-autoselection-attempt-timeout of 250 ms to resolve the domain IP might not be enough, which results in ETIMEDOUT.

Doubling that time solves all issues for me: export NODE_OPTIONS="--network-family-autoselection-attempt-timeout=500"
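
The same default can also be raised from inside the program through the node:net module, which helps when NODE_OPTIONS cannot be set. Node's net documentation describes this value as the time to wait for one connection attempt before trying the next address. A minimal CommonJS sketch, assuming a Node version that ships these functions and assuming the bundled fetch honours the process-wide default the way it honours the flag:

```javascript
// CommonJS sketch: raise the per-attempt timeout before any request is made.
const net = require('node:net');

// 500 mirrors the value used with NODE_OPTIONS; tune it for your network.
net.setDefaultAutoSelectFamilyAttemptTimeout(500);

// Read the value back to confirm what new sockets will use by default.
console.log('attempt timeout (ms):', net.getDefaultAutoSelectFamilyAttemptTimeout());
```

The call has to run before the first connection is opened, so it belongs at the very top of the entry file.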

gregonarash avatar Jan 01 '25 09:01 gregonarash

Is this still reproducible, or can it be closed now that #3707 has landed?

metcoder95 avatar Jan 14 '25 07:01 metcoder95

@metcoder95 yes it is:

Image

Tested on "undici": "^7.2.1"

import { request } from "undici";

const { statusCode, headers, trailers, body } = await request("https://airtable.com");

console.log("response received", statusCode);
console.log("headers", headers);

for await (const data of body) {
  console.log("data", data);
}

console.log("trailers", trailers);

To be able to reproduce you would need some combination of:

  • have a really high latency network
  • be physically far away from the US (or from the DNS server?)
  • be trying to fetch a URL that resolves to both IPv4 and IPv6

OR ... cut down the default network family selection time to 25 ms

export NODE_OPTIONS='--network-family-autoselection-attempt-timeout=25'
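
To check the third condition for a given host, the addresses returned by dns.lookup can be inspected. This is a minimal sketch; the helper names isDualStack and checkHost are assumptions made for illustration.

```javascript
const dns = require('node:dns');

// Hypothetical helper: true when a lookup result holds both address families.
function isDualStack(addresses) {
  const families = new Set(addresses.map((entry) => entry.family));
  return families.has(4) && families.has(6);
}

// Hypothetical helper: resolve every address the OS resolver reports for a host.
async function checkHost(hostname) {
  const addresses = await dns.promises.lookup(hostname, { all: true });
  return { hostname, addresses, dualStack: isDualStack(addresses) };
}
```

Running checkHost('airtable.com').then(console.log) on the affected machine would show whether both families are returned there.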

I believe this is purely related to Node's default of 250 ms, which is too short on some networks. There was some debate about whether it is too short, but the issue was closed without changes: https://github.com/nodejs/node/issues/54359

gregonarash avatar Jan 14 '25 09:01 gregonarash

I'd rather say that it is not really an undici or Node.js problem, as it can be sorted out by extending the timeout for the DNS resolution; it might be worth documenting, though.

metcoder95 avatar Jan 14 '25 22:01 metcoder95

@metcoder95 true. It could be debated whether this is the right default in Node.js (under the same conditions curl works and fetch does not), but it is not really a bug.

Either way, I expect issues will continue to be opened in upstream repositories like undici and nextjs for fetch fail - AggregateError [ETIMEDOUT]. I replied to a couple of threads like that to boost visibility. I have also opened a pull request to add a comment about this to the docs, but it seems to have been removed recently.

gregonarash avatar Jan 15 '25 06:01 gregonarash

Do you have the PR at hand? Why was it removed?

If seeking to extend the timeout, I'd suggest opening an issue in Node.js with the reference to the other issues you've opened and the feedback you've got to see where it lands.

For this issue, having the comment added into the documentation about high-latency networks should be ok.

metcoder95 avatar Jan 15 '25 08:01 metcoder95

@metcoder95
I thought it was merged, but not sure what happened: https://github.com/nodejs/undici/pull/3738

gregonarash avatar Jan 15 '25 09:01 gregonarash

can you just address the recommendation there?

metcoder95 avatar Jan 15 '25 09:01 metcoder95

It is addressed, the PR was also approved by you. Not sure how your merging process works.

gregonarash avatar Jan 15 '25 09:01 gregonarash

In my case, the issue was that fetch could not resolve the IPv6 address, even though curl and wget always worked. Disabling IPv6 resolved the issue for me: sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1

nodated avatar Feb 25 '25 01:02 nodated

@nodated

The cause of most of these errors is Node's default --enable-network-family-autoselection with the default 250 ms attempt timeout being too short and causing a timeout. Network family autoselection decides, based on network conditions, whether the request should go to the IPv4 or the IPv6 address.

Disabling IPv6 will indeed keep the above from kicking in.
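
A process-level alternative to the system-wide sysctl is to switch autoselection off inside Node itself. The sketch below uses functions from node:net and node:dns; whether it removes the errors on a given network is an assumption, not something confirmed in this thread.

```javascript
// CommonJS sketch: prefer IPv4 for this process only, leaving the OS untouched.
const net = require('node:net');
const dns = require('node:dns');

// Turn off network family autoselection for sockets created by this process.
net.setDefaultAutoSelectFamily(false);

// Ask dns.lookup to list IPv4 addresses before IPv6 ones.
dns.setDefaultResultOrder('ipv4first');
```

Unlike the sysctl, this leaves IPv6 available to every other program on the machine.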

gregonarash avatar Feb 25 '25 08:02 gregonarash

export NODE_OPTIONS="--network-family-autoselection-attempt-timeout=500" worked for me. I have no idea what that option means, but I assume it's related to how far away the DNS lookup is

RnbWd avatar Mar 31 '25 09:03 RnbWd