Undici's Fetch's Body takes a long time to load compared to just request
Bug Description
I've noticed a bug where, if the body of a fetch response is large (say, above 50 MB), reading the body contents takes a very long time. I compared undici's fetch against undici's request function and node-fetch: for a 130 MB file, node-fetch took about 8.5 to 9.4 seconds, undici's request took 2.3 to 2.6 seconds, and undici's fetch took 95.5 to 104.2 seconds.
Reproducible By
undici fetch, 95.5 seconds, 18,605 buffers
```js
await (async () => {
  const { fetch } = require('undici');
  const response = await fetch('https://alpha.notsobot.com/api/undici/test');

  let now = Date.now();
  const buffers = [];
  for await (const chunk of response.body) {
    buffers.push(chunk);
  }
  console.log(Date.now() - now);

  now = Date.now();
  JSON.parse(Buffer.concat(buffers));
  console.log(Date.now() - now);
})();
```
undici request, 2.6 seconds, 9,794 buffers
```js
await (async () => {
  const { request } = require('undici');
  const response = await request('https://alpha.notsobot.com/api/undici/test');

  let now = Date.now();
  const buffers = [];
  for await (const chunk of response.body) {
    buffers.push(chunk);
  }
  console.log(Date.now() - now);

  now = Date.now();
  JSON.parse(Buffer.concat(buffers));
  console.log(Date.now() - now);
})();
```
node-fetch fetch, 9.4 seconds, 28,909 buffers
```js
await (async () => {
  const fetch = require('node-fetch');
  const response = await fetch('https://alpha.notsobot.com/api/undici/test');

  let now = Date.now();
  const buffers = [];
  for await (const chunk of response.body) {
    buffers.push(chunk);
  }
  console.log(Date.now() - now);

  now = Date.now();
  JSON.parse(Buffer.concat(buffers));
  console.log(Date.now() - now);
})();
```
(Also, using .arrayBuffer() instead yields the same results.)
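The web-stream consumption path can also be timed without any network at all, which helps separate stream overhead from download time. A minimal sketch, assuming Node 18+ where Response is global (backed by undici); the 50 MB payload is an arbitrary stand-in for the real file:

```js
// Sketch: time consuming a large body through the web-stream path, no network involved.
// Assumes Node 18+ (global Response). The payload size is an arbitrary stand-in.
const payload = Buffer.alloc(50 * 1024 * 1024, 'a');

(async () => {
  const response = new Response(payload);
  const start = Date.now();
  const body = await response.arrayBuffer(); // same consumption path as fetch().arrayBuffer()
  console.log(`arrayBuffer took ${Date.now() - start} ms for ${body.byteLength} bytes`);
})();
```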
Expected Behavior
Similar timing to undici's request function
Environment

Ubuntu 22.10, Node v19.8.1 and v18.7.0, undici v5.2.0 and v5.21.0
Additional context
This is unfortunately an issue with using web streams vs. Node streams. Node streams currently perform much better than web streams.
If we don't decompress the response, it takes 4.4 seconds; if we do (using zlib.createBrotliDecompress()), it takes ~70 seconds.
... and if we re-assign this.body to the result of pipeline, it fixes the issue, but causes issues with invalid gzipped/brotli compressed bodies.
@ronag you know much more about streams than I do, do you have any idea?
> ... and if we re-assign this.body to the result of pipeline, it fixes the issue, but causes issues with invalid gzipped/brotli compressed bodies.
Not sure I follow.
So for example, this diff fixes the issue:
```diff
diff --git a/lib/fetch/index.js b/lib/fetch/index.js
index 0b2e3394..14e84b29 100644
--- a/lib/fetch/index.js
+++ b/lib/fetch/index.js
@@ -2023,7 +2023,7 @@ async function httpNetworkFetch (
         status,
         statusText,
         headersList: headers[kHeadersList],
-        body: decoders.length
+        body: this.body = decoders.length
           ? pipeline(this.body, ...decoders, () => { })
           : this.body.on('error', () => {})
       })
```
but then the zlib/brotli tests fail (I only ran the node-fetch tests, but it's safe to assume others will fail too)
I think it just removes the decompression...?
Yeah, it does, but what I don't understand is why it causes an issue here and not with node-fetch. node-fetch uses pipeline and zlib too.
Webstreams...
I thought so too (made an issue in the performance repo), but considering that removing the decompression fixes the issue...?
Can be that web streams have smaller chunks which makes the decompression much less efficient... or something
I don't think this is something we can solve in undici.
> web streams have smaller chunks
No, I don't think so? In the OP, node-fetch produced about 10k more chunks than undici's fetch did. Adjusting the highWaterMark/size didn't make much difference, if I'm remembering correctly.
🤷‍♂️
@cakedan do you have a repro that runs locally, without the external server? I can't seem to replicate the issue