useQueries with large datasets is crashing browsers
Describe the bug
In a large React project we are using useQueries to fetch relatively large amounts of data that is deserialized with batshit.
This causes problems in useQueries when requesting more than 20k entities (each entity is represented by a GUID); this has come up in a discussion here: https://github.com/TanStack/query/discussions/6305. Requests of around 30k and larger are especially troublesome.
Note that the codesandbox example is running with query only, without any other plugins / helper libraries
Your minimal, reproducible example
https://codesandbox.io/p/sandbox/react-query-large-queries-set-c5244l
Steps to reproduce
Change the number of requested ids in App.tsx on line 8 (currently 20k ids)
Expected behavior
Would expect that the browser does not crash.
How often does this bug happen?
Every time
Screenshots or Videos
No response
Platform
Chrome and Firefox, tested on both Windows 11 and Ubuntu 22.04. All have issues with the example.
Tanstack Query adapter
react-query
TanStack Query version
5.7.0
TypeScript version
5.2.2
Additional context
As noted, there is a discussion about this as well, but I chose to raise it as a bug too.
A lot of time (43%) is spent in timeouts. Setting gcTime and staleTime to Infinity avoids some of those, but we also use timeouts for batching. This will become customizable with:
- #6600
so I'd wait for that and then set the batching to requestAnimationFrame or queueMicrotask and see which does better.
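For reference, disabling the stale and garbage-collection timers client-wide looks roughly like this (a configuration sketch; whether these defaults are acceptable depends on your invalidation needs):

```typescript
import { QueryClient } from "@tanstack/react-query";

// Infinity disables the stale/gc timers that otherwise fire per query --
// with 30k queries that is a lot of setTimeout calls saved.
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: Infinity, // never considered stale -> no background refetches
      gcTime: Infinity, // never garbage-collected -> no gc timers
    },
  },
});
```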
Another 17% is commitReconciliationEffects from React; I think this is from rendering the 30k nodes?
20% goes to renderWithHooks, also from React.
So a good amount also goes to rendering: if I remove rendering the 30k divs, it goes from taking 15 seconds to 10 seconds. I'm hoping that the timers will also help a bunch; I'll re-measure once the above-mentioned PR is merged.
It's worth noting that 30k nodes doesn't crash for me, but yeah, I think you've just hit the limit here. useQueries creates one query instance for each id, so that makes 30k queries that all have their own timers and their own overhead, even though it's just one observer.
Not sure if that's possible in your case, but I'm guessing that having one query which fires the requests in parallel with Promise.all will perform a lot better. Yes, you lose the ability to cache things separately, but that's just the tradeoff here.
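A minimal sketch of that suggestion, assuming a hypothetical `fetchEntity(id)` fetcher and `Entity` type: fold the ids into one query whose queryFn fans out with Promise.all, so the cache holds a single entry instead of 30k query instances:

```typescript
type Entity = { id: string; name: string };

// Fan out all requests in parallel, but store the result under one cache entry.
async function fetchAllEntities(
  ids: string[],
  fetchEntity: (id: string) => Promise<Entity>,
): Promise<Entity[]> {
  return Promise.all(ids.map(fetchEntity));
}

// Usage inside a component (sketch):
//   useQuery({
//     queryKey: ["entities", ids],
//     queryFn: () => fetchAllEntities(ids, fetchEntity),
//   });
```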
#6600 is merged, so we can now do notifyManager.setScheduler(fn) and test what is fastest (with 30k queries, no rendering, and both staleTime and gcTime set to Infinity):
- baseline: ~11.0s scripting
- queueMicrotask: ~9.3s
- requestAnimationFrame: ~10.1s
A little gain, but not as much as I'd hoped.
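For reference, swapping the scheduler for those measurements looks roughly like this (a sketch of the API exposed by the merged PR):

```typescript
import { notifyManager } from "@tanstack/react-query";

// Default batching goes through setTimeout; queueMicrotask flushes
// notifications sooner and avoided some of the timer overhead above.
notifyManager.setScheduler(queueMicrotask);

// Or, to align notification flushes with frames:
// notifyManager.setScheduler(requestAnimationFrame);
```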
To answer some of your questions: yes, we are rendering 30k nodes, but only after we have fetched all data with useQueries, since we use the combine feature to return data only when every individual query reports isSuccess. The rendering time is there regardless of how we fetch the data (whether we use query or our old fetching methods, which used a mixture of useState, useEffect, etc.).
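The combine step described above can be sketched as a pure function (an assumption about how that combine is written, not the actual project code):

```typescript
type QueryLike<T> = { data?: T; isSuccess: boolean };

// Only expose data once every individual query has succeeded.
function combineResults<T>(results: QueryLike<T>[]) {
  const isSuccess = results.every((r) => r.isSuccess);
  return {
    isSuccess,
    data: isSuccess ? results.map((r) => r.data as T) : undefined,
  };
}

// Passed to useQueries as: useQueries({ queries, combine: combineResults })
```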
I'm a bit wary of setting staleTime and gcTime to Infinity, as it's a multi-user system and multiple people can update some of the 30k+ records, and we would like to show updated/relevant data when the page loads. But it's something that has to be tested in our environments.
Btw, the queries do end up as one single backend request that returns 30k records, so we have a workaround of just using a single useQuery for the fetching. But, as you mention, we lose the ability to cache and re-use that data elsewhere in the system.
Apologies for the linked PR - I referenced this issue accidentally
You lose the ability to cache separately, but you can still do fine-grained subscriptions to that cache with select. I think having 20k or 30k individual queries just creates too many observers. I have seen 8-10k observers work successfully in large apps, but larger than that is just something we won't support, sorry.
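A sketch of that select approach: keep all records under one query key and let each consumer subscribe to just its slice (the `Entity` type and selector are illustrative assumptions):

```typescript
type Entity = { id: string; name: string };

// A selector picks one record out of the combined result; useQuery's `select`
// option only re-renders the consumer when the selected slice changes.
const selectEntity =
  (id: string) =>
  (entities: Entity[]): Entity | undefined =>
    entities.find((e) => e.id === id);

// Usage inside a component (sketch):
//   useQuery({
//     queryKey: ["entities"],
//     queryFn: fetchAllEntities, // hypothetical fetcher returning Entity[]
//     select: selectEntity(targetId),
//   });
```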