query icon indicating copy to clipboard operation
query copied to clipboard

useQueries have quadratic performance in relation to the number of queries

Open GabbeV opened this issue 1 year ago • 9 comments

Describe the bug

This PR https://github.com/TanStack/query/pull/8295 introduced a significant performance regression for us. A commenter on that PR seem to have had a similar issue https://github.com/TanStack/query/pull/8295#issuecomment-2483173186.

We use the data loader pattern together with useQueries to give individual results from a batch fetch individual cache entries in react-query. Something like this:

const fooLoader = new DataLoader(ids => fetchFoos(ids));
const queries = useQueries({
  queries: ids.map(id => ({
    queryKey: ["foo", id],
    queryFn: () => fooLoader.load(id),
  })),
});

Looking at a performance profile i noticed this will result in one call to #findMatchingObservers in QueriesObserver for every query when the fetch resolve. #findMatchingObservers will then loop over all observers for for all queries internally making the performance quadratic O(n^2).

In our case we were loading way too many ids at once for this to be reasonable so the performance was already bad before the change but became browser freezing bad after the change.

Seems like the problem however should be avoidable. #onUpdate know what observer was updated so should not have to find it. This might however require some reorganization of the code and how observers are stored in QueriesObserver.

The indexOf in #onUpdate is also a source of quadratic performance that should be avoidable however indexOf is a way faster operation than what happens in #findMatchingObservers so it might not be an issue until even more queries are fetched at the same time.

Your minimal, reproducible example

https://codesandbox.io/p/sandbox/boring-https-dxlhnv?file=%2Fsrc%2FApp.js%3A17%2C62

Steps to reproduce

Click on the numeric buttons to try loading different amount of queries and click on the "useQueries" and "multiple useQuery" buttons to switch between using a single useQueries or multiple useQuery and observer the performance difference.

Expected behavior

I expect useQueries to have similar or faster performance than multiple useQuery.

How often does this bug happen?

Every time

Screenshots or Videos

No response

Platform

  • OS: Windows 11 24H2 26100.2894
  • Browser: Firefox 135.0b9, Chrome 132.0.6834.160

Tanstack Query adapter

react-query

TanStack Query version

5.64.2

TypeScript version

No response

Additional context

No response

GabbeV avatar Feb 04 '25 11:02 GabbeV

Sorry for the awkward English I used a translator

I haven't actually tested with large queries, but from looking at the code, I suspect there may be performance issues due to the following reason.

  • In the original implementation, findMatchingObservers was called only once inside getOptimisticResult, and its result matches was reused for tracking.

  • However, in the modified code, we now already call findMatchingObservers to get matches and then call this.#trackResult(result, queries). Inside #trackResult, we perform another call to findMatchingObservers

    return this.#trackResult(result, queries)
    
    #trackResult(result, queries) {
      const matches = this.#findMatchingObservers(queries)
      ...
    }
    
  • This means findMatchingObservers can be run multiple times during a single rendering or notification cycle.

The problem is the findMatchingObservers function is expensive


If you have N queries managed by useQueries, and multiple of them are updated nearly simultaneously, Each update may trigger onUpdate → notify → trackResult → findMatchingObservers. There are also O(N^2) possibilities


I could be wrong, of course, but I think reusing matches would fix it again

function getOptimisticResult(queries, combine) {
  const matches = this.#findMatchingObservers(queries)
  const result = matches.map(match => ...)

  return [
    result,
    (r) => {
      return this.#combineResult(r ?? result, combine)
    },
    () => {
-        // Avoid calling #trackResult which calls findMatchingObservers again
-        // Instead, directly reuse the 'matches'
+        return matches.map((match, index) => {
+          const observerResult = result[index]
+          // ...
+        })
    },
  ]
}

joseph0926 avatar Feb 05 '25 00:02 joseph0926

I don't think there is any issue with getOptimisticResult. It'll only be called once in my scenario. It then doing N operations is fine as it only leads to O(N) performance. Maybe it could be even better with better bookkeeping but that is not what my issue is about.

The issue is with onUpdate → notify → trackResult → findMatchingObservers. onUpdate will be called N times in my scenario and then findMatchingObservers will do N operations leading to O(N*N) performance.

GabbeV avatar Feb 05 '25 09:02 GabbeV

I’ll happily accept a PR that changes how this all works without breaking anything 😂

TkDodo avatar Feb 12 '25 12:02 TkDodo

I’ll happily accept a PR that changes how this all works without breaking anything 😂

I've actually tested with the temporary post-fork test code and found a few issues as shown below. I'll post a "PR" with a fix soon

Image Image

joseph0926 avatar Feb 12 '25 23:02 joseph0926

Looking at the code myself and at your PR @joseph0926 I realized that this will be quite hard to solve entirely. Your PR does remove some quadratic work but onUpdate still contain the mentioned indexOf, trackResult still contain a map over all observers, any user defined combine method will most likely also do N amount of work and if not combineResult will in replaceEqualDeep.

What fundamentally makes multiple useQuery faster than useQueries is reacts auto batching. We notify react once for each updated useQuery but react only rerender once.

I thought that maybe we could do the same for useQueries by notifying useSyncExternalStore on every update and doing combineResult/trackResult lazily when react request the value in getSnapshot. However it seems like react will call getSnapshot every time it is notified defeating this idea.

GabbeV avatar Feb 13 '25 10:02 GabbeV

I think you're right. I guess my solution is not a perfect solution,,

joseph0926 avatar Feb 13 '25 10:02 joseph0926

@GabbeV @joseph0926 I can merge the PR as all current tests pass. Or do you want to continue tinkering on a better solution?

TkDodo avatar Feb 16 '25 12:02 TkDodo

@TkDodo Oh I'd be so grateful if you could merge it first. keep looking for more improvements separately

joseph0926 avatar Feb 16 '25 12:02 joseph0926

A fundamental solution seems difficult to achieve at the moment. Therefore, I’d like to propose some modifications that might help.

The modification improves the time complexity of the #onUpdate function in queriesObserver from O(n) to O(1).

  #onUpdate(observer: QueryObserver, result: QueryObserverResult): void {
    const index = this.#observerMap.get(observer)
    if (index !== undefined) {
      this.#result = replaceAt(this.#result, index, result)
      this.#notify()
    }
  }

zktm9903 avatar Feb 22 '25 10:02 zktm9903

https://github.com/TanStack/query/pull/9467

Hi! Since my previous improvement, I've noticed that several contributors have addressed some of the issues mentioned here. I've submitted an additional PR to fix one last remaining issue: #9467 I'd appreciate it if you could review it when you have time. Thank you!

joseph0926 avatar Jul 20 '25 02:07 joseph0926

as discussed in https://github.com/TanStack/query/pull/9467, this seems already fixed.

TkDodo avatar Aug 04 '25 17:08 TkDodo