featurebase icon indicating copy to clipboard operation
featurebase copied to clipboard

Investigate query optimization

Open alanbernstein opened this issue 8 years ago • 2 comments

Description

The original Roaring paper describes an "Optimized algorithm to compute the union of many roaring bitmaps" (algorithm 4 in https://arxiv.org/pdf/1402.6407.pdf). This is used, for example, to compute a union across 200 bitmaps, in comparison to "naive union" which simply computes the unions sequentially. It's not clear from the benchmark results (table 3 in https://arxiv.org/pdf/1603.06549.pdf) whether this is worth the effort.

It's also not clear whether we have a use case that would benefit from this.

Presumably a similar approach would work for intersections.

Success criteria (What criteria will consider this ticket closeable?)

A decision to either start work on a query optimization component, or not.

alanbernstein avatar Aug 17 '17 01:08 alanbernstein

runToBitmap could possibly see some improvement with Algorithm 3.

tgruben avatar Aug 17 '17 01:08 tgruben

the best thing we could do to improve query perf would probably be to implement the *InPlace methods, and then make intelligent use of them in queries. This reduces memory allocation and usage. That said, query perf is already quite good, and this probably isn't a short term priority. These algos may be worth looking into at some point though.

jaffee avatar Apr 09 '19 17:04 jaffee