
No `num_docs` optimization and lazy top K.

Open fulmicoton opened this issue 3 years ago • 1 comments

If num_docs is not required, there are many optimizations we can run.

For instance, if we sort by doc ID, we can often search only one split and abort the search within that split.
Similarly, if we sort by -date, we can order splits by their max_date and stop searching as soon as we have a guarantee that no further doc can enter the top K.
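
A minimal sketch of the -date case, to make the stopping condition concrete. This is not Quickwit's implementation: the `Split` class, its fields, and `lazy_top_k_by_date_desc` are hypothetical names, docs are reduced to integer dates, and ties are assumed to resolve in favor of hits already collected. It only works because num_docs is not returned, so skipped splits never need to be counted.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Split:
    # Hypothetical stand-in for a split: an id, the largest date it
    # contains, and the dates of its matching docs.
    split_id: str
    max_date: int
    doc_dates: list


def lazy_top_k_by_date_desc(splits, k):
    """Return the k most recent dates and the ids of the splits searched."""
    if k <= 0:
        return [], []
    top = []       # min-heap holding the k largest dates seen so far
    searched = []  # ids of the splits that were actually opened
    # Visit the splits with the most recent max_date first.
    for split in sorted(splits, key=lambda s: s.max_date, reverse=True):
        # top[0] is the weakest hit in the current top K. If this split's
        # max_date cannot beat it, no remaining split can either, because
        # they all have a max_date that is lower or equal.
        if len(top) == k and split.max_date <= top[0]:
            break
        searched.append(split.split_id)
        for date in split.doc_dates:
            if len(top) < k:
                heapq.heappush(top, date)
            elif date > top[0]:
                heapq.heapreplace(top, date)
    return sorted(top, reverse=True), searched
```

With three splits whose max_date values are 100, 92 and 50, asking for the top 2 opens only the first split when it already holds two docs dated 95 or later.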

We can hardcode these optimizations for the moment, and revisit this if someone comes up with a good formalism for a proper distributed execution plan abstraction.

fulmicoton • Jun 30 '22 14:06

We've added some optimizations to search fewer splits (but generally more than one) when num_docs isn't asked for. We also added that optimization when sorting by date/-date/doc_id.

trinity-1686a • Feb 12 '24 12:02