YIMINGXU
Hello, We have developed a new merge join; however, it currently only supports single-column joins for integer/long types. We've achieved the best results with range joins (improving performance by n...
> @hn5092 Thanks for your POC, so the main idea is instead of comparing the vector values themselves, you compress the values into `int32_t`s (you call them "hash" but they...
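To illustrate the idea of comparing compact `int32_t` codes instead of the original values: the quoted comment is cut off, so how the POC actually derives its codes is not shown here. The sketch below is one possible approach, under the assumption that codes are ranks in a sorted dictionary shared by both join sides, so that comparing two codes gives the same order as comparing the original strings. The names `buildDictionary` and `encodeKeys` are hypothetical and not taken from the POC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: builds a sorted, de-duplicated dictionary
// over all join keys from both sides.
std::vector<std::string> buildDictionary(std::vector<std::string> keys) {
  std::sort(keys.begin(), keys.end());
  keys.erase(std::unique(keys.begin(), keys.end()), keys.end());
  return keys;
}

// Hypothetical helper: maps each key to its rank in the dictionary.
// Because the dictionary is sorted, code(a) < code(b) exactly when a < b.
std::vector<int32_t> encodeKeys(
    const std::vector<std::string>& dict,
    const std::vector<std::string>& keys) {
  std::vector<int32_t> codes;
  codes.reserve(keys.size());
  for (const auto& key : keys) {
    auto it = std::lower_bound(dict.begin(), dict.end(), key);
    // Assumption: every key was included when the dictionary was built.
    assert(it != dict.end() && *it == key);
    codes.push_back(static_cast<int32_t>(it - dict.begin()));
  }
  return codes;
}
```

With this encoding the merge join only compares fixed-width integers, at the cost of extra work up front to build the dictionary and encode both sides.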
> Hi @hn5092 thank you for starting this discussion. I'm curious about how much faster this gets if you compare to our current merge join implementation. > > If I...
@pedroerp Performance diff. Env: MacBook M1 Max. Query: select event, count(precaseid(Stringtype)) from a join b hash: new mergejoin : select event, count(encoded_case_id(int type)) from a join b hash: new...
I can't find the old version of the merge join code... Judging from the diff, the more complex the build side is, the faster the merge join becomes.
> That makes sense to me. We had planned to fix this as part of the vectorized merge join effort, but somehow in the pipelines we were testing, key comparisons...
> @hn5092 We are currently refactoring our vector comparison framework, and will add a function that compares two vectors in one call. The saving of SIMD comparison (vs non-SIMD tight...
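As a rough picture of what "compare two vectors in one call" could look like, here is a minimal sketch. `compareVectors` is a hypothetical name, not an API of the project or of the refactored framework mentioned above. It assumes both inputs are flat `int32_t` columns of equal length with no nulls, and it relies on a branch-free loop that a compiler may auto-vectorize rather than on explicit SIMD intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical batch comparison: returns -1, 0, or 1 per row,
// depending on whether left[i] is less than, equal to, or greater
// than right[i].
std::vector<int8_t> compareVectors(
    const std::vector<int32_t>& left,
    const std::vector<int32_t>& right) {
  // Assumption: both columns have the same number of rows.
  assert(left.size() == right.size());
  std::vector<int8_t> result(left.size());
  for (std::size_t i = 0; i < left.size(); ++i) {
    // Branch-free three-way comparison; each bool converts to 0 or 1.
    result[i] = static_cast<int8_t>(
        (left[i] > right[i]) - (left[i] < right[i]));
  }
  return result;
}
```

The point of the batch form is that the per-row comparison is a single tight loop over contiguous memory instead of one function call per row.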