calcite icon indicating copy to clipboard operation
calcite copied to clipboard

[CALCITE-4686] SubQueryRemoveRule.matchJoin should correctly rewrite all sub-queries

Open jamesstarr opened this issue 4 years ago • 10 comments

Reworking SubQueryRemoveRule.matchJoin to:

  • Validate join type before rewriting
  • Correctly shift filter and then rewrite join condition on right side of the join
  • Rewrite all subqueries in ON conditions

Testing and refactoring:

  • Moved Shifter from RelBuilder to RelOptUtil, so the logic can be reused by SubQueryRemoveRule.matchJoin.
  • Enabled RelOptRulesTest.testExpandJoinIn and RelOptRulesTest.testExpandJoinInComposite.
  • Adding RelOptRulesTest. testJoinOnMultipleCorrelatedQueries

jamesstarr avatar Mar 31 '21 22:03 jamesstarr

Hey @jamesstarr , thanks for working on this. Going quickly over the changeset I get the impression that the PR tries to achieve multiple things. Can you highlight/outline the goals of this PR either here or under CALCITE-1045?

zabetak avatar May 25 '21 15:05 zabetak

@Zabetak, This PR is focused on handling left joins with correlated queries in ON clauses.

Joins with correlated queries are rewritten too as join with the right side contain filter node and correlate node to handle the join's filter in SubQueryRemovelRule. The filter is checked in a loop for correlation ID to handle multiple subqueries. Right and full outer joins are not rewritten because they would logically be incorrect. Shifter, a util class in RelBuilder, was moved to RelOptUtil so the existing logic could be used in SubQueryRemoveRule. RelBuilder also now only considers a join correlated if an ID is provided and actually referenced is found on the right side to the ID.

jamesstarr avatar May 25 '21 19:05 jamesstarr

@zabetak, do you have any further feed back?

jamesstarr avatar Jun 21 '21 20:06 jamesstarr

@zabetak I pulled in master, create separate ticket and cleaned up the code, and rewrote the pull request comment. Let me know if any further changes are needed.

jamesstarr avatar Jul 08 '21 00:07 jamesstarr

Hey @jamesstarr , thanks for pushing this forward. I think it is in good shape right now. I will try to review and merge soon but not sure if I will make it during the next week.

zabetak avatar Jul 10 '21 05:07 zabetak

@zabetak, Could you please merge this.

jamesstarr avatar Aug 20 '21 21:08 jamesstarr

@Zabetak, I really do not want this going to the graveyard of calcite pull requests.

jamesstarr avatar Aug 24 '21 18:08 jamesstarr

Hi @jamesstarr , I am very busy lately but as soon as I get the chance I will make a final pass and merge it.

zabetak avatar Aug 25 '21 06:08 zabetak

What is the status on this PR? Was it ready to be merged? cc @zabetak I've been trying to follow the trail of relevant JIRA tickets

olivrlee avatar Jul 11 '23 18:07 olivrlee

What is the status on this PR? Was it ready to be merged? cc @zabetak I've been trying to follow the trail of relevant JIRA tickets

Never mind, I had removed the conflicting test some time ago.

There is bug in one of the test due to nested queries not being correlated correctly. I have a fix for the nested queries here: https://github.com/apache/calcite/pull/3042

jamesstarr avatar Jul 11 '23 23:07 jamesstarr