matrixone icon indicating copy to clipboard operation
matrixone copied to clipboard

fix shuffle join for recursive cte

Open iamlinjunhong opened this issue 7 months ago • 2 comments

User description

What type of PR is this?

  • [ ] API-change
  • [ ] BUG
  • [ ] Improvement
  • [ ] Documentation
  • [ ] Feature
  • [ ] Test and CI
  • [ ] Code Refactoring

Which issue(s) this PR fixes:

issue #https://github.com/matrixorigin/MO-Cloud/issues/5764

What this PR does / why we need it:

fix shuffle join for recursive cte


PR Type

Bug fix


Description

  • Fix shuffle join logic for recursive CTE queries

  • Add proper batch handling for last batch condition

  • Include comprehensive test case for recursive CTE with large dataset


Changes walkthrough 📝

Relevant files
Bug fix
shuffle.go
Fix batch handling in shuffle operations                                 

pkg/sql/colexec/shuffle/shuffle.go

  • Fix batch handling logic by adding bat.Last() condition check
  • Resolve merge conflict by removing duplicate import formatting
  • Ensure proper flow control for last batch in shuffle operations
  • +4/-0     
    Tests
    recursive_cte.result
    Add test results for recursive CTE fix                                     

    test/distributed/cases/recursive_cte/recursive_cte.result

  • Add test results for new recursive CTE test case
  • Include expected output showing count of 6 for large dataset query
  • +8/-1     
    recursive_cte.sql
    Add comprehensive recursive CTE test case                               

    test/distributed/cases/recursive_cte/recursive_cte.sql

  • Add comprehensive test case with large dataset (1M records)
  • Test recursive CTE with JOIN operations and filtering conditions
  • Include table creation, data insertion, and cleanup operations
  • +6/-1     

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • iamlinjunhong avatar Jun 25 '25 10:06 iamlinjunhong

    You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Merge Conflict

    The code contains unresolved merge conflict markers that need to be cleaned up before merging. The import section has Git conflict markers that will cause compilation errors.

    <<<<<<< HEAD
    	"fmt"
    
    =======
    >>>>>>> 74ada5848 (fix shuffle join for recursive cte)
    
    Logic Error

    The control flow logic appears incorrect. When bat.Last() is true, the function returns immediately without processing the batch, but the return statement seems to be in the wrong conditional block.

    } else if bat.Last() {
    	return result, nil
    

    qodo-code-review[bot] avatar Jun 25 '25 10:06 qodo-code-review[bot]

    You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Fix premature return on last batch

    The control flow is incorrect - when bat.Last() is true, the function returns
    immediately without processing the batch. This causes the last batch to be
    skipped entirely, which can lead to data loss in shuffle operations.

    pkg/sql/colexec/shuffle/shuffle.go [110-112]

     } else if bat.Last() {
    -	return result, nil
    -} else if !bat.IsEmpty() {
    +	shuffle.ctr.ending = true
    +	shuffle.ctr.lastForShufflePool = shuffle.ctr.shufflePool.Ending()
    +}
    +if !bat.IsEmpty() {
    
    • [ ] Apply / Chat <!-- /improve --apply_suggestion=0 -->
    Suggestion importance[1-10]: 10

    __

    Why: The suggestion correctly identifies a critical bug introduced in the PR. The new logic for bat.Last() causes the function to return prematurely, skipping the processing of the final batch, which would lead to data loss or incorrect results. The proposed fix is correct.

    High
    Resolve merge conflict markers
    Suggestion Impact:The commit directly implemented the suggestion by removing all merge conflict markers (>>>>>> 74ada5848) and removing the fmt import entirely

    code diff:

    -<<<<<<< HEAD
    -	"fmt"
    -
    -=======
    ->>>>>>> 74ada5848 (fix shuffle join for recursive cte)
    

    There's an unresolved merge conflict in the import section. The conflict markers
    need to be resolved by either keeping or removing the fmt import based on
    whether it's actually used in the code.

    pkg/sql/colexec/shuffle/shuffle.go [19-23]

    -<<<<<<< HEAD
     	"fmt"
     
    -=======
    ->>>>>>> 74ada5848 (fix shuffle join for recursive cte)
    -
    

    [Suggestion processed]

    Suggestion importance[1-10]: 10

    __

    Why: The suggestion correctly identifies unresolved merge conflict markers (<<<<<<<, =======, >>>>>>>) in the import block. This is a critical error that will cause the build to fail, and it must be resolved before the PR can be merged.

    High
    • [ ] Update <!-- /improve_multi --more_suggestions=true -->

    qodo-code-review[bot] avatar Jun 25 '25 10:06 qodo-code-review[bot]