Notice for cross joins
Draft for now
TODO:
- Add a bit more tests.
- Polish the hint that we give to the user. We should probably create a doc page about cross joins, and link that in the hint.
- Maybe make the notice help with finding where the cross join is in the plan, e.g., by naming the join inputs when possible and/or naming the enclosing Let binding.
- full slt and testdrive rewrite (after we'll have settled on the parameters and wording of the hint)
- We have a relevant outstanding bug in the optimization notices infra, which we might want to fix: https://github.com/MaterializeInc/database-issues/issues/7428. Note that I recently found a local repro by accident, so shouldn't be too hard.
- (I'm not sure why
enable_for_item_parsing trueis true for the old notices. I think we should set it to false.)
Motivation
The Vori incident, and other similar ones before. Also https://github.com/MaterializeInc/database-issues/issues/7677
Tips for reviewer
Checklist
- [ ] This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
- [ ] This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
- [ ] If this PR evolves an existing
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
- [ ] If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
Maybe make the notice help with finding where the cross join is in the plan, e.g., by naming the join inputs when possible and/or naming the enclosing Let binding.
Showing cross joins is on todo list for EXPLAIN, could maybe even be detected for EXPLAIN ANALYZE.
Showing cross joins is on todo list for EXPLAIN
This PR would show it in EXPLAIN (both EXPLAIN OPTIMIZED PLAN and the default EXPLAIN). We get this from the existing optimization notice infrastructure.
Discussion in SQL meeting today:
- This work is parked for now.
- We might want "dynamic notices" to filter out false positives (harmless cross-joins), based on introspection data at dataflow runtime. (Current notices are static, and planning/optimization time.)
Ah right; I meant that the new LIR-based EXPLAIN syntax could detect and report cross joins.
Recording comments from the SQL sync: we should also build a dynamic "expensive cross join" detector so that we can correlate the static notice with bad dynamic behaviors, to determine our false positive rate. (I'm not certain exactly how to distinguish an expensive cross join from a cheap one---I can imagine some memory threshold, either absolute or relative to the whole query.)