beam icon indicating copy to clipboard operation
beam copied to clipboard

Add runSelectReturningFirst (RFC)

Open thomasjm opened this issue 4 years ago • 5 comments

This is a proposal to address #367.

thomasjm avatar Feb 03 '22 08:02 thomasjm

I'm slightly against the name - what does First really mean when there's no order_ ? Is there some consistent default order or can we get different elements at different times? Does this change across backends?

My issue with runSelectReturningOne is that the naming doesn't point to the meaning of the query's result set and thus hides the whole "at least one" vs "exactly one" constraint on a success, while the Maybe return type makes you think of "at most one". So it's super confusing and you often forget to consider whether limit_ 1 should be used or not and get unexpected failures. I've seen this happen like a dozen times across projects and people and think that issue will exist as long as runSelectReturningOne exists, which is why I'd prefer renaming it (via deprecation) to runSelectReturningUnique or some other in-your-face name that'd point to the constraint you're buying into, with limit_ 1 as an obvious opt-out.

This seems like it'd lead to confusion between whether to pick runSelectReturningFirst and runSelectReturningOne $ limit_ 1 - do they have different performance?

alexfmpe avatar Feb 09 '22 14:02 alexfmpe

I'm slightly against the name - what does First really mean when there's no order_ ? Is there some consistent default order or can we get different elements at different times? Does this change across backends?

If the name causes the user to ask themselves questions like this, then I think it succeeds in its goal :). I'd love for runSelectReturningOne to be removed or renamed, but that seems like a major breaking change. For the time being, I think adding runSelectReturningFirst is good because it draws attention to the distinction between the two. So hopefully people won't see the Maybe return type and blindly select runSelectReturningOne because it looks like what they want, but will instead read the docstrings on both functions and choose the right one.

thomasjm avatar Feb 10 '22 00:02 thomasjm

I'm slightly against the name - what does First really mean when there's no order_ ? Is there some consistent default order or can we get different elements at different times? Does this change across backends?

To me it seems pretty clear that it just returns the first result, and ordering is up to you. It's a totally valid use case to just want any individual result and not care about the ordering.

The one possible confusion that makes sense to me is about whether this automatically adds a LIMIT 1. @thomasjm can you add a note in the haddock saying something like "This does not automatically apply a limit to the query, you likely want to do that if you expect many results."? However, if anything, I think users are more likely to have that confusion with runSelectReturningOne.

why I'd prefer renaming it (via deprecation) to runSelectReturningUnique or some other in-your-face name that'd point to the constraint you're buying into, with limit_ 1 as an obvious opt-out.

I think this is a good idea, but we can keep it separate from this PR since it doesn't actually interact. If we do that renaming, runSelectReturningFirst still seems to make sense.

I'm planning to do a new major release cycle soon, so we can include it in that. What I'll likely do is rename the class method but add back runSelectReturningOne as a synonym function and mark it deprecated. This way user code should continue to work (with a warning) and only backends would need to update their instances.

kmicklas avatar Feb 17 '22 13:02 kmicklas

It's a totally valid use case to just want any individual result and not care about the ordering.

I agree with the use case, it's just the "first" bit that I have doubts about. If it's any individual result then I think runSelectReturningAny or runSelectReturningSome would work better.

alexfmpe avatar Feb 17 '22 14:02 alexfmpe

@thomasjm can you add a note in the haddock saying something like "This does not automatically apply a limit to the query, you likely want to do that if you expect many results."?

I was just looking at specializing runSelectReturningFirst for Postgres and wondered -- is there some reason that runSelectReturningOne doesn't automatically add a LIMIT? It seems it could do its job more efficiently with a LIMIT 2 applied.

Same reasoning applies to the new runSelectReturningFirst, maybe it would be correct and more efficient to just add LIMIT 1?

thomasjm avatar Feb 18 '22 02:02 thomasjm