`IN` predicate returns wrong results when used with `GROUP BY ... HAVING` subquery
The example data looks like this:
create table t(i int, t varchar(2));
insert into t values (1, 'a'), (1, 'a2'),
(2, 'b'),
(3, 'c'), (3, 'c2'),
(4, 'd'),
(5, 'e'), (5, 'e2');
The following query is expected to return 2 tuples ((2, 'b') and (4, 'd')) but only returns one of them ((4, 'd')):
select *
from t
where i in (select i
from t
group by i
having count(1) = 1)
;
When the HAVING clause is changed to count(1) != 1 the query returns the same result, even though it should have returned the other 6 tuples (where i in (1, 3, 5)):
select *
from t
where i in (select i
from t
group by i
having count(1) != 1)
I checked that this isn't an aliasing issue, so I don't know what is going on here.
I can change the query to use a CTE or a JOIN with the subquery and then the result returned is correct.
Thanks for the bug report Knut. I was able to reproduce this with dolt version 0.41.4. We'll debug through this one and see what's going on.
Great work on this fix @jennifersp!
@knutwannheden – thanks again for helping us find all these great issues to fix! 🙏 Very appreciated! This fix will go out with the next Dolt release. Let us know if you find anything else odd for us to dig into.
AFAICT the PR was reverted again. Should this issue be reopened again?
Hi @knutwannheden, ah yes I'll re-open it for now. It needs fix for the tests only. I'll merge it back again with the tests fixed tomorrow. Sorry about that.
Hey @knutwannheden, the fix has been merged with fixed tests, it will be included in the next Dolt release. Thank you again for helping us find this bug! :)