cranelift: Remove booleans
Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements.
Fixes #3205
I'm really happy that we're finally doing this! Thank you for taking this on.
Some of our lowerings depend on the fact that bools can only be represented as all ones or all zeroes, and so now that we are removing that assumption we might want to revisit them.
The AArch64 implementation for bmask is wrong if we want to accept any integer other than -1 and 0. I would expect to get -1 if I do a bmask on a 2, but that's not what's going to happen. (I haven't tested this, but that's my reading of the implementation).
bint is also assuming that inputs are all ones or all zeros and is implemented as and_imm on AArch64. (i.e. 2 will be false, but should be true)
Edit: AArch64 has instructions (cset and csetm) that represent these two operations, so we may want to partially revert b8f6d53a6bdf861e66093a046a081c886546cdf1 to get them back
Also, just to note it here: if at all possible, I'd prefer for us to wait to merge this until after my mid-end work in #4953 is merged, if only because they're both large and likely-to-conflict PRs and mine has been through ~three painful rebases already :-)
@uweigand, could you take a closer look at my changes to the s390x backend? In particular the changes to the lowering of bmask, as they increased the instruction count for the translation by quite a bit.
@uweigand, could you take a closer look at my changes to the s390x backend? In particular the changes to the lowering of
bmask, as they increased the instruction count for the translation by quite a bit.
Well, in part this is due to changed semantics: the old bmask would take a boolean type as input, which (except for $B1) already used only the values 0 and -1. So the semantics of bmask was simply a no-op or a sign-extend. The new bmask I understand now needs to take any arbitrary integer input, so it actually has to perform a comparison against zero.
That said, I think the new implementation could still be improved, in particular the $I128 cases. However, instead of re-implementing the wheel here, why don't you simply use the existing icmp_val helper to emit the actual comparison?
Maybe something alone the lines of
(rule (lower (has_type oty (bmask x @ (value_type ity))))
(lower_bool_to_mask oty (icmp_val $true (IntCC.NotEqual) x (imm ity 0)))
with a new helper lower_bool_to_mask analogous to the old lower_bool ...
Following up on myself, it would be even simpler and better to use the existing value_nonzero helper, which will fold in previous comparisons if possible:
(rule (lower (has_type ty (bmask x)))
(lower_bool_to_mask ty (value_nonzero x)))
Following up on myself, it would be even simpler and better to use the existing
value_nonzerohelper, which will fold in previous comparisons if possible:(rule (lower (has_type ty (bmask x))) (lower_bool_to_mask ty (value_nonzero x)))
This is much nicer! I'll rework the changes to add lower_bool_to_mask instead.
This is much nicer! I'll rework the changes to add
lower_bool_to_maskinstead.
Thanks! The s390x parts LGTM now.