Niles Salter
Why not target BMI2 for Haswell? Even if you want to avoid PDEP/PEXT because of AMD, don't you still think the other instructions are nice? Does this mean that the...
Well, according to [Agner Fog's instruction tables](https://www.agner.org/optimize/instruction_tables.pdf), `BZHI` has 1 cycle latency and a reciprocal throughput of 0.25 (i.e. you can start up to 4 per cycle if the ports...
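For readers unfamiliar with `BZHI`, here is a minimal C sketch of what the instruction computes for a 64-bit operand: it keeps the bits below a given index and zeroes the rest. The function name `bzhi_u64_model` is hypothetical and is a portable model, not an intrinsic. On a BMI2 target, the corresponding intrinsic is `_bzhi_u64` from `<immintrin.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of 64-bit BZHI: keep the low `n` bits of `x`
   and zero every bit from index `n` upward. */
static uint64_t bzhi_u64_model(uint64_t x, unsigned n) {
    n &= 0xFF;                 /* the instruction reads only the low 8 bits of the index */
    if (n >= 64) {
        return x;              /* index past the operand width: value is unchanged */
    }
    return x & ((UINT64_C(1) << n) - 1);  /* n == 0 yields 0 */
}
```

Without BMI2, a compiler generally has to build the mask with a shift and a subtract (plus handling for the out-of-range index), which is why a single-instruction form is attractive in bit-manipulation-heavy code.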
Update: I added `bmi2` to that haswell target like you said. This now compiles. Is it faster? Well, maybe. I can't really tell by just running it on the twitter.json...
> However, it is interesting to check whether enabling it would be beneficial, and I plan to investigate. As your analysis suggests, the important instruction is `BZHI`. My message was...
> > They aren't used much, but they do appear in simdjson. > > It is an empirical question. By how much does enabling BMI2 in LLVM or GCC improve...
> @Validark At least under some versions of GCC, and on some hardware, enabling BMI2 appears to lower performance. It does slightly increase the instruction count. > > See https://github.com/simdjson/simdjson/pull/2243...
The thing I like most about this proposal is the potential for having ArrayList and MultiArrayList even more interchangeable. If you decide to change your data layout, you shouldn't need...
Nice! Yes, we can now express the intended control flow in my project too.
@EugeneZelenko Based on @topperc's analysis, this is not an issue with the RISC-V backend, correct?
It's unfortunate that there is no generic fix, since there seems to be the same issue on these machines too:
Sparc: https://godbolt.org/z/rP5P4TxTq
Power: https://godbolt.org/z/bxxTMx451
AndNot is not that uncommon. Some...