Dávid Bolvanský
Reported to Clang devs: https://github.com/llvm/llvm-project/issues/54964
What do you propose? Lower LoopMicroOpBufferSize?
Right, but I am worried that there would be strong pushback, as some (sometimes toy) benchmarks would regress. See this discussion (a slightly different issue, but a similar story): https://reviews.llvm.org/D102748
cc @sjoerdmeijer
Similar case with T = short:
```c
#define N 256
typedef short T;
extern T a[N];
extern T b[N];
extern T c[N];
extern _Bool pb[N];
void predicate_by_bool() {
  for (int...
```
With -O3 there is no vectorization (icc and gcc both vectorize it). Maybe a cost model issue too? cc @RKSimon
So this is a missing instcombine canonicalization?
Maybe instcombine could fold it to just AND?
There are some instructions on how to build it here: https://openbenchmarking.org/innhold/99d3a8c1ea3ea71e1edf4aea6bf9af30100f07d5
Ping. LLVM hit this issue with gtest too.