
Replace find_last_bit_set with some intrinsic functions

Open wjcskqygj2015 opened this issue 5 years ago • 2 comments

G++ provides intrinsic functions, e.g. __builtin_clz, that can make find_last_bit_set more efficient. When compiling with g++/clang, maybe we can replace the function like this, or even simply use the BSR (bit scan reverse) instruction. Visual Studio also provides _BitScanReverse and _BitScanReverse64 if needed.

  template <typename T>
  constexpr unsigned find_last_bit_set(T val) {
    // The result of __builtin_clz* is undefined for 0, so handle that case
    // first; this matches the result of the fallback loop below.
    if (val == 0)
      return 0;
    // Alternatively, use bit scan reverse and add 1 to the resulting index.
    if constexpr (sizeof(T) == sizeof(unsigned int)) {
      return sizeof(T) * 8 - __builtin_clz(val);
    } else if constexpr (sizeof(T) == sizeof(unsigned long)) {
      return sizeof(T) * 8 - __builtin_clzl(val);
    } else if constexpr (sizeof(T) == sizeof(unsigned long long)) {
      return sizeof(T) * 8 - __builtin_clzll(val);
    } else {
      // Generic fallback for other sizes: count shifts until val is 0.
      unsigned result = 0;
      for (; val != 0; val >>= 1)
        ++result;
      return result;
    }
  }

wjcskqygj2015 avatar Apr 23 '20 13:04 wjcskqygj2015
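
For reference, the "bit scan reverse plus 1" variant mentioned above could look roughly like the sketch below. It is only an illustration, not code from xenium: the name highest_set_bit_1based is hypothetical, it handles only 32-bit values, and the MSVC branch assumes <intrin.h> is available. Note that _BitScanReverse is an ordinary function, so this version is deliberately not constexpr.

```cpp
#include <cstdint>
#include <limits>
#if defined(_MSC_VER) && !defined(__clang__)
#include <intrin.h>
#endif

// Hypothetical helper (name is not from xenium): returns the 1-based index
// of the highest set bit of val, or 0 if no bit is set.
inline unsigned highest_set_bit_1based(std::uint32_t val) {
  if (val == 0)
    return 0;  // neither intrinsic gives a usable result for 0
#if defined(_MSC_VER) && !defined(__clang__)
  unsigned long index;
  _BitScanReverse(&index, val);  // 0-based index of the highest set bit
  return static_cast<unsigned>(index) + 1;
#elif defined(__GNUC__) || defined(__clang__)
  // Number of value bits in unsigned int minus the number of leading zeros.
  constexpr unsigned bits = std::numeric_limits<unsigned int>::digits;
  return bits - static_cast<unsigned>(__builtin_clz(val));
#else
  // Portable fallback: count shifts until the value is exhausted.
  unsigned result = 0;
  for (; val != 0; val >>= 1)
    ++result;
  return result;
#endif
}
```

All three branches are meant to agree: BSR yields the 0-based index of the highest set bit, so adding 1 gives the same value as the bit width minus the leading-zero count.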

Thanks for the tip. I know about the intrinsics for bit scan forward/reverse; I thought I had even added a TODO to adapt the function. Unfortunately, I don't think the intrinsics are constexpr, but I will look into this.

mpoeter avatar Apr 24 '20 17:04 mpoeter

It seems they are constexpr under g++, even with -O0.

wjcskqygj2015 avatar Apr 30 '20 10:04 wjcskqygj2015
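
One way to check the constexpr claim is a static_assert, which compiles only if the call can be evaluated as a constant expression. The sketch below is a minimal illustration assuming g++ or clang++; the name last_bit_set_clz is hypothetical and not part of xenium, and it covers only unsigned int.

```cpp
#include <limits>

#if defined(__GNUC__) || defined(__clang__)
// Hypothetical name, for illustration only: 1-based index of the highest
// set bit, or 0 for an input of 0 (where __builtin_clz is undefined).
constexpr unsigned last_bit_set_clz(unsigned int val) {
  return val == 0
      ? 0u
      : static_cast<unsigned>(std::numeric_limits<unsigned int>::digits) -
            static_cast<unsigned>(__builtin_clz(val));
}

// Each static_assert requires a compile-time evaluation of the builtin,
// independent of the optimization level.
static_assert(last_bit_set_clz(0u) == 0, "zero has no bit set");
static_assert(last_bit_set_clz(1u) == 1, "lowest bit only");
static_assert(last_bit_set_clz(0x10u) == 5, "bit 4 is the highest set bit");
#endif
```

If this compiles with -O0, the builtin is usable in constant expressions on that compiler. The same does not hold for _BitScanReverse, so an MSVC path would still need the loop (or another approach) for compile-time evaluation.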