xsimd

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

Results 141 xsimd issues

It would be very useful to be able to use a `batch_bool` mask directly to increment/decrement an integer `batch`, using the integer overflow trick. It allows updating a table of indices...
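The trick mentioned in this request can be sketched in plain C++ without any xsimd types. The sketch assumes the usual SIMD convention that a true mask lane has all bits set; `lanes`, `less_than_mask` and `increment_where` are hypothetical names used only for illustration, not part of the xsimd API.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4-lane unsigned integer batch and its mask.
using lanes = std::array<std::uint32_t, 4>;

// Build a mask the way SIMD comparisons usually do:
// all bits set where a < b, zero elsewhere.
inline lanes less_than_mask(const lanes& a, const lanes& b)
{
    lanes m{};
    for (std::size_t i = 0; i < m.size(); ++i)
        m[i] = (a[i] < b[i]) ? 0xFFFFFFFFu : 0u;
    return m;
}

// Subtracting an all-ones lane wraps around and adds 1;
// subtracting a zero lane leaves the index unchanged.
inline lanes increment_where(const lanes& idx, const lanes& mask)
{
    lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = idx[i] - mask[i];
    return r;
}
```

Adding the mask instead of subtracting it would decrement the selected lanes in the same way.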

Feature Request

I could not find any mention of doing within-vector reductions, or equivalently transposing blocks of vectors to make such reductions vectorizable. If I am summing a long list of...
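One common pattern for long sums, sketched here in scalar C++ rather than with xsimd types, is to keep one partial sum per lane and reduce across lanes only once at the end. The lane count and the name `lane_wise_sum` are assumptions made for this illustration; the sketch does not cover the block-transposition approach the issue also asks about.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical lane count standing in for a SIMD batch width.
constexpr std::size_t lane_count = 4;

// Sum a long array by keeping one partial sum per lane (vertical adds),
// then reducing the lanes once at the end (a single horizontal step).
inline long long lane_wise_sum(const std::vector<int>& data)
{
    std::array<long long, lane_count> acc{};
    const std::size_t blocked = data.size() - data.size() % lane_count;
    for (std::size_t i = 0; i < blocked; i += lane_count)
        for (std::size_t lane = 0; lane < lane_count; ++lane)
            acc[lane] += data[i + lane];

    // Horizontal reduction of the per-lane partial sums.
    long long total = 0;
    for (long long partial : acc)
        total += partial;

    // Scalar tail for the elements that do not fill a whole block.
    for (std::size_t i = blocked; i < data.size(); ++i)
        total += data[i];
    return total;
}
```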

Do you have any guidelines on how one would implement saturated addition and subtraction?
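As a possible starting point, here is a scalar sketch of unsigned saturating arithmetic written in a compare-and-mask style, which is the form that tends to map onto SIMD comparisons and bitwise operations. The function names are hypothetical and this is not an xsimd API; signed saturation needs different overflow checks and is not covered.

```cpp
#include <cstdint>

// Unsigned saturating add: on wraparound the sum is smaller than an operand,
// so OR-ing with an all-ones mask clamps the result to the maximum value.
inline std::uint32_t sat_add_u32(std::uint32_t a, std::uint32_t b)
{
    const std::uint32_t sum = a + b;
    const std::uint32_t overflow_mask = (sum < a) ? 0xFFFFFFFFu : 0u;
    return sum | overflow_mask;
}

// Unsigned saturating subtract: when b > a the difference wraps around,
// so AND-ing with a zero mask clamps the result to zero.
inline std::uint32_t sat_sub_u32(std::uint32_t a, std::uint32_t b)
{
    const std::uint32_t diff = a - b;
    const std::uint32_t keep_mask = (a >= b) ? 0xFFFFFFFFu : 0u;
    return diff & keep_mask;
}
```

In a vectorized version the two ternaries would become lane-wise comparisons. For 8-bit and 16-bit lanes, SSE2 already provides native saturating instructions such as `_mm_adds_epu8`.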

As reported in #289, the NEON instruction set is not detected when building on armv7hl. A workaround is `#define XSIMD_FORCE_ARM_INSTR_SET 70000000` before including any xsimd header.

I'm trying to use `XTENSOR_USE_XSIMD` in my project, which otherwise compiles and runs fine on a Raspberry Pi 3B+ with an up-to-date Raspbian Stretch. Using - xtensor master -...

For the types specified in `xsimd_avx_double.hpp`, `xsimd_sse_float.hpp` and `xsimd_sse_int32.hpp`, the SSE method `store_aligned_int32(uint8_t* dst)` (and similar) stores using the function `_mm_storel_epi64`. As this is a batch of 4 values, and...

`pow` of `complex` has some accuracy issues with AVX512.

~I think the title says it all :)~ When running "make/ninja xbenchmark" on my Haswell-based machine, a pair of "neon" rows is present in the table of timings, even though...

I tried to integrate xsimd into an Xcode project, but I get the following 2 errors in `xsimd_scalar.hpp`: `namespace detail { template <class C> inline C sign_complex_scalar_impl(const C& v) { using value_type =`...

In order to call `store_aligned` / `store_unaligned` in a generic fashion, we should make the second parameter also aware of the batch size not being the max batch size (e.g....