xsimd issues

Overload +/+=/-/-= operators with right term batch_bool

It would be very useful to be able to use a `batch_bool` mask directly to increment or decrement an integer `batch`, using the integer overflow trick. This makes it possible to update a table of indices...

balancap

Feature Request
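For readers unfamiliar with the trick mentioned in this request, here is a minimal scalar sketch. It does not use the xsimd API: `lanes_i32`, `lanes_mask`, `increment_where` and `decrement_where` are hypothetical names, and it assumes a "true" lane is stored as all bits set (the value SSE/AVX integer comparisons produce), so that subtracting the mask adds one and adding the mask subtracts one.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-ins for a 4-lane integer batch and its mask.
// Assumption: a true lane is 0xFFFFFFFF (i.e. -1 as a signed value),
// a false lane is 0.
using lanes_i32 = std::array<std::int32_t, 4>;
using lanes_mask = std::array<std::uint32_t, 4>;

// Increment the lanes selected by the mask: x - (-1) == x + 1, x - 0 == x.
inline lanes_i32 increment_where(const lanes_i32& x, const lanes_mask& mask)
{
    lanes_i32 out{};
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        // Unsigned arithmetic keeps the wrap-around well defined.
        const std::uint32_t r = static_cast<std::uint32_t>(x[i]) - mask[i];
        out[i] = static_cast<std::int32_t>(r);
    }
    return out;
}

// Decrement the lanes selected by the mask: x + (-1) == x - 1, x + 0 == x.
inline lanes_i32 decrement_where(const lanes_i32& x, const lanes_mask& mask)
{
    lanes_i32 out{};
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        const std::uint32_t r = static_cast<std::uint32_t>(x[i]) + mask[i];
        out[i] = static_cast<std::int32_t>(r);
    }
    return out;
}
```

With a vector type, each loop collapses to a single subtraction or addition on the reinterpreted mask, which is presumably what the requested operator overloads would wrap.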

Feature request: within vector reductions

6

I could not find any mention of doing within vector reductions, or equally transposing blocks of vectors to make such reductions vectorizable. If I am summing a long list of...

ojwoodford
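As an illustration of what a within-vector (horizontal) reduction involves, the sketch below folds the upper half of the lanes onto the lower half until one lane remains. It is plain C++ rather than xsimd code, and `horizontal_sum` is a hypothetical name; on real hardware each folding step would typically be a shuffle followed by a vertical add.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: sums the N lanes of one "vector" in log2(N) steps.
// Assumption: N is a power of two, as it is for hardware SIMD registers.
template <std::size_t N>
float horizontal_sum(std::array<float, N> v)
{
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    for (std::size_t width = N / 2; width > 0; width /= 2)
    {
        // Fold the upper half onto the lower half.
        for (std::size_t i = 0; i < width; ++i)
        {
            v[i] += v[i + width];
        }
    }
    return v[0];
}
```

Note that this changes the order of the floating-point additions compared with a left-to-right scalar loop, so results can differ in the last bits for general inputs.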

Saturated add and subtract seems to be missing.

3

Do you have any guidelines how one would go and implement saturated addition and subtraction?

schrpe
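One possible starting point, sketched here in scalar C++ and not taken from xsimd, is to define the saturating semantics per lane and then map them onto intrinsics where they exist (SSE2, for instance, provides `_mm_adds_epu8` and `_mm_subs_epu8` for 8-bit lanes). The function names below are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Unsigned add: if the wrapped sum is smaller than an operand, it overflowed.
inline std::uint8_t sat_add_u8(std::uint8_t a, std::uint8_t b)
{
    const std::uint8_t sum = static_cast<std::uint8_t>(a + b);
    return sum < a ? std::numeric_limits<std::uint8_t>::max() : sum;
}

// Unsigned subtract: clamp at zero instead of wrapping around.
inline std::uint8_t sat_sub_u8(std::uint8_t a, std::uint8_t b)
{
    return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

// Signed add: compute in a wider type, then clamp to the 32-bit range.
inline std::int32_t sat_add_i32(std::int32_t a, std::int32_t b)
{
    const std::int64_t wide = static_cast<std::int64_t>(a) + b;
    const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
    const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
    if (wide < lo) return static_cast<std::int32_t>(lo);
    if (wide > hi) return static_cast<std::int32_t>(hi);
    return static_cast<std::int32_t>(wide);
}
```

The unsigned versions translate directly to a compare plus select on vector types; the widening approach for signed values is the simplest to verify but not necessarily the fastest in SIMD form.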

Neon not detected on armv7hl

As reported in #289, the NEON instruction set is not detected when building on armv7hl. A workaround is to `#define XSIMD_FORCE_ARM_INSTR_SET 70000000` before including any xsimd header.

JohanMabille

compilation errors on Raspberry Pi

8

I'm trying to use XTENSOR_USE_XSIMD in my project, which otherwise compiles and runs fine on a Raspberry Pi 3B+ with an up-to-date Raspbian Stretch. Using: - xtensor master -...

amahoneyLIT

Wrong size for char* overwrite

For the types specified in xsimd_avx_double.hpp, xsimd_sse_float.hpp and xsimd_sse_int32.hpp: the SSE method `store_aligned_int32(uint8_t* dst)` (and similar) stores using the function `_mm_storel_epi64`. As this is a batch of 4 values, and...

nick-dumas

exp / log / pow for complex<float>

2

`pow` of `complex` has some accuracy issues with AVX512.

JohanMabille

Benchmark prints a "neon" result (presumably SSE) on an x86 machine

3

~I think the title says it all :)~ When running "make/ninja xbenchmark" on my Haswell-based machine, a pair of "neon" rows is present in the table of timings, even though...

HadrienG2

problem compiling in Xcode ('C' does not refer to a value)

2

I tried to integrate xsimd into an Xcode project, but I get the following 2 errors in `xsimd_scalar.hpp`: `namespace detail { template <class C> inline C sign_complex_scalar_impl(const C& v) { using value_type =...`

pzoltowski

store_aligned second parameter should be batch<T, N> not simd_type

5

In order to call store_aligned / unaligned in a generic fashion we should make the second parameter also aware of the batch size not being the max batch size (e.g....

wolfv
