
Saturated add and subtract seem to be missing.

Open schrpe opened this issue 6 years ago • 3 comments

Do you have any guidelines on how one would go about implementing saturated addition and subtraction?

schrpe avatar Oct 10 '19 15:10 schrpe

+1. It could be very useful for an image processing library. Same question as Peter: do you have any guidelines? Regards, TR

ThomasRetornaz avatar Oct 19 '19 02:10 ThomasRetornaz

Probably making a free function like xsimd::sadd(a, b) would be the way to go.

Then you could implement it for the different architectures in the kernels as a xsimd::kernel::sadd(..).

And in the last step you can use the function which is implemented in the kernel inside the xsimd::sadd function.

For example, here is the kernel implementation for AVX int32 for add:

https://github.com/xtensor-stack/xsimd/blob/38eed51285edad0dd75b552c99cdd8178565e24d/include/xsimd/types/xsimd_avx_int32.hpp#L200-L207

Now, you could just copy-paste that part and call it sadd, for a start. There is some template magic to select the appropriate kernel. Your function might then look like:

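template <class X>
auto sadd(const xsimd::simd_base<X>& lhs, const xsimd::simd_base<X>& rhs) {
        // X is the concrete batch type; its traits give the element type and lane count
        using value_type = typename simd_batch_traits<X>::value_type;
        using kernel = detail::batch_kernel<value_type, simd_batch_traits<X>::size>;
        // lhs() and rhs() downcast to the concrete batch before dispatching to the kernel
        return kernel::sadd(lhs(), rhs());
}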
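For reference, here is a minimal scalar sketch of what a sadd/ssub kernel would need to compute per lane for integer types. It is not xsimd code: the names scalar_sadd and scalar_ssub are hypothetical and only illustrate the clamping semantics, which could serve as a generic fallback or as a test oracle. Where the hardware has saturating instructions (for example _mm_adds_epi8 for 8-bit lanes on SSE2), a kernel would presumably use those instead.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical scalar reference, not part of xsimd.
// Returns a + b clamped to the representable range of T.
template <class T>
T scalar_sadd(T a, T b)
{
    static_assert(std::is_integral<T>::value, "integer types only");
    using lim = std::numeric_limits<T>;
    // Check against the bounds before adding, so the sum itself never overflows.
    if (b > 0 && a > lim::max() - b) return lim::max();
    if (b < 0 && a < lim::min() - b) return lim::min();
    return static_cast<T>(a + b);
}

// Returns a - b clamped to the representable range of T.
template <class T>
T scalar_ssub(T a, T b)
{
    static_assert(std::is_integral<T>::value, "integer types only");
    using lim = std::numeric_limits<T>;
    // a - b > max  <=>  a > max + b (only possible when b is negative).
    if (b < 0 && a > lim::max() + b) return lim::max();
    // a - b < min  <=>  a < min + b (only possible when b is positive).
    if (b > 0 && a < lim::min() + b) return lim::min();
    return static_cast<T>(a - b);
}
```

For unsigned types the negative-b branches are never taken, so the same templates cover both the signed and unsigned cases.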

It would be great if you gave it a shot! We're happy to assist further :)

wolfv avatar Oct 19 '19 07:10 wolfv

I'll give it a try.

ThomasRetornaz avatar Nov 02 '19 10:11 ThomasRetornaz

sadd and ssub are now implemented for all supported targets.

serge-sans-paille avatar Oct 16 '22 06:10 serge-sans-paille