ndarray-linalg
Autovectorization on x86
I have a, b, and c of type Array1<u64>. a is a mutable reference, whereas b and c are shared references. I want to set each element of a to the product of the elements of b and c at the corresponding index. The implementation is straightforward and can be vectorized by the compiler. However, I noticed that the compiler only vectorizes the loop when I iterate over a, b, and c as slices, not when I iterate over them through .view().
This is the link to both implementations. Notice that nd_mul_u64_view is not vectorized, whereas nd_mul_u64_slice is.
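For readers without access to the link, here is a minimal sketch of what a slice-based version might look like. It is not the code from the linked implementations: the name mul_u64_slice is hypothetical, it uses only the standard library, and wrapping_mul is an assumption chosen so that overflow behaves the same in debug and release builds.

```rust
/// Element-wise product over plain slices: a[i] = b[i] * c[i].
/// Hypothetical helper for illustration, not the code from the linked page.
pub fn mul_u64_slice(a: &mut [u64], b: &[u64], c: &[u64]) {
    // Zipping the three slices iterates up to the shortest length and
    // needs no per-element bounds checks, which leaves the optimizer a
    // simple loop over contiguous memory.
    for ((out, &x), &y) in a.iter_mut().zip(b.iter()).zip(c.iter()) {
        // Assumption: wrap on overflow instead of panicking in debug builds.
        *out = x.wrapping_mul(y);
    }
}
```

With ndarray, such a function could be called by getting slices from the arrays first, for example with as_slice_mut() on a and as_slice() on b and c. Both return an Option that is Some only when the array is contiguous and in standard order, so the caller has to handle the None case.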