Bill Budge

Results: 10 comments by Bill Budge

ARM v7/v8 have:
- Vector Count Leading Sign Bits (VCLS)
- Vector Count Leading Zeros (VCLZ)
- Vector Count Set Bits (VCNT)
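The lane-wise semantics of these instructions can be sketched in scalar code. This is an illustrative model for 8-bit lanes, not the actual hardware or Wasm definition; the helper names are made up here:

```python
def clz8(x):
    """Count leading zero bits of an 8-bit value (VCLZ-style)."""
    n = 0
    for i in range(7, -1, -1):
        if x & (1 << i):
            break
        n += 1
    return n

def cls8(x):
    """Count consecutive bits matching the sign bit, excluding the
    sign bit itself (VCLS-style)."""
    sign = (x >> 7) & 1
    n = 0
    for i in range(6, -1, -1):
        if ((x >> i) & 1) != sign:
            break
        n += 1
    return n

def popcount8(x):
    """Count set bits (VCNT-style)."""
    return bin(x & 0xFF).count("1")

def lanewise(op, vec):
    """Apply a scalar op independently to each lane of a vector."""
    return [op(lane) for lane in vec]
```

For example, `lanewise(clz8, [0x01, 0x80, 0x00])` yields `[7, 0, 8]`.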

I'm in favor of this. We've implemented it in the V8 prototype.

These are pairwise additions, so for a 4-lane vector type, two source operands would form a single destination vector like this: [ src0[0] + src0[1], src0[2] + src0[3], src1[0] + src1[1], src1[2] + src1[3] ]
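The lane arrangement described above can be modeled with a small sketch (the `pairwise_add` helper is hypothetical, not the actual opcode name):

```python
def pairwise_add(src0, src1):
    """Pairwise add: sums of adjacent pairs of src0 fill the low lanes
    of the result; sums of adjacent pairs of src1 fill the high lanes."""
    low = [src0[i] + src0[i + 1] for i in range(0, len(src0), 2)]
    high = [src1[i] + src1[i + 1] for i in range(0, len(src1), 2)]
    return low + high
```

For a 4-lane type, `pairwise_add([1, 2, 3, 4], [5, 6, 7, 8])` yields `[3, 7, 11, 15]`.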

These can be composed to do the full reductions. The advantage of keeping them primitive (pairwise) is that a compiler will have more opportunity to schedule the instructions. If we...

Starting with a vector [x0, x1, x2, x3], a pairwise reduction with itself gives [x0 + x1, x2 + x3, x0 + x1, x2 + x3]; another pairwise reduction with itself leaves x0 + x1 + x2 + x3 in every lane.
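The two-step reduction above can be sketched as follows, using a hypothetical `pairwise_add` helper to model the pairwise semantics (adjacent pairs of the first operand fill the low lanes, adjacent pairs of the second fill the high lanes):

```python
def pairwise_add(src0, src1):
    # Low lanes: sums of adjacent pairs of src0; high lanes: same for src1.
    low = [src0[i] + src0[i + 1] for i in range(0, len(src0), 2)]
    high = [src1[i] + src1[i + 1] for i in range(0, len(src1), 2)]
    return low + high

v = [1.0, 2.0, 3.0, 4.0]            # [x0, x1, x2, x3]
step1 = pairwise_add(v, v)          # [x0+x1, x2+x3, x0+x1, x2+x3]
step2 = pairwise_add(step1, step1)  # every lane holds x0+x1+x2+x3
```

Here `step2` is `[10.0, 10.0, 10.0, 10.0]`: the full horizontal sum is composed from two primitive pairwise operations, each of which the compiler can schedule independently.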

OK, I see what you're saying. You're correct that floating-point operations are not associative. The general intent of SIMD is to give performance improvements for vector operations. If you...

The V8 value-type tests for float32x4 (link posted above) changed: https://codereview.chromium.org/1219943002/diff/230001/test/mjsunit/harmony/simd.js

Intel at least appears to have these: https://software.intel.com/en-us/node/583113

A concern I have is that f64x2 is not well supported on platforms other than Intel. Implementations would be forced to lower these opcodes to the equivalent scalar ops, probably...
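The lowering in question is straightforward but loses the parallelism; a minimal sketch of what lowering an f64x2 add to scalar ops amounts to (illustrative only, not V8's actual lowering):

```python
def f64x2_add(a, b):
    """Scalar lowering of a 2-lane f64 vector add: one scalar add per lane,
    as a platform without native f64x2 support would have to do."""
    return [a[0] + b[0], a[1] + b[1]]
```

With only two lanes, each vector op becomes just two scalar ops plus packing overhead, which is why the performance win is doubtful.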

Hi JF. Yes, f64x2 in general. We think it will be hard enough to get clear performance wins with f32x4, so we're somewhat skeptical about f64x2, especially on platforms like ARM.