Enable compiling arm/neon with MSVC for Windows on arm64
This patch enables building arm/neon with the MSVC compiler for the Windows on arm64 target. It contains the following changes:
- Replace the dispatcher mechanism with explicit function selection based on scalar types. This is required because the MSVC intrinsics use the same underlying type for multiple NEON vector types, so selecting a function by target vector type causes the wrong function to be called.
- Add a function to convert `Initializer_list<batch<T>>` to a NEON vector type, as MSVC provides no constructors for this operation.
- Identify NEON/NEON64 using MSVC-specific flags.
- Add a `_` prefix to the intrinsics wrapper functions. MSVC defines some intrinsics as macros, and without the prefix the wrapper function names get replaced by the preprocessor.
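To illustrate the first point, here is a minimal, portable sketch of selecting a function by scalar type when several vector names share one underlying type. It does not use the real NEON headers or the code in this patch: `raw128`, `scalar_tag`, `add_lanes`, `add`, `load4` and `lane` are hypothetical names chosen for the example, and `raw128` only stands in for the shared register type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the single register type behind several vector names.
struct raw128 { alignas(16) unsigned char bytes[16]; };

// Two vector names, one type: overloads on these would be redeclarations.
using f32x4 = raw128;
using s32x4 = raw128;
static_assert(std::is_same<f32x4, s32x4>::value, "aliases name the same type");

// Helpers for the sketch: move four 32-bit lanes in and out of the register.
template <class T>
raw128 load4(const T (&v)[4]) {
    static_assert(sizeof(T) == 4, "four 32-bit lanes expected");
    raw128 r;
    std::memcpy(r.bytes, v, sizeof(r.bytes));
    return r;
}

template <class T>
T lane(raw128 v, int i) {
    T x[4];
    std::memcpy(x, v.bytes, sizeof(x));
    return x[i];
}

// Stand-in for a typed intrinsic (a lane-wise add interpreted as T).
template <class T>
raw128 add_lanes(raw128 a, raw128 b) {
    T x[4], y[4];
    std::memcpy(x, a.bytes, sizeof(x));
    std::memcpy(y, b.bytes, sizeof(y));
    for (int i = 0; i < 4; ++i) x[i] = x[i] + y[i];
    raw128 r;
    std::memcpy(r.bytes, x, sizeof(r.bytes));
    return r;
}

// Explicit selection: the overload is chosen by the scalar type T,
// never by the (indistinguishable) vector type.
template <class T> struct scalar_tag {};

inline raw128 add_impl(raw128 a, raw128 b, scalar_tag<float>) {
    return add_lanes<float>(a, b);
}
inline raw128 add_impl(raw128 a, raw128 b, scalar_tag<std::int32_t>) {
    return add_lanes<std::int32_t>(a, b);
}

template <class T>
raw128 add(raw128 a, raw128 b) {
    return add_impl(a, b, scalar_tag<T>{});
}

// Small self-check exercising both scalar types on the same register type.
inline void self_check() {
    const float fa[4] = {1.5f, 2.0f, 3.0f, 4.0f};
    const float fb[4] = {0.5f, 1.0f, 1.0f, 1.0f};
    assert(lane<float>(add<float>(load4(fa), load4(fb)), 0) == 2.0f);

    const std::int32_t ia[4] = {1, 2, 3, 4};
    const std::int32_t ib[4] = {10, 20, 30, 40};
    assert(lane<std::int32_t>(add<std::int32_t>(load4(ia), load4(ib)), 3) == 44);
}
```

Because the caller names the scalar type (`add<float>` or `add<std::int32_t>`), the right implementation is picked even though both arguments have the same C++ type.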
Can you rebase your PR on the master branch please? This would make the diff easier to read (and would solve the conflicts).
Yes sure
The change might be a bit invasive. Let me know if you need help with the review.
Another approach to support MSVC would be to wrap the vector types in a custom class, so that types like float32x4_t and int32x4_t can be distinguished by the dispatcher. However, it would require many conversions from the wrapper class to the native type before passing values to intrinsic functions. These conversions can be done with a user-defined conversion operator, but I guess there will be a performance cost.
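For reference, a rough sketch of what that wrapper alternative could look like. This is not code from the PR: `native128`, `typed_vec`, `which` and `native_op` are hypothetical names, and `native128` merely stands in for the shared native register type.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the shared native register type.
struct native128 { alignas(16) unsigned char bytes[16]; };

// Hypothetical wrapper: the scalar type T and lane count N make each
// instantiation a distinct C++ type, even though the payload is identical.
template <class T, int N>
struct typed_vec {
    native128 value;
    explicit typed_vec(native128 v) : value(v) {}
    // User-defined conversion back to the native type for intrinsic calls.
    operator native128() const { return value; }
};

using wrapped_f32x4 = typed_vec<float, 4>;
using wrapped_s32x4 = typed_vec<std::int32_t, 4>;
static_assert(!std::is_same<wrapped_f32x4, wrapped_s32x4>::value,
              "wrappers are distinct types");

// Overloads on the wrappers are now distinguishable.
inline int which(wrapped_f32x4) { return 1; }
inline int which(wrapped_s32x4) { return 2; }

// Stand-in for an intrinsic that only accepts the native type.
inline native128 native_op(native128 v) { return v; }

// Each intrinsic call goes through a wrapper-to-native conversion,
// and each result has to be wrapped again.
inline wrapped_f32x4 roundtrip(wrapped_f32x4 v) {
    return wrapped_f32x4(native_op(v));
}
```

The conversions in `roundtrip` are the ones the cost concern is about; whether a compiler removes them entirely would need to be measured on the target toolchain.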
We cannot make such a change without setting up CI for that platform + arch combination. I'll try to prepare that.