Open
rootjalex
opened this issue 3 years ago
On the ARM backend, we should target the USDOT/SUDOT instructions for mixed-sign dot products, e.g. when compiling conv3x3 with accumulator type Int(32). LLVM exposes an intrinsic for USDOT.
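
For reference, here is a rough scalar sketch of what the vector form of USDOT computes, as I understand it: each 32-bit accumulator lane gets the sum of four unsigned-8-bit by signed-8-bit products added to it. The name `usdot_reference` is made up for illustration and is not a Halide or LLVM API; the sketch also assumes the accumulator does not overflow.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of vector USDOT (illustrative name only).
// a holds 16 unsigned 8-bit values, b holds 16 signed 8-bit values.
// Each of the 4 accumulator lanes adds the dot product of its own
// group of 4 consecutive (a, b) pairs.
// Assumes the 32-bit accumulation does not overflow.
std::array<int32_t, 4> usdot_reference(std::array<int32_t, 4> acc,
                                       const std::array<uint8_t, 16> &a,
                                       const std::array<int8_t, 16> &b) {
    for (int lane = 0; lane < 4; lane++) {
        for (int j = 0; j < 4; j++) {
            // Widen both operands to 32 bits before multiplying:
            // a is zero-extended, b is sign-extended.
            acc[lane] += static_cast<int32_t>(a[4 * lane + j]) *
                         static_cast<int32_t>(b[4 * lane + j]);
        }
    }
    return acc;
}
```

The pattern we would want to match in the backend is presumably this shape: a widening multiply of a UInt(8) vector by an Int(8) vector, reduced in groups of four into Int(32) lanes.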