the-algorithm
Consider using vectorization for floating point calculations
Describe the solution you'd like
Vectorizing the floating-point loops in this code with SIMD instructions (loop-level parallelism) could significantly speed up the FP calculations, depending on how much floating-point precision is needed. Vectorizing a reduction reorders the arithmetic, and floating-point addition is not associative, so results can differ slightly from the scalar version. I would recommend evaluating how much precision is required and, if there is room for small inaccuracies, enabling this compiler optimization for a potentially large speed increase. A paper with more on the topic can be found here: https://ieeexplore.ieee.org/document/234917
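To make the precision trade-off concrete, here is a minimal Python sketch. It is not code from this repository, and the function names `sequential_sum` and `lane_sum` are made up for illustration. It emulates how a vectorized reduction keeps one partial sum per SIMD lane and combines them at the end, which changes the order of the additions compared with a plain scalar loop.

```python
import math


def sequential_sum(values):
    """Add the values strictly left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values, lanes):
    """Emulate a SIMD-style reduction (hypothetical helper).

    Element i is accumulated into partial sum i % lanes, and the
    per-lane partial sums are added together at the end.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


# Input chosen so that rounding depends on the order of the additions.
values = [1e16, 1.0, -1e16, 1.0]

scalar_result = sequential_sum(values)       # order: ((1e16 + 1) - 1e16) + 1
vector_result = lane_sum(values, lanes=2)    # order: (1e16 - 1e16) + (1 + 1)
reference = math.fsum(values)                # correctly rounded sum

print(scalar_result, vector_result, reference)
```

With this input the two orderings give different answers, and here the lane-wise order happens to match the correctly rounded `math.fsum` result. In general neither order is guaranteed to be more accurate, so the useful check is whether differences of this size are acceptable for the values being computed.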
Thanks for taking this into consideration @austinhutchen.