
Vectorization of tanh

Open vedanuj opened this issue 7 years ago • 2 comments

Recently I submitted a PR to the PyTorch repo for a vectorized single-precision tanh implementation. It is a vectorized version of the Cephes math library's single-precision tanhf function. In the PyTorch setting, the implementation seemed faster than Sleef_tanhf8_u10 (I have posted some benchmark numbers in the PR here).

Are there any Sleef benchmarks that I can run to compare the implementations? If mine is faster, would you be open to a PR? Thanks!

vedanuj, May 12 '18 18:05

Hello @vedanuj, thank you for considering a contribution. I'm open to a PR if your implementation is good enough. However, I cannot yet confirm whether it is good enough to adopt. Please consider checking the following points.

  • Is it an alternative to Sleef_tanhf8_u10? If so, please make sure that its error is less than 1 ULP. It seems that you checked the correctness of your subroutine using a utility included in PyTorch, and the check took less than 1 second? That is not enough to confirm that the maximum error is below the specified bound. Please use tester2, included in the libm-tester directory. Of course, you can also use your own utility to check the maximum error.
  • Don't you have a double-precision implementation?
  • You also need to write the code using helper functions, like other functions in SLEEF.
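
To make the first point concrete, here is a minimal sketch of what "your own utility" for checking the maximum error could look like. It is not SLEEF's tester2 and does not reproduce its method: it assumes Python's double-precision `math.tanh` is an adequate reference for single precision, it measures error in units of the float32 spacing at the reference value, and it only samples the float32 bit patterns with a stride instead of covering them exhaustively. All function names (`to_f32`, `ulp_f32`, `max_ulp_error`, `rounded_reference`, `crude_pade`) are hypothetical and exist only for this illustration.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def ulp_f32(x: float) -> float:
    """Spacing between adjacent single-precision values near |x|."""
    x = abs(x)
    if x < 2.0 ** -126:            # zero and subnormals share a fixed spacing
        return 2.0 ** -149
    _, e = math.frexp(x)           # x = m * 2**e with 0.5 <= m < 1
    return 2.0 ** (e - 24)         # 24-bit significand


def max_ulp_error(candidate, stride=1 << 12):
    """Sample finite float32 inputs of both signs; return (worst error, input)."""
    worst, worst_x = 0.0, 0.0
    for bits in range(0, 0x7F800000, stride):   # non-negative finite patterns
        for sign in (0, 0x80000000):
            x = f32_from_bits(bits | sign)
            ref = math.tanh(x)                  # assumed-accurate reference
            err = abs(candidate(x) - ref) / ulp_f32(ref)
            if err > worst:
                worst, worst_x = err, x
    return worst, worst_x


def rounded_reference(x: float) -> float:
    """Stand-in for a good implementation: the reference rounded to float32."""
    return to_f32(math.tanh(x))


def crude_pade(x: float) -> float:
    """Deliberately inaccurate stand-in: a low-order rational approximation."""
    y = x * (27.0 + x * x) / (27.0 + 9.0 * x * x)
    return to_f32(max(-1.0, min(1.0, y)))


if __name__ == "__main__":
    for name, fn in (("rounded_reference", rounded_reference),
                     ("crude_pade", crude_pade)):
        worst, at = max_ulp_error(fn)
        print(f"{name}: max error {worst:.3f} ULP at x = {at!r}")
```

A smaller `stride` covers more inputs and takes proportionally longer; a strided sweep like this can only show that a bound is violated, not prove that it holds everywhere, which is why a check that finishes in under a second is weak evidence for a sub-1-ULP claim.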

I am now trying to implement 3.5-ULP versions of hyperbolic functions.

shibatch, May 13 '18 03:05