llvm-project correctly rounded mathematical functions?

The current C working draft [1, p. 392] has reserved names for correctly rounded functions (cr_exp, cr_log, cr_sin, ...).

We propose to provide such correctly rounded implementations for the three IEEE formats (binary32, binary64, binary128) and the "extended double" format (long double on x86_64).

These implementations will be correctly rounded for all rounding modes. For example, one could do the following to emulate interval arithmetic:

fesetround (FE_DOWNWARD);
y_lo = cr_exp (x_lo);
fesetround (FE_UPWARD);
y_hi = cr_exp (x_hi);

Users who want a fast implementation will call the exp/log/sin/... functions; users who want a correctly rounded function, and thus reproducible results (whatever the hardware, compiler or operating system), will use the cr_exp/cr_log/cr_sin/... functions. Our goal is nevertheless to get the best performance possible.

Our objective is to provide open-source implementations that can be integrated into the major mathematical libraries (GNU libc, Intel Math Library, AMD Libm, Red Hat Newlib, OpenLibm, Musl, llvm-libc, CUDA, ROCm).

Are the developers of ROCm interested in such functions? If so, we could discuss what the requirements for integration into ROCm would be in terms of license, table size, and allowed operations.

We have started to work on two functions (cbrt and acos), for which we provide presumably correctly rounded implementations (up to the knowledge of hard-to-round cases) [2].

Christoph Lauter, Jean-Michel Muller, Alexei Sibidanov, Paul Zimmermann

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf
[2] https://homepages.loria.fr/PZimmermann/CORE-MATH/

Jan 03 '22 16:01 zimmermann6

It's a real pleasure to hear from such a well-known and remarkable group! Please expect our response soon.

Jan 03 '22 18:01 b-sumner

My current understanding is that we would like to see this work succeed for AMD CPUs and GPUs. On the GPU side, how would you like to communicate?

Jan 07 '22 16:01 b-sumner

I'm not sure I understand what you mean by "communicate" in this context. I'm not a GPU expert. My colleague Vincenzo Innocenti did a few experiments on GPU with ROCm (and reported a few issues to you).

Jan 08 '22 07:01 zimmermann6

I think all the tools you need to develop a correctly rounded math library for HIP or OpenMP offload language applications are available. But I expect there may be information about GPU programming, HIP language details, runtime behavior, etc., that you may need that is not in the documentation. We can discuss that here if desired, or use another channel.

Jan 08 '22 17:01 b-sumner

Sorry if our initial message was not clear. We will provide generic implementations in the C language, but we will not deal with the integration into the different math libraries. However, if we get feedback on the routines already available at https://homepages.loria.fr/PZimmermann/CORE-MATH/, we can arrange things so that integration is as easy as possible, as long as it does not make integration into the other libraries harder.

Jan 09 '22 07:01 zimmermann6

Sorry for my confusion. I understand better now what is being proposed.

Jan 09 '22 17:01 b-sumner

Is there any chance you can integrate the code from https://gitlab.inria.fr/core-math/core-math/-/tree/master/src/binary32 (single precision), for example the powf code (https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary32/pow/powf.c)?

May 18 '22 09:05 zimmermann6

@zimmermann6 I'm afraid that is not likely. There is quite a willingness here to trade accuracy for performance. A correctly rounded library would be good as an alternative, but I currently don't have time to port the code to the GPU.

May 18 '22 14:05 b-sumner

OK, more details are available here: https://hal.inria.fr/hal-03721525

Sep 02 '22 10:09 zimmermann6

Thanks!

Sep 02 '22 15:09 b-sumner

Please find a new version of our analysis, "Accuracy of Mathematical Functions", updated to the latest available versions (5.4.0 for ROCm):

https://members.loria.fr/PZimmermann/papers/accuracy.pdf

Feb 14 '23 09:02 VinInn

We have updated our comparison:

https://members.loria.fr/PZimmermann/papers/accuracy.pdf

This is a new version with updated versions of the different libraries, and:

- the Microsoft math library is now included (it was clearly missing)
- we also added the FreeBSD math library
- we added some new C23 functions (acospi, cospi, ...) which are available in the Intel and FreeBSD math libraries

ROCm is updated to 5.6.0, with only minor differences with respect to the previous version.

Sep 21 '23 08:09 VinInn

We have updated our comparison:

https://members.loria.fr/PZimmermann/papers/accuracy.pdf

This is a new version with updated versions of the different libraries, new corner cases found, and the ARM Performance Library is now included.

There are no updates for ROCm in this version.

Feb 15 '24 14:02 VinInn

Thank you for the notice!

Feb 15 '24 16:02 b-sumner

Hi @zimmermann6, I will be closing this issue for now since there are no actionable items in sight. I recommend moving this discussion to ROCm Discussions (https://github.com/ROCm/ROCm/discussions), but please feel free to continue to provide follow-ups below. Thanks!

Jan 07 '25 19:01 tcgu-amd

Thanks, this is now https://github.com/ROCm/ROCm/discussions/4242

Jan 08 '25 07:01 zimmermann6