OpenCL-Docs Relax the required accuracy of tan(half)?

Since the working group has recently reduced the required accuracy of 1 / half, I'd like to request relaxing the required accuracy of tan(half).

My implementation fails bruteforce on tan(half) like this:

ERROR: tan: 2.248892 ulp error at 0x1.2e8p+14 (half 0x74ba) Expected: 0x1.edcp+3 (half 0x4bb7) Actual: 0x1.ee4p+3 (half 0x4bb9)

This 2.25 ULP error is the largest my implementation produces across all half inputs.
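For anyone reading the log line above, here is a small sketch of how the half bit patterns map to the hex-float values it prints. It relies only on Python's `struct` module and its `e` (binary16) format; the helper name `half_from_bits` is made up for illustration and is not part of the CTS.

```python
import struct

def half_from_bits(bits: int) -> float:
    """Reinterpret a 16-bit pattern as an IEEE binary16 value (returned as a Python float)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

# Bit patterns copied from the bruteforce log line.
inp = half_from_bits(0x74BA)
expected = half_from_bits(0x4BB7)
actual = half_from_bits(0x4BB9)

# For positive halves, adjacent bit patterns are adjacent representable values,
# so the integer difference counts how many half steps separate the two results.
steps_apart = 0x4BB9 - 0x4BB7

print("input   :", inp, inp.hex())
print("expected:", expected, expected.hex())
print("actual  :", actual, actual.hex())
print("half steps between expected and actual:", steps_apart)
```

The reported ULP error is measured against the exact result rather than the rounded expected value, which is why it is not a whole number even though the two halves are a whole number of steps apart.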

Consider these steps (0x1.2e8p+14 == 19360)

// Reduce argument
19360 - 12325 * pi / 2 = -0.064727747100832...

// Compute result
tan(-0.064727747100832...) = -0.064818295060042395...
-1 / -0.064818295060042395... = 15.4277430326373342225222...

// Exact result
tan(19360) = 15.4277430326373342225222...

// Rounding to half
round(-0.064727747100832...) = -0x1.09p-4
tan(-0x1.09p-4) = -0.0647876855794585216773... which rounds to -0x1.094p-4
-1 / -0x1.094p-4 = 15.4420358152686145146... which rounds to 0x1.ee4p+3 (see actual)

So, even with a correctly rounded reduced argument, a correctly rounded tan(), and a correctly rounded reciprocal, we end up with an error of 2.25 ULP.
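The steps above can be replayed with a short script. This is only a sketch of the arithmetic, not the actual implementation: it does each step in double precision and then rounds to half, which stands in for "correctly rounded" half operations. The helpers `round_to_half` and `half_ulp` are hypothetical names introduced here, and the quadrant count 12325 is taken from the steps above rather than computed.

```python
import math
import struct

def round_to_half(x: float) -> float:
    """Round a double to the nearest IEEE binary16 value (round-to-nearest-even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def half_ulp(x: float) -> float:
    """Size of one binary16 ULP at x; valid for values in the normal half range."""
    exponent = math.frexp(abs(x))[1] - 1  # floor(log2(|x|))
    return math.ldexp(1.0, exponent - 10)  # binary16 has 10 fraction bits

x = 19360.0   # 0x1.2e8p+14, the failing input
k = 12325     # number of pi/2 quadrants removed (odd, so tan(x) = -1 / tan(r))

# Each intermediate is computed in double and then rounded to half.
r = round_to_half(x - k * math.pi / 2)   # reduced argument
t = round_to_half(math.tan(r))           # tan of the reduced argument
actual = round_to_half(-1.0 / t)         # reciprocal for the odd quadrant

exact = math.tan(x)                      # double-precision reference
expected = round_to_half(exact)
ulp_error = abs(actual - exact) / half_ulp(exact)

print("reduced argument:", r.hex())
print("tan(reduced)    :", t.hex())
print("actual          :", actual.hex())
print("expected        :", expected.hex())
print("ULP error       :", ulp_error)
```

The reduced argument sits very close to the midpoint between two half values, so the rounding in the first step already costs almost half a ULP, and the reciprocal then amplifies the accumulated error.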

Since we would like to carry out as much of the tan(half) implementation in half precision as possible, we're requesting that the required accuracy of tan(half) be relaxed to 2.25 ULP.

Jan 27 '25 22:01 b-sumner

Independent verification of the algorithm: https://www.wolframalpha.com/input?i=tan%2819360%29+%2B+1%2Ftan%2819360+-+12325*Pi%2F2%29

Jan 27 '25 22:01 b-sumner

We are still discussing this internally and should have some feedback in a couple weeks.

Feb 03 '25 19:02 lakshmih

Qualcomm is okay with the accuracy for tan(half) being relaxed to 2.25 ULP.

Feb 11 '25 17:02 lakshmih

Are any opposed to making this change?

Feb 20 '25 22:02 b-sumner

I added the agenda label hoping to get final approval of this change.

May 19 '25 16:05 b-sumner

Linking with #1373, which is at least slightly related since it has to do with the required accuracy for the fp16 sqrt.

May 19 '25 21:05 bashbaug

I created https://github.com/KhronosGroup/OpenCL-Docs/pull/1387, which makes this relaxation and might help get the discussion moving.

Note, we would still need a CTS change as well.

Jun 05 '25 21:06 bashbaug