Tolerance updated in `RegressionTest` in `test/HD/sod/testme.py`
The default `tolerance=0` for the regression test causes the following error. I guess this is because floating-point numbers cannot be compared exactly.
Here is the error message from running `./testme.py -all` (in the directory `test/HD/sod`), following the instructions in README.md:
```
Traceback (most recent call last):
  File "/Users/alankard/coding/idefix/test/HD/sod/./testme.py", line 45, in <module>
    testMe(test)
  File "/Users/alankard/coding/idefix/test/HD/sod/./testme.py", line 28, in testMe
    test.nonRegressionTest(filename=name)
  File "/Users/alankard/coding/idefix/pytools/idfx_test.py", line 321, in nonRegressionTest
    assert error <= tolerance, bcolors.FAIL+"Error (%e) above tolerance (%e)"%(error,tolerance)+bcolors.ENDC
AssertionError: Error (7.303939e-16) above tolerance (0.000000e+00)
```
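For reference, a minimal sketch of the kind of change the title refers to, assuming `nonRegressionTest` in `pytools/idfx_test.py` accepts a `tolerance` keyword (its actual signature may differ); the value `1e-14` is an illustrative guess, not a vetted threshold:

```python
# Hypothetical change inside testMe() in test/HD/sod/testme.py:
# replace the implicit default tolerance=0 with a small, machine-precision-level
# bound. 1e-14 is chosen here only because it sits a couple of orders of
# magnitude above the 7.3e-16 error reported in the traceback.
test.nonRegressionTest(filename=name, tolerance=1e-14)
```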
The zero tolerance for this test is by design, and it passes the CI on the tested architecture (admittedly with some optimisations disabled for nvcc). Which configuration did you test?
I was trying this on my M3 MacBook Pro. I guess an exact match of floating-point results is only possible on machines with the same architecture?
This error is most probably because LLVM enables fmad (fused multiply-add) instructions on Apple Mx processors, which breaks the precision test. These tests should pass with absolute precision when fmad instructions are disabled (that is how the CI/CD runs are performed). So the first question is: how can one disable fmad optimisations in Apple LLVM? (I can't test this myself because I don't have a Mac with an Mx processor.)
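To illustrate why fmad changes the result bit-for-bit rather than merely "losing precision": a fused multiply-add feeds the exact product `a*b` into the addition with a single rounding at the end, whereas the unfused sequence rounds the product first. A small self-contained Python sketch of that difference (the exact rational arithmetic below only emulates the fused behaviour for illustration; it is not how the compiler implements it):

```python
from fractions import Fraction

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Unfused: a*b is rounded to double first (here to exactly 1.0), then c is added.
unfused = a * b + c

# Fused multiply-add: the exact product a*b enters the addition, with a single
# rounding at the end. Emulated here with exact rational arithmetic.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused)  # 0.0
print(fused)    # -8.673617379884035e-19  (= -2**-60)
```

(For what it's worth, clang generally accepts `-ffp-contract=off` to turn this contraction off; whether that is sufficient on Apple's toolchain would need testing on an Mx machine.)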
I'm reluctant to change the precision of these tests, as they have been life savers in the past.