
Tolerance updated in `RegressionTest` in `test/HD/sod/testme.py`

Open dutta-alankar opened this issue 1 year ago • 3 comments

The default `tolerance=0` for the regression test causes the following error. I guess this is because floating-point numbers cannot be compared exactly.

Here is the error message from running `./testme.py -all` in the directory `test/HD/sod`, as instructed in the README.md:

Traceback (most recent call last):
  File "/Users/alankard/coding/idefix/test/HD/sod/./testme.py", line 45, in <module>
    testMe(test)
  File "/Users/alankard/coding/idefix/test/HD/sod/./testme.py", line 28, in testMe
    test.nonRegressionTest(filename=name)
  File "/Users/alankard/coding/idefix/pytools/idfx_test.py", line 321, in nonRegressionTest
    assert error <= tolerance, bcolors.FAIL+"Error (%e) above tolerance (%e)"%(error,tolerance)+bcolors.ENDC
AssertionError: Error (7.303939e-16) above tolerance (0.000000e+00)

dutta-alankar avatar Aug 30 '24 20:08 dutta-alankar
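
For readers unfamiliar with this kind of check, here is a minimal sketch of what a zero tolerance implies. It is not the code in `pytools/idfx_test.py`; the helper names `max_abs_error` and `passes` are hypothetical, and the error metric (largest absolute difference) is an assumption made for illustration.

```python
def max_abs_error(reference, candidate):
    """Largest absolute difference between two equally sized sequences."""
    if len(reference) != len(candidate):
        raise ValueError("sequences differ in length")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def passes(reference, candidate, tolerance=0.0):
    """Hypothetical check: True when the error does not exceed the tolerance."""
    return max_abs_error(reference, candidate) <= tolerance


reference = [1.0, 0.125, 0.1]
# Same data, except the last value is off by one unit in the last place
# (2**-56 is the spacing between doubles near 0.1).
candidate = [1.0, 0.125, 0.1 + 2.0**-56]

print(passes(reference, candidate))                   # zero tolerance
print(passes(reference, candidate, tolerance=1e-15))  # loose tolerance
```

With `tolerance=0.0`, a single last-bit difference fails the check, so a zero tolerance is effectively a bitwise-reproducibility test rather than an accuracy test.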

The zero tolerance for this test is by design, and it passes the CI on the tested architecture (admittedly disabling some optimisations for nvcc). Which configuration did you test?

glesur avatar Sep 10 '24 08:09 glesur

I was trying this on my M3 MacBook Pro. I guess an exact match of floating-point results is possible only on machines with the same architecture?

dutta-alankar avatar Sep 10 '24 14:09 dutta-alankar

This error is most probably because LLVM enables fmad instructions on Apple Mx processors, which breaks the precision test. These tests should pass with absolute precision when fmad instructions are disabled (that's how the CI/CD runs are performed). So the first question is: how can one disable fmad optimisations in Apple LLVM? (I can't test this because I don't have a Mac with an Mx processor.)

I'm reluctant to change the precision of these tests, as they have been life savers in the past.

glesur avatar Sep 16 '24 08:09 glesur
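
To illustrate why a fused multiply-add can change the last bits of a result, here is a small sketch. Python does not fuse `a * b + c` itself, so the fused result is emulated with exact rational arithmetic followed by a single rounding; the function names `unfused` and `fused` and the input values are chosen purely for illustration and are unrelated to the idefix code.

```python
from fractions import Fraction


def unfused(a, b, c):
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c


def fused(a, b, c):
    """Emulate a fused multiply-add: compute a*b + c exactly, round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 in double precision,
# so the unfused result loses the small term entirely.
print(unfused(a, b, c))
# The fused version keeps the exact product until the final rounding.
print(fused(a, b, c))
```

The two results differ even though both are "correct" evaluations of the same expression, which is the kind of discrepancy a zero-tolerance comparison detects. On the question of disabling contraction: Clang provides the `-ffp-contract=off` option for this purpose, but whether passing it through the idefix build on Apple silicon makes the test pass has not been verified in this thread.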