Tests fail when using new `ifx` compiler
## Description
When compiling with the new `ifx` compiler, some tests are failing. This is preventing us from updating our Docker images as part of https://github.com/mdolab/docker/pull/266.
## Steps to reproduce issue
- Pull the docker container `mdolab/public:u22-intel-impi-latest-amd64-failed` (specifically `sha256:d318081f9bf4cc2c110685d4592fe6ee2b0f7799aff8cbefe87e138d04b224b7`)
- In `~/repos/cmplxfoil`, run `testflo -v -n 1 .`
## Current behavior
When running one of the failing tests, for example `testflo -n 1 -s -v ./tests/test_solver_class.py:TestDerivativesCST.test_alpha_sens`, on the docker container, the following error is printed:
```
mdolabuser@a6661930c69d:~/repos/cmplxfoil$ testflo -n 1 -s -v ./tests/test_solver_class.py:TestDerivativesCST.test_alpha_sens
######## Fitting CST coefficients to coordinates in /home/mdolabuser/repos/cmplxfoil/tests/n0012BluntTE.dat ########
Upper surface
L2 norm of coordinates in dat file versus fit coordinates: 0.0003504064468410577
Fit CST coefficients: [0.16601024 0.13092967]
Lower surface
L2 norm of coordinates in dat file versus fit coordinates: 0.0003504064499657832
Fit CST coefficients: [-0.16601024 -0.13092967]
+----------------------------------------------------------------------+
| Switching to Aero Problem: fc                                        |
+----------------------------------------------------------------------+
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
 LEXITFLAG TRUE, GOING TO 90...
./tests/test_solver_class.py:TestDerivativesCST.test_alpha_sens ... FAIL (00:00:5.03, 172 MB)
Traceback (most recent call last):
  File "/home/mdolabuser/repos/cmplxfoil/./tests/test_solver_class.py", line 369, in test_alpha_sens
    np.testing.assert_allclose(checkSensFD, actualSensCS, rtol=relTol, atol=absTol)
  File "/home/mdolabuser/.pyenv/versions/3.11.9/lib/python3.11/site-packages/numpy/testing/_private/utils.py", line 1504, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/home/mdolabuser/.pyenv/versions/3.11.9/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/mdolabuser/.pyenv/versions/3.11.9/lib/python3.11/site-packages/numpy/testing/_private/utils.py", line 797, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.002, atol=1e-05

Mismatched elements: 1 / 1 (100%)
Max absolute difference: 10005205.62637494
Max relative difference: inf
 x: array(10005205.626375)
 y: array(0.)

The following tests failed:
test_solver_class.py:TestDerivativesCST.test_alpha_sens

Passed:  0
Failed:  1
Skipped: 0

Ran 1 test using 1 processes
Wall clock time:   00:00:5.76
```
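For context, the failing assertion compares a finite-difference derivative (`checkSensFD`) against a complex-step derivative (`actualSensCS`). Below is a minimal sketch of that verification pattern; the function `f` is a hypothetical stand-in, not CMPLXFOIL's API, and the tolerances just mirror the `relTol`/`absTol` in the failure above:

```python
import numpy as np


def f(x):
    # Hypothetical stand-in for an analysis routine that accepts complex
    # input, e.g. lift coefficient as a function of alpha
    return np.sin(x) + x**2


def fd_derivative(func, x, h=1e-6):
    # Forward finite difference: accuracy limited by subtractive cancellation
    return (func(x + h) - func(x)) / h


def cs_derivative(func, x, h=1e-200):
    # Complex step: no subtraction, so the step can be tiny, but it only
    # works if every operation propagates the imaginary part correctly
    return np.imag(func(x + 1j * h)) / h


x0 = 0.5
np.testing.assert_allclose(fd_derivative(f, x0), cs_derivative(f, x0),
                           rtol=2e-3, atol=1e-5)
```

Notably, the complex-step value in the failure above is exactly zero, which is what you would see if the imaginary part were dropped somewhere in the compiled code.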
## Expected behavior
All tests should pass.
## Observations
- The build process looks to be a bit messy when using Intel. There seems to be a mix of compilers used: `gcc` for the interface C code, `ifx` for compiling the source, and `ifort` for the library. While this is probably not an issue, we should address it.
- Since this is an F77 code, it's possible that we have encountered an issue when using `ifx` that we have not yet hit on other repositories, since those are mostly >F90. The porting guide might help, but it states that F77 is completely implemented.
- I did some minor tests, and it seems that just removing any optimization, i.e., changing from `-O2` to `-O0` and rebuilding, makes the tests pass. This indicates that some optimization is affecting the code when using `ifx` that does not show up with `ifort` for some reason. I would appreciate it if someone could dig into this and identify the issue and possible solutions.
~~To add to the confusion, if you build cmplxfoil using the gcc config file on these images, the tests still fail, even though every part of the build process is done using a gcc compiler (either gcc or gfortran). See the attached log below.~~
~~Given that the tests don't fail on the GCC images, what is different between the intel and gcc images that could be causing gcc compiled code to behave differently?~~
Ignore the above; I forgot to `pip install` cmplxfoil again after rebuilding with gcc. After doing that, the tests pass, so this is just an Intel compiler issue.
This is tenuous at best, but it seems that, as of 2023, `ifx` was known to not work very well with complex numbers, at least from a performance perspective.
Also, some of the default floating-point arithmetic behaviour is different between `ifort` and `ifx`: `ifort` checks for NaNs by default, while `ifx` doesn't.
The NaN check might be a problem, since `-fp-model=fast` is the default. Even though I did not report it here, I believe I ran this with `precise` and `strict` at some point and it did not have any effect. We can test this.
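As a starting point for that test, here is a hedged sketch of a minimal probe: compile a trivial Fortran program with `ifx` under each `-fp-model` setting and see how a runtime division by zero behaves. The snippet and harness are illustrative only, not part of CMPLXFOIL:

```python
import os
import subprocess
import tempfile

# Minimal free-form Fortran program that divides zero by zero at runtime
SRC = """\
program nanprobe
  real :: a, b
  a = 0.0
  b = 0.0
  print *, a / b
end program nanprobe
"""

for model in ("fast", "precise", "strict"):
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "nanprobe.f90")
        exe = os.path.join(tmp, "nanprobe")
        with open(src, "w") as fh:
            fh.write(SRC)
        # -O0 keeps the compiler from folding the division at compile time
        subprocess.run(["ifx", "-O0", f"-fp-model={model}", "-o", exe, src],
                       check=True)
        run = subprocess.run([exe], capture_output=True, text=True)
        print(f"-fp-model={model}:", run.stdout.strip() or run.stderr.strip())
```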
The issue with `ifx`, complex numbers, and optimization seems like a more plausible explanation, since the code does work with optimization disabled (`-O0`). It's possible that a newer compiler version will fix this, but we should check the compiler release notes.
I did a very quick test with the latest image, checking these combinations of optimization flags and floating-point models (a rough sketch of how to script such a sweep follows the table). Everything passes at `-O0` and fails at `-O1` and `-O2`, regardless of the floating-point model.
| opt flag | `-fp-model=fast` | `-fp-model=precise` | `-fp-model=strict` |
|---|---|---|---|
| `-O0` | pass | pass | pass |
| `-O1` | fail | fail | fail |
| `-O2` | fail | fail | fail |
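For reference, here is roughly how such a sweep could be scripted. The `make FFLAGS=...` override, the reinstall step, and the reliance on testflo's exit code are all assumptions about the build interface, not CMPLXFOIL's documented mechanism:

```python
import itertools
import subprocess

# Hypothetical sweep over optimization levels and fp models; flags are
# assumed to be injectable via a FFLAGS-style make variable
results = {}
for opt, model in itertools.product(("-O0", "-O1", "-O2"),
                                    ("fast", "precise", "strict")):
    flags = f"{opt} -fp-model={model}"
    subprocess.run(["make", "clean"], check=True)
    subprocess.run(["make", f"FFLAGS={flags}"], check=True)
    subprocess.run(["pip", "install", "--no-deps", "."], check=True)
    # testflo should return a nonzero exit code when any test fails
    test = subprocess.run(["testflo", "-n", "1", "."])
    results[flags] = "pass" if test.returncode == 0 else "fail"

for flags, status in sorted(results.items()):
    print(f"{flags}: {status}")
```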
Damn, it's not that then. I'm also not fully convinced this is purely a complex-number issue, as some of the failing tests don't involve the complexified code.
@eirikurj, as a fallback, and to avoid holding up https://github.com/mdolab/docker/pull/266 any longer, could we just change the logic in the Intel config file so that we use `ifort -O2` if available and `ifx -O0` if not?
I've implemented this in https://github.com/mdolab/CMPLXFOIL/pull/33