
Failed tests when compiling with openmp

Open jalvesz opened this issue 10 months ago • 7 comments

Description

I built and ran the tests with OpenMP enabled by adding the -fopenmp flag:

cmake -B build -G Ninja -DBUILD_TESTING=on -DCMAKE_Fortran_FLAGS=-fopenmp -DCMAKE_MAXIMUM_RANK:String=4 -DCMAKE_BUILD_TYPE=Release -DCMAKE_Fortran_COMPILER=gfortran
cmake --build build
ctest --test-dir build/test

several of the tests failed:

86% tests passed, 11 tests failed out of 77

Label Time Summary:
quadruple_precision    =   0.17 sec*proc (2 tests)

Total Test time (real) =  12.68 sec

The following tests FAILED:
         12 - chaining_maps (SEGFAULT)
         13 - open_maps (SEGFAULT)
         14 - maps (SEGFAULT)
         15 - intrinsics (Failed)
         30 - linalg_pseudoinverse (Failed)
         38 - blas_lapack (Failed)
         43 - sorting (Exit code 0xc0000374)
         47 - mean (Failed)
         59 - string_intrinsic (Failed)
         64 - string_to_number (Failed)
         69 - simps (Failed)

I wonder if one of the CI jobs should include OpenMP in order to catch such behaviour early?

Expected Behaviour

Should pass

Version of stdlib

master

Platform and Architecture

Windows / gfortran 14.2.0

Additional Information

No response

jalvesz avatar Mar 29 '25 19:03 jalvesz

Strange that it failed on some procedures like blas_lapack or sorting. I agree that we should include OpenMP in at least one of the CI jobs.

jvdp1 avatar Mar 29 '25 19:03 jvdp1

I looked at the intrinsics tests and saw that they fail for sum and dot_product with xdp. There seems to be a tolerance issue. For instance, if I print the tolerance and relative errors here

https://github.com/fortran-lang/stdlib/blob/60d0a769216322243e28a63b92ed7668d2df80d5/test/intrinsics/test_intrinsics.fypp#L213C1-L218C87

adding a print *, '${t}$ dot err:', tolerance, err(1:3)

I get without openmp: real(xdp) dot err: 1.08420217248550443401E-0017 3.25260651745651330202E-0019 0.00000000000000000000 0.00000000000000000000

With openmp: real(xdp) dot err: 1.08420217248550443401E-0017 2.22044604925031308085E-0016 0.00000000000000000000 5.55111512312578270212E-0016

For the latter, the errors seem to be curiously close to epsilon(0.d0) = 2.220446049250313E-016. I'm intrigued; I wonder whether the other tests might be suffering from something similar.

jalvesz avatar Apr 02 '25 19:04 jalvesz
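
One way to read the numbers above is to compare them with the machine epsilons of the two kinds involved. The following is only an illustrative sketch, not code from the stdlib test suite. It assumes that xdp corresponds to selected_real_kind(18), which gives the 80-bit extended kind with gfortran on x86. It prints ratios for inspection and does not establish why the OpenMP build loses precision.

```fortran
program demo_epsilon
    use, intrinsic :: iso_fortran_env, only: dp => real64
    implicit none
    ! Assumption: xdp is obtained as selected_real_kind(18). With gfortran on
    ! x86 this is the 80-bit extended kind; other platforms may map it to a
    ! different kind, or to a negative value (then this does not compile).
    integer, parameter :: xdp = selected_real_kind(18)
    ! Values copied from the output quoted in the comment above.
    real(xdp), parameter :: tolerance = 1.08420217248550443401e-17_xdp
    real(xdp), parameter :: err_omp   = 2.22044604925031308085e-16_xdp

    print *, 'epsilon(dp)              =', epsilon(1.0_dp)
    print *, 'epsilon(xdp)             =', epsilon(1.0_xdp)
    print *, 'tolerance / epsilon(xdp) =', tolerance / epsilon(1.0_xdp)
    print *, 'err_omp   / epsilon(dp)  =', err_omp / real(epsilon(1.0_dp), xdp)

    ! For IEEE double precision, epsilon is exactly 2**(-52).
    if (epsilon(1.0_dp) /= 2.0_dp**(-52)) error stop 'unexpected epsilon for real64'
end program demo_epsilon
```

If the last ratio prints as 1, the first OpenMP error is exactly one double-precision epsilon, which would be consistent with some intermediate result being rounded to double precision.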

Yes, unfortunately I also noted this a while ago:

https://github.com/fortran-lang/fpm/blob/7535cab6efc89dd5a294f0d9643b5eebd6b237f0/src/fpm_meta.f90#L139-L142

I have never had time to dig into the issue, though.

I don't use OpenMP much, but I believe that whenever there is a static (save) variable somewhere, it must be declared THREADPRIVATE; otherwise all threads will write to it, causing unpredictable behavior.

perazz avatar Apr 08 '25 07:04 perazz
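
As a minimal sketch of the THREADPRIVATE idea mentioned above: counter_mod, last_id and next_id are hypothetical names made up for this illustration and do not come from stdlib or fpm. The module variable is implicitly saved, and the directive gives each thread its own copy.

```fortran
module counter_mod
    implicit none
    private
    public :: next_id

    ! Module variables are implicitly saved, so without the directive below
    ! every thread would update the same storage.
    integer :: last_id = 0
    !$omp threadprivate(last_id)

contains

    ! Returns 1, 2, 3, ... counted separately on each thread.
    integer function next_id() result(id)
        last_id = last_id + 1
        id = last_id
    end function next_id

end module counter_mod

program demo_threadprivate
    use counter_mod, only: next_id
    implicit none
    integer :: i, id
    logical :: ok

    ok = .true.
    !$omp parallel private(i, id) reduction(.and.: ok)
    do i = 1, 1000
        id = next_id()
    end do
    ! Each thread owns its copy of the counter, so each one ends at 1000.
    ok = (id == 1000)
    !$omp end parallel

    if (.not. ok) error stop 'counter was shared between threads'
    print *, 'each thread counted 1000 calls'
end program demo_threadprivate
```

Built with gfortran -fopenmp, removing the threadprivate line turns the increments into a data race, and the final check is then likely to fail when more than one thread runs. Without -fopenmp the directives are plain comments and the program runs serially.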

On a different machine (without the hash_functions tests, #976) I got "only" the following failures when using OpenMP (here using GNU from msys2 instead of equation.com)

96% tests passed, 3 tests failed out of 73

Label Time Summary:
quadruple_precision    =   2.32 sec*proc (2 tests)

Total Test time (real) =  96.33 sec

The following tests FAILED:
         37 - sorting (SEGFAULT)
         60 - filesystem (Failed)
         63 - subprocess (Failed)

running: ctest --test-dir build/test --rerun-failed --output-on-failure

1/3 Test #37: sorting ..........................***Exception: SegFault 10.68 sec
# Testing: sorting
  Starting char_ord_sorts ... (1/22)
  Starting string_ord_sorts ... (2/22)
  Starting bitset_large_ord_sorts ... (3/22)
  Starting bitset_64_ord_sorts ... (4/22)
  Starting int_radix_sorts ... (5/22)
  Starting real_radix_sorts ... (6/22)
  Starting int_sorts ... (7/22)
  Starting char_sorts ... (8/22)
  Starting string_sorts ... (9/22)
  Starting bitset_large_sorts ... (10/22)
  Starting bitset_64_sorts ... (11/22)
 ORD_SORT did not sort String Decrease.
 i =                     1
  Starting int_sort_indexes_default ... (12/22)
string_dummy(i-1:i) =
 ORD_SORT did not sort Bitset Random.
 i =                   235
bitset64_dummy(i-1:i) = 0000000000000000000000000000000000000000000000000000011110000110 0000000000000000000000000000000000000000000000000000000011101100
  Starting char_sort_indexes_default ... (13/22)
  Starting string_sort_indexes_default ... (14/22)
 reverse + work ORD_SORT did not sort Bitset Random.
 i =                     5
bitset64_dummy(i-1:i) = 0000000000000000000000000000000000000000000000000000111101000101 0000000000000000000000000000000000000000000000000000111101000111
  Starting bitset_large_sort_indexes_default ... (15/22)
 reverse + work ORD_SORT did not sort Bitset Decrease.
 i =                  2048
 SORT did not sort Bitset Decrease.
 i =                     1
bitsetl_dummy(i-1:i) = 00000000000000000000111111111111 00000000000000000000111111111110
bitset64_dummy(i-1:i) = 0000000000000000000000000000000000000000000000000000101000010001 0000000000000000000000000000000000000000000000000000000001100001
  Starting bitset_64_sort_indexes_default ... (16/22)
 RADIX_SORT did not sort Blocks.
 i =                    31
  Starting int_sort_indexes_low ... (17/22)
dummy(i-1:i)     83      0
  Starting char_sort_indexes_low ... (18/22)
 reverse ORD_SORT did not sort Bitset Random.
 i =                   537
bitset64_dummy(i-1:i) = 0000000000000000000000000000000000000000000000000000110111100111 0000000000000000000000000000000000000000000000000000110111100110
 reverse + work ORD_SORT did not sort Char. Decrease.
 i =                     1
       ... bitset_64_ord_sorts [FAILED]
  Message: Condition not fullfilled
char_dummy(i-1:i) =  pppp pppo
  Starting bitset_large_sort_indexes_low ... (20/22)
 SORT_INDEX did not sort Bitset Decrease.
 i =                     3
  Starting bitset_64_sort_indexes_low ... (21/22)
 reverse SORT did not sort Bitset Decrease.
 i =                     1
  Starting int_ord_sorts ... (22/22)
bitset64_dummy(i-1:i) = 0000000000000000000000000000000000000000000000000000101000110100 0000000000000000000000000000000000000000000000000000011011001111
bitsetl_dummy(i-1:i) = 0000000000000000000000000000000000000000000000000000111111111111 0000000000000000000000000000000000000000000000000000111111111110
       ... bitset_64_sort_indexes_default [FAILED]
  Message: Condition not fullfilled
  Starting string_sort_indexes_low ... (19/22)
       ... bitset_64_sorts [FAILED]
  Message: Condition not fullfilled
 ORD_SORT did not sort Blocks.
 i =                  2436
dummy(i-1:i)  46123  46124
 SORT_INDEX did not sort Blocks.
 i =                   256
a(index_low(i-1:i)   3805   3804
 SORT_INDEX did not sort Char. Decrease.
 i =                  4806
char_dummy(i-1:i) onkg dilo
       ... char_sort_indexes_default [FAILED]
  Message: Condition not fullfilled
 reverse RADIX_SORT did not sort Blocks.
 i =                   427
dummy(i-1:i)  65109  65108
       ... int_radix_sorts [FAILED]
  Message: Condition not fullfilled
       ... bitset_64_sort_indexes_low [PASSED]
 SORT_INDEX did not sort Blocks.
 i =                     2
a(index_default(i-  65534  65533
 reverse + work ORD_SORT did not sort Char. Decrease.
 i =                     1
char_dummy(i-1:i) =  afnn cfpi
       ... char_ord_sorts [FAILED]
  Message: Condition not fullfilled
 SORT did not sort Blocks.
 i =                  8437
dummy(i-1:i)   4709   4708
 SORT_INDEX did not sort String Decrease.
 i =                   229
string_dummy(i-1:
 SORT_INDEX did not sort Char. Decrease.
 i =                  7427
       ... string_sort_indexes_default [FAILED]
  Message: Condition not fullfilled
char_dummy(i-1:i) enme enif
       ... char_sort_indexes_low [FAILED]
  Message: Condition not fullfilled
 reverse + work ORD_SORT did not sort Blocks.
 i =                  2889
dummy(i-1:i)   8638  35054
       ... real_radix_sorts [PASSED]
 reverse + work ORD_SORT did not sort String Decrease.

    Start 60: filesystem
2/3 Test #60: filesystem .......................***Failed    0.65 sec
# Testing: filesystem
  Starting fs_is_directory_dir ... (1/2)
  Starting fs_is_directory_file ... (2/2)
       ... fs_is_directory_file [FAILED]
  Message: Cannot delete test file: File cannot be deleted
       ... fs_is_directory_dir [PASSED]
1 test(s) failed!
ERROR STOP

Error termination. Backtrace:
#0  0xd7e05dac in ???
#1  0xd7d819d1 in ???
#2  0xd7c7ed5b in ???
#3  0x14962cb5 in ???
#4  0x14962d01 in ???
#5  0x14961318 in __tmainCRTStartup
        at D:/M/B/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:259
#6  0x14961425 in mainCRTStartup
        at D:/M/B/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:179
#7  0xadde259c in ???
#8  0xaf12af37 in ???
#9  0xffffffff in ???

    Start 63: subprocess
3/3 Test #63: subprocess .......................   Passed    1.66 sec

33% tests passed, 2 tests failed out of 3

Total Test time (real) =  13.24 sec

The following tests FAILED:
         37 - sorting (SEGFAULT)
         60 - filesystem (Failed)

jalvesz avatar Apr 11 '25 16:04 jalvesz

Regarding the filesystem tests, it seems it may be enough to ensure that the test file name is different for each thread.

perazz avatar Apr 11 '25 19:04 perazz
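
A sketch of the per-thread file name idea, under assumptions: thread_files, thread_file_name and the "test_file_" prefix are hypothetical and are not taken from the stdlib filesystem tests. The `!$` sentinel keeps the omp_lib lines out of non-OpenMP builds.

```fortran
module thread_files
    implicit none
contains
    ! Builds a name such as "test_file_3.txt" from a thread index.
    function thread_file_name(tid) result(name)
        integer, intent(in) :: tid
        character(len=:), allocatable :: name
        character(len=32) :: buffer
        write(buffer, '(a,i0,a)') 'test_file_', tid, '.txt'
        name = trim(buffer)
    end function thread_file_name
end module thread_files

program demo_unique_test_file
    !$ use omp_lib, only: omp_get_thread_num
    use thread_files, only: thread_file_name
    implicit none
    integer :: tid, unit, ios
    logical :: ok

    ok = .true.
    !$omp parallel private(tid, unit, ios) reduction(.and.: ok)
    tid = 0
    !$ tid = omp_get_thread_num()

    ! Every thread creates and deletes a file that no other thread touches.
    open(newunit=unit, file=thread_file_name(tid), status='replace', &
         action='write', iostat=ios)
    ok = (ios == 0)
    if (ok) then
        write(unit, '(a)') 'scratch data'
        close(unit, status='delete', iostat=ios)
        ok = (ios == 0)
    end if
    !$omp end parallel

    if (.not. ok) error stop 'a thread could not create or delete its file'
    print *, 'all threads created and deleted their own file'
end program demo_unique_test_file
```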

> I don't use OpenMP much, but I believe that whenever there is a static (save) variable somewhere, it must be declared THREADPRIVATE; otherwise all threads will write to it, causing unpredictable behavior.

As you mention, the problem can arise only if the saved variable (which can be a module variable, since those are saved by design) is written to; reading it causes no issue. But as a general rule, given the importance of multithreading in HPC nowadays, the thread-safety status of every stdlib routine should be documented: which ones are thread-safe and which ones are not.

PierUgit avatar Apr 14 '25 12:04 PierUgit

> Regarding the filesystem tests, it seems it may be enough to ensure that the test file name is different for each thread.

I would have thought it better to have the deletion executed by a single thread, for example by adding !$omp single where appropriate, no?

jalvesz avatar Apr 15 '25 17:04 jalvesz
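
A sketch of the !$omp single alternative, again under assumptions: shared_file, create_and_delete and the file name are hypothetical and not part of the stdlib tests. Only one thread of the team performs the file operations, and the implicit barrier at the end of the single construct makes the others wait for it.

```fortran
module shared_file
    implicit none
contains
    ! Creates a file and deletes it again; returns .true. on success.
    logical function create_and_delete(filename) result(ok)
        character(len=*), intent(in) :: filename
        integer :: unit, ios
        open(newunit=unit, file=filename, status='replace', &
             action='write', iostat=ios)
        ok = (ios == 0)
        if (.not. ok) return
        close(unit, status='delete', iostat=ios)
        ok = (ios == 0)
    end function create_and_delete
end module shared_file

program demo_single_delete
    use shared_file, only: create_and_delete
    implicit none
    character(len=*), parameter :: filename = 'test_file_shared.txt'
    logical :: ok, exists

    ok = .false.
    !$omp parallel shared(ok)
    ! Exactly one thread runs the block below.
    !$omp single
    ok = create_and_delete(filename)
    !$omp end single
    !$omp end parallel

    inquire(file=filename, exist=exists)
    if (.not. ok .or. exists) error stop 'file was not created and deleted cleanly'
    print *, 'file created and deleted by a single thread'
end program demo_single_delete
```

Compared with per-thread file names, this keeps a single test file but serializes the file operations.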