
GLMAT cases fail strict MPI checking

Open mcgratta opened this issue 5 years ago • 6 comments

If you invoke the following script on blaze or burn:

source /opt/intel20/parallel_studio_xe_2020/psxevars.sh

and then add -check_mpi to the impi_intel_linux_64_db target line:

LFLAGS = -static-intel -check_mpi

and then run Pressure_Solver/tunnel_demo_glmat.fds, the case fails with an ERROR. I fixed a few of these in main.f90, but I cannot tell whether the remaining errors come from the MKL solver itself or from the initialization of the solver.

mcgratta avatar Sep 05 '20 21:09 mcgratta

There is a size mismatch in an MPI_Gatherv operation called within MKLMPI_Gatherv, an MKL-internal routine used by the cluster solver. The error occurs in the first call to cluster_sparse_solver (the symbolic factorization phase). I posted an issue on the Intel MKL users forum asking them to look into it.

marcosvanella avatar Sep 07 '20 15:09 marcosvanella

OK, there are other issues involving calls made from geom.f90. Run some geom cases that don't involve MKL and look for calls where the same send buffer is sent to multiple recipients; that is not strictly OK under MPI checking. For a fix, look in main.f90 and search for PRESSURE_ZONE, which is broadcast to its neighbors via MPI_BCAST. I will put the communicator array in cons.f90 so that geom.f90 can access it.

mcgratta avatar Sep 07 '20 16:09 mcgratta

I made the array of MPI communicators accessible to geom.f90.

mcgratta avatar Sep 07 '20 16:09 mcgratta

Thanks, I'll look into this.

marcosvanella avatar Sep 07 '20 18:09 marcosvanella

Kevin, I updated the geom scalar unknown exchange in PR #8736. Try it now on your side.

marcosvanella avatar Sep 08 '20 19:09 marcosvanella

Works. Now we'll wait to see what Intel says about MKL.

mcgratta avatar Sep 08 '20 20:09 mcgratta