Configuration problems with 1.7.1 and not 1.7.0

Open drwells opened this issue 3 years ago • 7 comments

Perhaps related to #3357 (hi @amneetb)

At Amneet's request I checked out 1.7.1 and there is some issue with it detecting a system installation of MPI (i.e., I want to use /usr/include/mpi.h). I see some interesting lines in config.log:

configure:48297: checking for MPI_Init in -lmpi
configure:48322: mpic++ -o conftest -Wno-implicit-fallthrough   conftest.cpp -lmpi  -L/lib  >&5
configure:48322: $? = 0
configure:48331: result: yes
configure:48338: result: Found valid MPI installation...
configure:48639: result: Could not find MPI header <mpi.h>...

I don't really understand the logic but the check in acsm_mpi.m4 requires that MPI_INCLUDES_PATH get set at some point - it doesn't look like that happens, which causes MPI detection to fail at that point.

This issue doesn't exist with 1.7.0.

drwells avatar Aug 01 '22 15:08 drwells

Could you double-check that you've done a recursive init on the submodules? #3357 looks like it might have just been a mismatch there.
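For reference, a recursive submodule refresh can be sketched as below. The helper names update_submodules and stale_submodules are hypothetical, not part of libMesh; the git subcommands themselves are standard. This only applies to a git checkout, not to an unpacked release tarball.

```shell
# Bring every nested submodule to the commit recorded by its parent repository.
update_submodules() {
    git submodule sync --recursive &&
    git submodule update --init --recursive
}

# Filter `git submodule status` output (read from stdin) down to problem entries:
# a leading '-' means the submodule is not initialized, a leading '+' means its
# checked-out commit differs from the recorded one.
stale_submodules() {
    grep -E '^[-+]' || true
}

# Usage, from the top of the checkout:
#   update_submodules
#   git submodule status --recursive | stale_submodules
```

If the last command prints nothing, every submodule matches what the superproject records.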

> the check in acsm_mpi.m4 requires that MPI_INCLUDES_PATH get set at some point - it doesn't look like that happens

That line is inside the [test -n "$MPI_LIBS_PATH" -a -n "$MPI_INCLUDES_PATH"] condition's true block, isn't it? So it should never be invoked unless we have something set there, either from your environment or from acsm_compiler_control_args.m4 (which looks for $MPI in your environment, --with-mpi, or --with-mpi-include, IIRC in the opposite order).
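To make that guard concrete, here is a minimal shell sketch of the same test outside of autoconf. The function name mpi_paths_set is hypothetical; only the two variable names come from the condition quoted above, and the real macro does more than this.

```shell
# True only when both variables are non-empty, mirroring the quoted
# `test -n "$MPI_LIBS_PATH" -a -n "$MPI_INCLUDES_PATH"` guard.
# Written with && because the -a operator of `test` is obsolescent in POSIX.
mpi_paths_set() {
    [ -n "${MPI_LIBS_PATH:-}" ] && [ -n "${MPI_INCLUDES_PATH:-}" ]
}

# Usage: report which branch of the guard would be taken in the current shell.
#   if mpi_paths_set; then echo "true block entered"; else echo "true block skipped"; fi
```

Running the usage line in the same shell that invokes configure shows whether the environment alone is enough to enter the true block.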

roystgnr avatar Aug 04 '22 21:08 roystgnr

Yup - if I try things on branch_v1.7.1 then things work with updated submodules. They do not work in the release tarball.

drwells avatar Aug 08 '22 15:08 drwells

It's possible the directory that I ran make dist in was not recursively up-to-date? Our buildbox that usually makes the release tarballs currently has an error, so I made them by hand. I can try again...

jwpeterson avatar Aug 08 '22 15:08 jwpeterson

OK, I deleted the original 1.7.1 tarballs and regenerated/reuploaded them after making sure all the submodules were recursively up-to-date. The md5 sums of the files changed, so presumably something is actually different this time :grimacing:

a86901c65f640d841b5a63289ea93840  (old libmesh-1.7.1.tar.bz2)
4dcf325975db5e60abe1384ab5fb8542  (old libmesh-1.7.1.tar.gz)
0fc304be568db558f4f6256771823799  (old libmesh-1.7.1.tar.xz)
39acfb2492a4e0563226191ae98b10f2  (new libmesh-1.7.1.tar.bz2)
4af9ee0bebfb67f9d944d9e0585af068  (new libmesh-1.7.1.tar.gz)
0cb49ffede7a1bef572dfa431aef922f  (new libmesh-1.7.1.tar.xz)
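A quick way to tell which tarball a local download corresponds to is to compare its checksum against the list above. This is a minimal sketch assuming GNU md5sum is available (on macOS the equivalent tool is md5); the helper name verify_md5 is hypothetical.

```shell
# Succeeds when the MD5 digest of file $1 equals the expected hex string $2.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Usage with the regenerated .tar.gz sum listed above:
#   verify_md5 libmesh-1.7.1.tar.gz 4af9ee0bebfb67f9d944d9e0585af068 \
#       && echo "regenerated tarball" || echo "old or different tarball"
```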

jwpeterson avatar Aug 08 '22 15:08 jwpeterson

I also just tested running ./configure from one of the new tarballs and it seems to work fine for me... so hopefully this fixes the problem for you as well, and sorry about the mixup.

jwpeterson avatar Aug 08 '22 15:08 jwpeterson

This still doesn't work with my setup and I get the same problem on the development branch too. Here's how I call configure:

set -e

export CC=mpicc
export CXX=mpic++
export FC=mpifort
export F77=mpifort

export PETSC_DIR=$HOME/Applications/petsc-3.7.7/x86_64-debug

unset SLEPC_DIR

mkdir -p objs-debug
cd objs-debug
../configure                                   \
    CCFLAGS="-Wno-implicit-fallthrough"        \
    CXXFLAGS="-Wno-implicit-fallthrough"       \
  --prefix=$HOME/Applications/libmesh-dev.g/   \
  --with-methods=dbg                           \
  --enable-triangle                            \
  --enable-exodus=yes                          \
  --enable-timestamps=no                       \
  --disable-glibcxx-debugging                  \
  --enable-poly2tri=no                         \
  --disable-openmp                             \
  --disable-perflog                            \
  --disable-pthreads                           \
  --disable-tbb                                \
  --disable-trilinos                           \
  --disable-boost                              \
  --with-thread-model=none                     \
  --disable-reference-counting                 \
  --disable-slepc                              \
  --disable-strict-lgpl                        \
  --disable-vtk                                \
  --disable-eigen                              \
  --with-metis=PETSc                           \
  --disable-deprecated                         \
  --disable-hdf5

make -j4
make install

and here's the log (also config.log)

Things work correctly on my Mac - I suspect the MPI detection is still getting confused over paths (I have /usr/include/mpi.h and /usr/bin/mpic++ - i.e., it's not in some custom directory).
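One way to confirm which directory actually holds mpi.h on the failing machine is sketched below. The function name find_mpi_header is hypothetical and the directories in the usage line are only examples; nothing here comes from libMesh's own detection logic.

```shell
# Print the first directory among the arguments that contains mpi.h.
# Returns non-zero if none of them does.
find_mpi_header() {
    for dir in "$@"; do
        if [ -f "$dir/mpi.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage:
#   find_mpi_header /usr/include /usr/local/include || echo "mpi.h not found"
```

Comparing the printed directory with the paths that appear in config.log may show where the detection and the real installation disagree.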

drwells avatar Aug 18 '22 16:08 drwells

> and I get the same problem on the development branch too

If branch_v1.7.1 is working but devel is failing, is it something you can bisect? I'm really not sure how to debug this remotely.
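A bisect along those lines could be automated roughly as follows. This is a sketch under assumptions: the helper bisect_status and the wrapper script check_mpi.sh are hypothetical, and the only project-specific detail used is the "Could not find MPI header" message from the config.log excerpt earlier in this thread.

```shell
# Map a config.log to the exit codes `git bisect run` understands:
#   0   = good (the failure message is absent)
#   1   = bad  (the failure message is present)
#   125 = this commit cannot be tested, so skip it
bisect_status() {
    log=$1
    [ -f "$log" ] || return 125
    if grep -q 'Could not find MPI header' "$log"; then
        return 1
    fi
    return 0
}

# Usage (check_mpi.sh is a hypothetical script that refreshes submodules,
# runs configure in a scratch directory, then calls bisect_status on the log):
#   git bisect start devel branch_v1.7.1    # bad revision first, then good
#   git bisect run ./check_mpi.sh
#   git bisect reset
```

Returning 125 for commits that never produce a config.log keeps unrelated build breakage from being blamed for the MPI regression.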

roystgnr avatar Aug 25 '22 17:08 roystgnr