
`hiop~mpi` includes MPI headers

Open cameronrutherford opened this issue 2 years ago • 6 comments

https://github.com/LLNL/hiop/blob/develop/src/Interface/hiopInterface.hpp#L60

https://github.com/pnnl/ExaGO/actions/runs/6304772661/job/17116842345?pr=15

This is a really weird bug, as even when building petsc~mpi in the exago package here, petsc insists on having an mpi.h lying around that is also picked up...

I am still trying to figure out who to blame here, but this seemed like the right place to start.

cameronrutherford avatar Sep 25 '23 22:09 cameronrutherford

Are you somehow building HiOp without MPI? Or are HiOp and PETSc picking up different MPI headers?

cnpetra avatar Sep 26 '23 03:09 cnpetra

From the ExaGO pipeline, we are building:

exago@develop+hiop~ipopt~mpi~python+raja+tests arch=None-None-x86_64
 -   tx7nd5d  exago@develop%[email protected]~cuda+hiop~ipo~ipopt+logging~mpi~python+raja~rocm+tests build_system=cmake build_type=RelWithDebInfo dev_path=/__w/ExaGO/ExaGO arch=linux-ubuntu20.04-x86_64
 -   ybikngp      ^[email protected]%[email protected]~cuda~ipo+openmp~rocm~tests build_system=cmake build_type=RelWithDebInfo arch=linux-ubuntu20.04-x86_64
 -   kl43gwj          ^[email protected]%[email protected] build_system=generic arch=linux-ubuntu20.04-x86_64
 -   7bzaewm      ^[email protected]%[email protected]~doc+ncurses+ownlibs~qt build_system=generic build_type=Release arch=linux-ubuntu20.04-x86_64
 -   3bxcabf          ^[email protected]%[email protected]~symlinks+termlib abi=none build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   yekhgie          ^[email protected]%[email protected]~docs~shared build_system=generic certs=mozilla arch=linux-ubuntu20.04-x86_64
 -   djeruao              ^ca-certificates-mozilla@2023-01-10%[email protected] build_system=generic arch=linux-ubuntu20.04-x86_64
 -   jymwj6w              ^[email protected]%[email protected]+optimize+pic+shared build_system=makefile arch=linux-ubuntu20.04-x86_64
 -   vcsqn5o      ^hiop@0.7.1%gcc@9.4.0~cuda+deepchecking~ginkgo~ipo~jsrun~kron~mpi+raja~rocm~shared~sparse build_system=cmake build_type=RelWithDebInfo arch=linux-ubuntu20.04-x86_64
 -   wtvhbiz      ^[email protected]%[email protected]~bignuma~consistent_fpcsr+fortran~ilp64+locking+pic+shared build_system=makefile patches=114f95f,a4c642f,c20f518,d3d9b15 symbol_suffix=none threads=none arch=linux-ubuntu20.04-x86_64
 -   5qydzbx          ^[email protected]%[email protected]+cpanm+open+shared+threads build_system=generic arch=linux-ubuntu20.04-x86_64
 -   e5g7oef              ^[email protected]%[email protected]+cxx~docs+stl build_system=autotools patches=26090f4,b231fcc arch=linux-ubuntu20.04-x86_64
 -   gs4r33x              ^[email protected]%[email protected]~debug~pic+shared build_system=generic arch=linux-ubuntu20.04-x86_64
 -   7wdyruu              ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   wslvyrk      ^petsc@3.18.3%gcc@9.4.0~X~batch~cgns~complex~cuda~debug+double~exodusii~fftw+fortran~giflib~hdf5~hpddm~hwloc~hypre~int64~jpeg~knl~kokkos~libpng~libyaml~memkind+metis~mkl-pardiso~mmg~moab~mpfr~mpi~mumps~openmp~p4est~parmmg~ptscotch~random123~rocm~saws~scalapack+shared~strumpack~suite-sparse~superlu-dist~tetgen~trilinos~valgrind build_system=generic clanguage=C arch=linux-ubuntu20.04-x86_64
 -   kwz7ftm          ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   y4xrp3s              ^[email protected]%[email protected] build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
 -   wnqabk7          ^[email protected]%[email protected]~gdb~int64~ipo~real64+shared build_system=cmake build_type=RelWithDebInfo patches=4991da9,93a7903,b1225da arch=linux-ubuntu20.04-x86_64
 -   quyjgw3          ^[email protected]%[email protected]+bz2+crypt+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tkinter+uuid+zlib build_system=generic patches=0d98e93,7d40923,f2fd060 arch=linux-ubuntu20.04-x86_64
 -   pgvwni4              ^[email protected]%[email protected]+libbsd build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   en3zuay                  ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   ps7sxlx                      ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   wlq5rko              ^[email protected]%[email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   j6aqcps                  ^[email protected]%[email protected]~python build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   zt4ocio                  ^[email protected]%[email protected] build_system=autotools zip=pigz arch=linux-ubuntu20.04-x86_64
 -   xoxeujp                      ^[email protected]%[email protected] build_system=makefile arch=linux-ubuntu20.04-x86_64
 -   3vtuapf                      ^[email protected]%[email protected]+programs build_system=makefile compression=none libs=shared,static arch=linux-ubuntu20.04-x86_64
 -   6sswith              ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   2evlwmd              ^[email protected]%[email protected]~obsolete_api build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   iuswzm4              ^[email protected]%[email protected] build_system=autotools patches=bbf97f1 arch=linux-ubuntu20.04-x86_64
 -   ghcuaen              ^[email protected]%[email protected]+column_metadata+dynamic_extensions+fts~functions+rtree build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   swhrnzy              ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   qkxtzoa              ^[email protected]%[email protected]~pic build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
 -   w6opye6      ^[email protected]%[email protected] build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   7xqyl5b      ^[email protected]%[email protected]~cuda+examples+exercises~ipo+openmp~rocm+shared~tests build_system=cmake build_type=RelWithDebInfo arch=linux-ubuntu20.04-x86_64
 -   4s36yj3      ^[email protected]%[email protected]+c~cuda~device_alloc~deviceconst+examples~fortran~ipo~numa~openmp~rocm+shared build_system=cmake build_type=RelWithDebInfo tests=none arch=linux-ubuntu20.04-x86_64

And so we get the following compile error:


     459    In file included from /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ub
            untu20.04-x86_64/gcc-9.4.0/hiop-0.7.1-vcsqn5ocwcwfihlrbjqozv3ku2rkg
            zo7/include/hiopInterface.hpp:60,
     460                     from /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ub
            untu20.04-x86_64/gcc-9.4.0/hiop-0.7.1-vcsqn5ocwcwfihlrbjqozv3ku2rkg
            zo7/include/hiopNlpFormulation.hpp:59,
     461                     from /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ub
            untu20.04-x86_64/gcc-9.4.0/hiop-0.7.1-vcsqn5ocwcwfihlrbjqozv3ku2rkg
            zo7/include/hiopAlgFilterIPM.hpp:59,
     462                     from /__w/ExaGO/ExaGO/src/opflow/solver/hiop/opflo
            w_hiop.h:7,
     463                     from /__w/ExaGO/ExaGO/src/opflow/solver/hiop/opflo
            w_hiop.cpp:4:
  >> 464    /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9
            .4.0/petsc-3.18.3-wslvyrkkwofiig24a5rm7gctadb7g4fk/include/petsc/mp
            iuni/mpi.h:186:13: error: multiple types in one declaration
     465      186 | typedef int MPI_Comm;
     466          |             ^~~~~~~~
  >> 467    /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9
            .4.0/petsc-3.18.3-wslvyrkkwofiig24a5rm7gctadb7g4fk/include/petsc/mp
            iuni/mpi.h:186:13: error: declaration does not declare anything [-f
            permissive]
  >> 468    make[2]: *** [src/opflow/CMakeFiles/OPFLOW_obj_static.dir/build.mak
            e:261: src/opflow/CMakeFiles/OPFLOW_obj_static.dir/solver/hiop/opfl
            ow_hiop.cpp.o] Error 1

So the HiOp header hiopInterface.hpp, on line 60 (linked in the original issue description), includes hiopMPI.h, which in turn includes mpi.h. The compiler resolves that to whatever mpi.h is on the include path, and here it picks up PETSc's petsc/mpiuni/mpi.h, which then fails to compile.

We are building both PETSc and HiOp without MPI here, so I honestly think this could be a bug in HiOp, in PETSc, or in both.

cameronrutherford avatar Sep 26 '23 14:09 cameronrutherford

@cnpetra @cameronrutherford I can successfully build HiOp without MPI. In hiopMPI.h, mpi.h is not included if we set HIOP_USE_MPI = OFF.

From your log file, I think the problems are:

  1. When HIOP_USE_MPI = OFF, both HiOp and PETSc define their own MPI_Comm.
  2. I am not sure where mpi.h gets included. It seems to come in via PETSc.

See here
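
To make point 1 concrete, here is a minimal sketch of how two independent non-MPI fallbacks can collide. The header contents are assumptions for illustration only: the macro stands in for a fallback that introduces `MPI_Comm` via `#define`, and the typedef stands in for a serial MPI stub such as the `typedef int MPI_Comm;` line shown in the log. `SHOW_CONFLICT` and `default_comm` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Assumed stand-in for a non-MPI fallback that introduces the name as a macro.
#define MPI_Comm int

#ifdef SHOW_CONFLICT
// Assumed stand-in for a serial MPI stub declaring the same name as a typedef.
// With the macro above active, the preprocessor turns this line into
// `typedef int int;`, which does not compile.
typedef int MPI_Comm;
#endif

// Without SHOW_CONFLICT, MPI_Comm is only the macro and expands to int.
static_assert(std::is_same<MPI_Comm, int>::value, "MPI_Comm expands to int");

// Hypothetical helper that uses the fallback type.
inline MPI_Comm default_comm() { return 0; }
```

Compiling this with `-DSHOW_CONFLICT` should fail at the typedef, which would be consistent with the diagnostics reported at mpi.h:186 above. Without the flag it compiles cleanly, matching the observation that each package builds fine on its own.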

nychiang avatar Sep 26 '23 17:09 nychiang

I'm following, but one clarification: I am also able to build hiop~mpi on its own. The issue only happens when exago~mpi is built against both petsc~mpi and hiop~mpi.

Why do HiOp and PETSc both need to define MPI_Comm in these non-mpi builds?

cameronrutherford avatar Sep 26 '23 18:09 cameronrutherford

Again, this might technically be an ExaGO (or PETSc or HiOp) issue, but I'm trying to figure out who's to blame here.

cameronrutherford avatar Sep 26 '23 18:09 cameronrutherford

We had this issue before with MFEM, if I recall correctly. One of the defines has to go. I think HiOp can go along with however PETSc defines MPI_Comm, so an easy fix would be for HiOp to check whether it is already defined. This applies when HIOP_USE_MPI is off.

cnpetra avatar Sep 27 '23 02:09 cnpetra
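
A minimal sketch of the "check if already defined" idea, under stated assumptions. This is not HiOp's actual header. The first part is a hypothetical stand-in for a serial MPI stub that was included earlier, and `STUB_MPI_H`, the value `2`, and `fallback_comm` are made-up names and values. Because a typedef is invisible to `#ifndef`, the guard has to test a macro that the stub header is known to define; which macro PETSc's header actually provides would need to be checked.

```cpp
#include <cassert>
#include <type_traits>

// --- Hypothetical stand-in for a serial MPI stub header included first ---
#define STUB_MPI_H        // hypothetical marker macro, not a real PETSc name
typedef int MPI_Comm;     // same form as the typedef shown in the log
#define MPI_COMM_WORLD 2  // hypothetical value

// --- Sketch of a guarded non-MPI fallback ---
// Only introduce MPI_Comm if neither the stub marker nor a macro of that
// name is already present.
#if !defined(STUB_MPI_H) && !defined(MPI_Comm)
#define MPI_Comm int
#endif
#ifndef MPI_COMM_WORLD
#define MPI_COMM_WORLD 0
#endif

// The stub's typedef is left untouched and still names int.
static_assert(std::is_same<MPI_Comm, int>::value, "MPI_Comm is int");

// Hypothetical helper: returns whichever MPI_COMM_WORLD definition won.
inline MPI_Comm fallback_comm() { return MPI_COMM_WORLD; }
```

Note that a guard like this is order dependent: it helps only when the stub header is seen before the fallback. If the fallback macro is defined first and the stub's typedef comes later, the same clash reappears, so the include order in the consuming project would also need attention.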