Hugo Meiland
Happy to help here; already did the same for https://github.com/cvmfs/cvmfs.
Looks like a recursive load of another module outside of the modulepath is causing issues; either copying the original module into the modulepath or creating symlinks could solve this...
Proposal https://github.com/Azure/azhpc-images/pull/102 made for CentOS 7.9; if OK, I'll update the other dists/versions...
Added the move from environment-modules to Lmod in PR #102; this would break some modules outside CentOS 7.9, so please allow me to finish the PR before accepting....
ok, so it looks like we need an archspec.interconnect
Would an `if test -d /sys/class/infiniband; then export OMPI_MCA_pml=ucx; fi` be enough? In Lua, something like `path.exists("/sys/class/infiniband")`? Looks like this path is not enough; but when checking for /sys/class/infiniband/mlx5_ib0 it...
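A minimal shell sketch of the kind of check discussed in the comment above. It assumes that what matters is a device entry inside /sys/class/infiniband rather than the directory itself; that assumption, and the helper name `has_ib_device`, are illustrative and not part of Open MPI, archspec, or EESSI. The optional argument exists only so the check can be pointed at another directory for testing.

```shell
#!/bin/sh
# Hypothetical helper (not an Open MPI or EESSI function): succeeds when at
# least one device entry exists under the InfiniBand sysfs class directory.
# $1 optionally overrides the directory; the default is the path from the thread.
has_ib_device() {
    root="${1:-/sys/class/infiniband}"
    # No class directory at all: no InfiniBand devices are exposed.
    [ -d "$root" ] || return 1
    for dev in "$root"/*; do
        # If the directory is empty the glob stays literal, so test each entry.
        [ -e "$dev" ] && return 0
    done
    return 1
}

# Assumption: selecting the UCX PML is the desired reaction, as in the one-liner above.
if has_ib_device; then
    export OMPI_MCA_pml=ucx
fi
```

On a node without any InfiniBand device the function returns non-zero and `OMPI_MCA_pml` is left untouched, so Open MPI keeps its default component selection.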
> Do you know which interface the OFI component was complaining about? I'm expecting the ib0 to be used: [EESSI pilot 2021.06] $ ip a 1: lo: mtu 65536 qdisc...
[EESSI pilot 2021.06] $ mpirun --version mpirun (Open MPI) 4.0.3 libfabric/1.11.0 The Mellanox ConnectX5 is the physical device, and it is available through the kernel as /sys/class/infiniband/mlx5_ib0. I'm not seeing any...
> Let me inquire with others in the Open MPI community and get back to you. > > Can you try upgrading to Open MPI v4.1.1? I have a very...
Looks like it is solved in Open MPI 4.1.1 with libfabric 1.12.1, both of which are included in EasyBuild gompi/2021a ``` $ mpirun --version mpirun (Open MPI) 4.1.1 Report bugs to http://www.open-mpi.org/community/help/...