MPIClusterManagers.jl
Julia parallel constructs over MPI
This allows updates for GitHub Actions to be picked up automatically. I have used this for my own packages, the [Trixi.jl framework](https://github.com/trixi-framework), and the [SciML organization](https://github.com/SciML). After merging this, you could also...
`MPIClusterManagers.start_main_loop(TCP_TRANSPORT_ALL)` calls `MPIManager(np=size-1, mode=TCP_TRANSPORT_ALL)` without passing `master_tcp_interface`, so the manager only listens for worker connections on the default localhost interface and there is no way to specify a different interface...
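For comparison, in the mode where the user constructs the manager directly the interface can be chosen explicitly. A minimal sketch, assuming the `MPIManager` constructor accepts the `master_tcp_interface` keyword referred to above, with `"10.9.8.7"` as a placeholder address:

```julia
using Distributed
using MPIClusterManagers

# Sketch only: assumes the constructor exposes the `master_tcp_interface`
# keyword mentioned in the issue; "10.9.8.7" is a placeholder for the
# interface the workers should connect back to.
manager = MPIManager(np=4, master_tcp_interface="10.9.8.7")
addprocs(manager)
```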
I'm trying to use a custom worker pool (with the goal of using the master process as a worker too, so as not to waste a GPU) but getting a...
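For reference, the stock way to build a pool that also includes the master process uses `Distributed.WorkerPool`; a minimal sketch (plain Distributed, independent of MPIClusterManagers, and not necessarily the setup that triggers the error above):

```julia
using Distributed

addprocs(2)   # placeholder: in practice these would be MPI-launched workers

# Build a pool that contains the master process (id 1) as well as the workers,
# so the master's GPU is not left idle.
pool = WorkerPool(vcat(1, workers()))

# Dispatch work over the pool; the master participates like any other worker.
results = pmap(x -> x^2, pool, 1:10)
```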
At the moment, MPIWorkerManager connects directly to all worker processes. It would probably be better if it could connect to a single MPI process, and then multiplex that connection to...
[README.md](https://github.com/JuliaParallel/MPIClusterManagers.jl/blob/master/README.md) needs to be updated. For example, it states the following usage, which is wrong: `MPIManager(;np=Sys.CPU_THREADS, mpi_cmd=false, launch_timeout=60.0)`.
- [v0.1.0](https://github.com/JuliaParallel/MPIClusterManagers.jl/tree/v0.1.0) has an optional parameter `mpirun_cmd`. [Constructor signature on v0.1.0](https://github.com/JuliaParallel/MPIClusterManagers.jl/blob/a1ac47dd28472f4aeb827ccd4e9985dc5c7109a9/src/mpimanager.jl#L53)
- ...
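For illustration, a call matching the v0.1.0 signature linked above might look like the sketch below; the keyword name follows that linked constructor, and the actual command and defaults are assumptions to be checked against the source:

```julia
using Distributed
using MPIClusterManagers

# Sketch based on the v0.1.0 constructor linked above, which takes `mpirun_cmd`
# rather than the `mpi_cmd` keyword shown in the README; the command is just an
# example invocation, not a required value.
manager = MPIManager(np=4, mpirun_cmd=`mpirun -np 4`, launch_timeout=60.0)
addprocs(manager)
```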
I don't know if I missed something, but I can't seem to find a way to return a value in the "only workers execute MPI code" mode. While the example...
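One possible way to pull a value back in that mode, sketched under the assumption that the computation runs via `@mpi_do` and that the ordinary Distributed fetch macros work on MPI-launched workers (this is not a documented MPIClusterManagers feature):

```julia
using Distributed
using MPIClusterManagers

manager = MPIManager(np=4)   # "only workers execute MPI code" mode
addprocs(manager)

# Run MPI code on the workers only; `rank_value` stands in for a real result.
@mpi_do manager begin
    using MPI
    comm = MPI.COMM_WORLD
    rank_value = MPI.Comm_rank(comm) + 1
end

# Sketch: fetch the global back from one worker over the usual Distributed
# channel, which coexists with the MPI code that produced it.
value = @fetchfrom first(workers()) rank_value
```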
I tried to use MPI.jl for connecting different computing nodes, and I found that there are no options for specifying the hosts. We can specify different remote hosts with `mpiexec...
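A possible workaround, assuming the manager accepts a custom launch command (the `mpirun_cmd` keyword referenced in the README issue above) and an MPI implementation whose `mpiexec` understands `--hostfile`, is to put the host list in the launch command itself; `hosts.txt` is a placeholder:

```julia
using Distributed
using MPIClusterManagers

# Sketch only: `mpirun_cmd` is the keyword from the older constructor, and
# "hosts.txt" is a placeholder hostfile listing the remote nodes.
manager = MPIManager(np=8, mpirun_cmd=`mpiexec -n 8 --hostfile hosts.txt`)
addprocs(manager)
```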
Need to implement the equivalent of https://github.com/JuliaLang/julia/pull/6768 and https://github.com/JuliaLang/julia/pull/10073 when using MPI for transport
Currently we do a busy-wait with an `MPI.Iprobe`/`yield()` loop, which consumes CPU cycles unnecessarily.
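For context, the pattern in question looks roughly like the sketch below (assuming the older positional `MPI.Iprobe(source, tag, comm)` signature; this is not the package's actual receive loop):

```julia
using MPI

# Rough sketch of the busy-wait pattern described above: poll for a pending
# message and yield to the scheduler in between, which still spins the CPU.
function wait_for_message(comm::MPI.Comm, tag::Integer)
    while true
        flag, status = MPI.Iprobe(MPI.ANY_SOURCE, tag, comm)
        flag && return status   # a matching message is now pending
        yield()                 # let other tasks run, but the loop keeps spinning
    end
end
```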
A few related issues:
- warnings printed when using MPI transport
- different finalization procedures depending on whether MPI_ON_WORKERS, MPI_TRANSPORT_ALL, or TCP_TRANSPORT_ALL is used; implement a standard "close" function
- warning printed...
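As a rough illustration of what a standard "close" could do, here is a purely hypothetical sketch (not existing MPIClusterManagers API), under the assumption that the Distributed workers should be removed before MPI is finalized:

```julia
using Distributed
using MPI

# Hypothetical helper sketching a uniform shutdown across the transport modes
# listed above; none of these steps are currently bundled into a single call.
function close_cluster()
    others = filter(w -> w != 1, workers())
    isempty(others) || rmprocs(others)        # tear down Distributed workers first
    if MPI.Initialized() && !MPI.Finalized()
        MPI.Finalize()                        # finalize MPI exactly once
    end
    return nothing
end
```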