Add `relay::mpi::isend_using_schema`
I'm developing an MPI application where ranks need to send Conduit nodes to each other point-to-point. E.g. imagine a ring where each rank sends a node to its left neighbor and receives from its right neighbor.
This was causing the application to deadlock when using `send_using_schema` and `recv_using_schema`.
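To make the failure mode concrete, here's a minimal sketch of the deadlocking version (simplified, not my exact application code):

```cpp
#include <mpi.h>
#include <conduit_relay_mpi.hpp>

using namespace conduit;

// Each rank sends to its left neighbor and receives from its right one.
// MPI_Send (inside send_using_schema) may block until a matching recv is
// posted, so for large enough payloads every rank blocks in its send and
// no rank ever reaches its recv -- a classic cyclic deadlock.
void ring_exchange_blocking(const Node &send_node, Node &recv_node, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    const int left  = (rank + size - 1) % size;
    const int right = (rank + 1) % size;
    const int tag   = 0;

    relay::mpi::send_using_schema(send_node, left, tag, comm);   // blocks
    relay::mpi::recv_using_schema(recv_node, right, tag, comm);  // never reached
}
```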
I was able to resolve it by adapting `send_using_schema` into an `isend_using_schema` (i.e. using `MPI_Isend` instead of `MPI_Send`).
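The adapted version looks roughly like this (`isend_using_schema` is my adaptation, not existing conduit API; the key point is that the send no longer blocks, so every rank reaches its receive):

```cpp
// Same ring, but with a non-blocking send that breaks the cycle.
// isend_using_schema is assumed to mirror send_using_schema's signature
// plus an MPI_Request out-parameter.
void ring_exchange_nonblocking(const Node &send_node, Node &recv_node,
                               MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    const int left  = (rank + size - 1) % size;
    const int right = (rank + 1) % size;
    const int tag   = 0;

    MPI_Request request;
    isend_using_schema(send_node, left, tag, comm, &request);   // MPI_Isend inside
    relay::mpi::recv_using_schema(recv_node, right, tag, comm); // blocking recv

    // The request (and any serialized buffer behind the send) must stay
    // alive until the send completes.
    MPI_Wait(&request, MPI_STATUS_IGNORE);
}
```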
Should I add this to `conduit::relay::mpi`?
Yes, knowing how you approached this would be great!
It might be helpful to review:
https://github.com/LLNL/conduit/pull/433
For isend/irecv - the a priori constraints are hard to tackle for the general case.
For the sync case, every pair has to post their sends/recvs in a coordinated order, or else they deadlock. For your case - isend/irecv is the right approach for sure.
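For instance, in the ring case the blocking calls can be made safe by staggering the posting order by rank parity (sketch reusing the names from the earlier snippet; assumes an even number of ranks):

```cpp
// Even ranks send first, odd ranks receive first, so every matched pair
// posts its send/recv in complementary order and the cycle is broken.
if(rank % 2 == 0)
{
    relay::mpi::send_using_schema(send_node, left, tag, comm);
    relay::mpi::recv_using_schema(recv_node, right, tag, comm);
}
else
{
    relay::mpi::recv_using_schema(recv_node, right, tag, comm);
    relay::mpi::send_using_schema(send_node, left, tag, comm);
}
```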
General solutions for isend/irecv I have seen (outside of conduit) use fixed-size buffers, and manage chunking generic-sized payloads into those buffers. Then they use a continuous wait/polling pattern until everything arrives.
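The generic shape of that pattern looks something like this (a sketch, not conduit code; it assumes the receiver knows the chunk count up front, which is exactly the kind of a priori constraint that makes the general case hard):

```cpp
#include <mpi.h>
#include <cstddef>
#include <vector>

constexpr int CHUNK_SIZE = 1 << 16; // fixed buffer size agreed up front

// Post one irecv per expected chunk into preallocated fixed-size buffers,
// then poll until everything has arrived. Real implementations interleave
// this polling with posting their own sends and doing useful work.
void recv_chunks(std::vector<std::vector<char>> &chunks, // one slot per chunk
                 int src, int tag, MPI_Comm comm)
{
    std::vector<MPI_Request> requests(chunks.size());
    for(std::size_t i = 0; i < chunks.size(); ++i)
    {
        chunks[i].resize(CHUNK_SIZE);
        MPI_Irecv(chunks[i].data(), CHUNK_SIZE, MPI_CHAR,
                  src, tag, comm, &requests[i]);
    }

    int remaining = static_cast<int>(requests.size());
    while(remaining > 0)
    {
        int idx  = MPI_UNDEFINED;
        int flag = 0;
        MPI_Testany(static_cast<int>(requests.size()), requests.data(),
                    &idx, &flag, MPI_STATUS_IGNORE);
        if(flag && idx != MPI_UNDEFINED)
        {
            --remaining; // chunk `idx` is complete; reassembly would go here
        }
    }
}
```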
It would be great to get a general strategy into conduit -- even if it ends up being a bit more complex than just transactions of isend, irecv + wait.
This issue is also related: https://github.com/LLNL/conduit/issues/170
(we concluded we don't want any-to-any, we want the general isend/irecv solution or strategy to share)
Thanks for the pointers @cyrush
I'm currently using a non-blocking send (`MPI_Isend`) with a blocking receive (`MPI_Recv`).
I think I still need to tweak my solution a bit, since some runs are getting invalid Conduit nodes on the receive side (see below).
Here's what I currently have: https://github.com/LLNL/axom/blob/cf11f5210d9751e785be4bd05e1b0638897cbeb4/src/axom/quest/DistributedClosestPoint.hpp#L78-L145
The only difference from `send_using_schema` is here:
https://github.com/LLNL/axom/blob/cf11f5210d9751e785be4bd05e1b0638897cbeb4/src/axom/quest/DistributedClosestPoint.hpp#L121-L130
And here's how I'm using it: https://github.com/LLNL/axom/blob/cf11f5210d9751e785be4bd05e1b0638897cbeb4/src/axom/quest/DistributedClosestPoint.hpp#L326-L337
I think this has to change so that the `MPI_Request` becomes a parameter to `isend_using_schema` and is freed only after the matching `recv_using_schema` completes, e.g. by sending a [blocking] "acknowledge" message from the receiver back to the sender.
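In code, the handshake I have in mind would look something like this (a sketch; `ack_tag` and the empty ack message are hypothetical details):

```cpp
// Sender side: non-blocking send, then wait for the receiver's ack before
// completing the request and freeing its temporaries. (MPI_Wait alone
// already guarantees the local send buffer is reusable; the ack further
// confirms the receive itself completed.)
MPI_Request request;
isend_using_schema(send_node, dest, tag, comm, &request);
// ... complete our own receive here ...
MPI_Recv(nullptr, 0, MPI_CHAR, dest, ack_tag, comm, MPI_STATUS_IGNORE);
MPI_Wait(&request, MPI_STATUS_IGNORE);

// Receiver side: blocking recv, then a blocking empty "acknowledge" back.
relay::mpi::recv_using_schema(recv_node, src, tag, comm);
MPI_Send(nullptr, 0, MPI_CHAR, src, ack_tag, comm);
```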
I agree with your comment on #170 that these calls need to be synchronized and a tutorial on how to use them would be really helpful.
I updated my solution based on `send_using_schema` and `isend`, along with its associated `conduit::relay::mpi::Request` struct, to ensure the temporary nodes/schemas survive the send/receive loop.
Here's the update:
- Combined isend/recv: https://github.com/LLNL/axom/blob/7d5e7a99644a93dabc166c494897fc61f6df44ef/src/axom/quest/DistributedClosestPoint.hpp#L158-L175
- Updated `isend_using_schema` with bugfixes adapted from `conduit::relay::mpi::isend`: https://github.com/LLNL/axom/blob/7d5e7a99644a93dabc166c494897fc61f6df44ef/src/axom/quest/DistributedClosestPoint.hpp#L82-L156
- Usage: https://github.com/LLNL/axom/blob/7d5e7a99644a93dabc166c494897fc61f6df44ef/src/axom/quest/DistributedClosestPoint.hpp#L358-L365
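For reference, the overall shape is roughly this (simplified from the linked code; `isend_using_schema` is still my adapted function, not existing conduit API, and as I understand it the `conduit::relay::mpi::Request` holds the compacted/serialized buffer):

```cpp
#include <mpi.h>
#include <conduit_relay_mpi.hpp>

using namespace conduit;

// The Request owns the temporaries behind the non-blocking send, so keeping
// it alive until wait_send completes is what fixes the invalid-node problem
// I was seeing earlier.
void ring_exchange(const Node &send_node, Node &recv_node,
                   int left, int right, MPI_Comm comm)
{
    const int tag = 0;

    relay::mpi::Request request;
    isend_using_schema(send_node, left, tag, comm, &request);   // adapted send
    relay::mpi::recv_using_schema(recv_node, right, tag, comm); // blocking recv

    // Completes the send and releases the buffers held by the request.
    relay::mpi::wait_send(&request, MPI_STATUS_IGNORE);
}
```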
At this point, I'm reasonably confident that my code is working, but I'm no longer sure if my solution is general enough to push to conduit ...