rccl
Allow zero byte sendrecv in alltoallv
Details
Do not mention proprietary info or link to internal work items in this PR.
Work item: "Internal", or link to GitHub issue (if applicable). Internal
What were the changes?
Allow zero byte sendrecv in alltoallv
Why were the changes made?
From PyTorch code: https://github.com/NVIDIA/nccl/issues/696. Skipping zero-byte send/recv can cause a deadlock: a rank that sends and receives 0 bytes skips the collective entirely, which causes a mismatch across ranks.
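To illustrate the mismatch, here is a toy model in plain C++. It is only a sketch, not RCCL code: `postedOps` and `joinsCollective` are hypothetical helpers that count how many point-to-point operations a single rank would post for one alltoallv call, with and without skipping zero-byte transfers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (assumption, not the RCCL implementation): count the send/recv
// operations one rank would post for an alltoallv call, given its per-peer
// byte counts. skipZeroByte = true models skipping 0-byte transfers.
std::size_t postedOps(const std::vector<std::size_t>& sendCounts,
                      const std::vector<std::size_t>& recvCounts,
                      bool skipZeroByte) {
  assert(sendCounts.size() == recvCounts.size());
  std::size_t ops = 0;
  for (std::size_t peer = 0; peer < sendCounts.size(); ++peer) {
    if (!skipZeroByte || sendCounts[peer] != 0) ++ops;  // send to peer
    if (!skipZeroByte || recvCounts[peer] != 0) ++ops;  // recv from peer
  }
  return ops;
}

// In this model, a rank that posts no operations never enters the collective.
bool joinsCollective(const std::vector<std::size_t>& sendCounts,
                     const std::vector<std::size_t>& recvCounts,
                     bool skipZeroByte) {
  return postedOps(sendCounts, recvCounts, skipZeroByte) > 0;
}
```

In this model, a rank whose counts are all zero posts nothing when zero-byte transfers are skipped, so it does not join the collective while its peers do. When zero-byte transfers are allowed, every rank posts the same number of operations regardless of its counts.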
How was the outcome achieved?
Allow zero byte sendrecv in alltoallv
Additional Documentation:
What else should the reviewer know?
Approval Checklist
Do not approve until these items are satisfied.
- [ ] Verify the CHANGELOG has been updated, if
  - there are any NCCL API version changes,
  - any changes impact library users, and/or
  - any changes impact any other ROCm library.