rccl
Change unroll for gfx94x
Details
Do not mention proprietary info or link to internal work items in this PR.
Work item: "Internal", or link to GitHub issue (if applicable).
What were the changes?
Change the unroll factor for gfx94x.
Why were the changes made?
Improve RCCL performance when the GPU clock is lower under heavy workload.
How was the outcome achieved?
Increase the unroll factor.
Additional Documentation:
Approval Checklist
Do not approve until these items are satisfied.
- [ ] Verify the CHANGELOG has been updated, if
  - there are any NCCL API version changes,
  - any changes impact library users, and/or
  - any changes impact any other ROCm library.
Is this draft no longer valid, @wenkaidu?
This is used to track SWDEV-469533, which is still open in the mainline build.
Not intended for merge.