tks2004

9 comments by tks2004

I also tried modifying enqueue.cc (at line 1273), but still did not get the algorithm or communication protocol.

The system has 4 GPUs, and the run used all 8 GCDs. Message sizes range from 2 MB to 2 GB. Log excerpt: NCCL INFO Connected all trees NCCL INFO threadThresholds 8/8/64 | 128/8/64 |...

@haripriya-amd, rccl-tests fail when CUSTOM_RCCL_LIB is used.

I was able to work around this issue. How is MSCCL selected? I now don't see it being selected.

Is there an option to force the use of MSCCL algorithms?

This is on MI250X (4 GPUs; 8 GCDs) using the Slingshot interconnect. I used the latest rccl, rccl-tests, and aws-ofi-rccl from the master branch.

Setting RCCL_MSCCL_FORCE_ENABLE=1 did not force the MSCCL algorithm.
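
For anyone trying to reproduce the check described in the comment above, here is a rough Python sketch of one way to set the variables and scan the debug log. It assumes NCCL_DEBUG=INFO is enough to produce a line that mentions MSCCL when MSCCL is active. That is an assumption, not something confirmed in this thread. The helper names `debug_env` and `msccl_lines` are made up for illustration and are not part of rccl or rccl-tests.

```python
import os


def debug_env(base=None):
    """Return a copy of the environment with the debug/force variables set.

    RCCL_MSCCL_FORCE_ENABLE comes from the comment above; NCCL_DEBUG=INFO
    turns on the "NCCL INFO ..." log lines.
    """
    env = dict(os.environ if base is None else base)
    env["NCCL_DEBUG"] = "INFO"
    env["RCCL_MSCCL_FORCE_ENABLE"] = "1"
    return env


def msccl_lines(log_text):
    """Return the log lines that mention MSCCL (case-insensitive match).

    Assumption: a plain substring match is used because the exact wording
    of any MSCCL-related log line is not known here.
    """
    return [line for line in log_text.splitlines() if "msccl" in line.lower()]
```

The environment from `debug_env()` can be passed as `env=` to `subprocess.run` when launching the test binary, and the captured output handed to `msccl_lines`. An empty result would match the behaviour reported above, although it could also just mean the log wording differs from what this sketch looks for.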

I was able to get the protocol and algorithm info.