Michael Heinz

4 issues by Michael Heinz

Hey, guys. I have a user who is used to handling process placement by simply listing hosts multiple times in the hostfile so that, for example, if he wanted 2...

Hey, guys. A question came up about how the OMPI OFI MTL and BTL handle ensuring that CUDA/HMEM buffers are completely in sync at the end of a data transfer....

question

As a part of discussing #7699, it was pointed out that there are several error paths in the OFI MTL that call exit() and several in the Portals MTL that...

bug
Target: main
Target: v5.0.x

We've had success using torch-ccl with resnet and other AI workloads to test with libfabric over psm3, but when we try to use libmlx-fi.so, torch-ccl does not seem to see...