James Dinan
As Brian said, MPI applications follow the GDR rules above, so CUDA deals with the GPU's memory model gotchas for you. NCCL bends some of these rules -- notably, it...
@mwheinz Yes, I'm involved in this work. There's an issue in the libfabric stack that we haven't been able to identify. We aren't able to reproduce the cudaDeviceFlushGPUDirectRDMAWrites hang with...
The Open UCX team has started maintaining their own fork of XPMEM. Your issue might get more attention if you post it there: https://github.com/openucx/xpmem
@hjelmn As a user, it would be great to have a single source for XPMEM. I don't have an opinion on who should own the repository, but would prefer to...
This is a good change. Would you be willing to post a PR for it? Can you elaborate on the symmetric heap size limitation you ran into?
Got it -- happy to accept any additional patches needed to make this work.
Your presumption is correct. We saw some pretty unintelligible compiler errors when the type was not recognized, and added `shmem_ctx_c11_generic_selection_failed` to give users a breadcrumb for the cause of the...
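For readers unfamiliar with the pattern, here is a minimal sketch of how a `default` association in a C11 `_Generic` macro can route unsupported types to a deliberately descriptive symbol. This is not the SOS header. The `demo_*` names are hypothetical, and the sketch assumes the fallback is a function that is declared but never defined.

```c
#include <stdio.h>

/* Hypothetical typed backends; names are illustrative, not the SOS API. */
static int demo_put_int(int v)   { (void)v; return 1; }
static int demo_put_long(long v) { (void)v; return 2; }

/* Declared but never defined (assumption for this sketch). If an
 * unsupported type selects it, the resulting diagnostic mentions this
 * name, which points the user at the failed generic selection. */
int demo_c11_generic_selection_failed(void);

/* Pick a backend from the static type of val. Only the selected
 * association is used; the others are never evaluated. */
#define demo_put(val) _Generic((val),               \
    int:     demo_put_int,                          \
    long:    demo_put_long,                         \
    default: demo_c11_generic_selection_failed)(val)
```

With this in place, `demo_put(1)` and `demo_put(1L)` resolve to the typed backends, while something like `demo_put(1.0f)` fails to build with a message that names `demo_c11_generic_selection_failed` instead of a bare "no matching association" style error.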
These checks are done during initialization to ensure the provider supports all of the atomic operations required by SOS. Reductions are included to support the reduction implementations that use AMOs.
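As a rough illustration of an init-time check like this, the sketch below walks a list of required operations and reports the first one a provider does not support. Everything here is hypothetical: the `demo_*` names, the operation enum, and the query callback stand in for whatever the transport actually exposes, and this is not the SOS code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation identifiers; a real transport defines its own. */
typedef enum {
    DEMO_OP_SUM,
    DEMO_OP_MIN,
    DEMO_OP_MAX,
    DEMO_OP_CSWAP,
    DEMO_OP_COUNT
} demo_op_t;

/* Hypothetical provider query: returns true if the operation is supported. */
typedef bool (*demo_query_fn)(demo_op_t op);

/* Example providers used only to exercise the check. */
static bool demo_full_provider(demo_op_t op)     { (void)op; return true; }
static bool demo_no_cswap_provider(demo_op_t op) { return op != DEMO_OP_CSWAP; }

/* Returns 0 if every required operation is supported. Otherwise returns -1
 * and, when missing is non-NULL, stores the first unsupported operation. */
static int demo_check_required_atomics(demo_query_fn query,
                                       const demo_op_t *required,
                                       size_t count,
                                       demo_op_t *missing)
{
    for (size_t i = 0; i < count; i++) {
        if (!query(required[i])) {
            if (missing != NULL)
                *missing = required[i];
            return -1;
        }
    }
    return 0;
}
```

Running the check once at startup means an unsupported operation shows up as a clear initialization failure, not as an error deep inside a reduction that happens to rely on AMOs.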
Ref https://github.com/openshmem-org/specification/issues/313
Might be something useful we can crib from here: https://github.com/jeffhammond/HPCInfo/blob/master/buildscripts/icc-release.sh