Use collective active_set to send data to self

Open marsaev opened this issue 3 years ago • 1 comments

Currently we use the active_set field to emulate p2p communication:

    ucc_coll_args_t coll = {0};
    coll.mask = UCC_COLL_ARGS_FIELD_FLAGS | UCC_COLL_ARGS_FIELD_ACTIVE_SET | UCC_COLL_ARGS_FIELD_TAG;
    coll.flags = UCC_COLL_ARGS_FLAG_COUNT_64BIT;
    coll.coll_type = UCC_COLL_TYPE_BCAST;
    coll.root = root_rank;
    coll.tag = tag;
    coll.src.info.buffer = ptr;
    coll.src.info.count = size;
    coll.src.info.datatype = UCC_DT_UINT8;
    coll.src.info.mem_type = UCC_MEMORY_TYPE_CUDA;

    coll.active_set.size = 2;
    coll.active_set.start = my_rank;
    coll.active_set.stride = my_rank - peer_rank;

However, I don't see a way to set up coll properly so that a "send" to myself works with this code: https://github.com/openucx/ucc/blob/master/src/utils/ucc_coll_utils.h#L270-L271

Currently I have a workaround that uses cudaMemcpyAsync instead of UCC when I try to do something like this, but for composability it would be nice for UCC to handle this case:

    for (msg : messages)
        if (myrank == msg.src)
            ucc.send(msg.dst, ...)
        if (myrank == msg.dst)
            ucc.recv(msg.src, ...)
    wait_all_ucc_requests()
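
For illustration, the workaround branch can be sketched as a small dispatch step that detects a self-addressed message and copies locally instead of posting a collective. This is only a sketch under assumptions: `msg_sketch_t`, `route_t`, `classify_message` and `handle_message` are hypothetical names, not UCC API, and a host `memcpy` stands in for `cudaMemcpyAsync` so the example compiles without CUDA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message descriptor; not a UCC type. */
typedef struct {
    int         src;      /* sending rank */
    int         dst;      /* receiving rank */
    const void *send_buf; /* data to send */
    void       *recv_buf; /* where the receiver wants the data */
    size_t      size;     /* payload size in bytes */
} msg_sketch_t;

typedef enum { ROUTE_NONE, ROUTE_LOCAL_COPY, ROUTE_SEND, ROUTE_RECV } route_t;

/* Decide what this rank has to do for one message. */
static route_t classify_message(int my_rank, const msg_sketch_t *m)
{
    assert(m != NULL);
    if (m->src == my_rank && m->dst == my_rank) {
        return ROUTE_LOCAL_COPY; /* self-send: the case UCC does not cover */
    }
    if (m->src == my_rank) {
        return ROUTE_SEND;
    }
    if (m->dst == my_rank) {
        return ROUTE_RECV;
    }
    return ROUTE_NONE;
}

/* Perform the local copy for self-sends; other routes are only reported. */
static route_t handle_message(int my_rank, const msg_sketch_t *m)
{
    route_t route = classify_message(my_rank, m);
    if (route == ROUTE_LOCAL_COPY) {
        /* Assumption: host memcpy stands in for cudaMemcpyAsync here. */
        memcpy(m->recv_buf, m->send_buf, m->size);
    }
    /* ROUTE_SEND / ROUTE_RECV would post the active_set bcast (omitted). */
    return route;
}
```

The extra branch is exactly what hurts composability: every caller of the send/recv loop has to carry it, which is why handling the self-send inside UCC would be preferable.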

marsaev • Nov 02 '22 12:11

From the API perspective, active_set.size = 1; start = my_rank; stride = 1; should probably work. But it might not be implemented correctly in TL/UCP at the moment.

vspetrov • Nov 20 '22 18:11
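
To make the suggested size = 1 set concrete, here is a minimal sketch of how the members of a strided active set can be enumerated. It assumes that member i is `start + i * stride`, which is how the fields appear to be used in this thread; `active_set_sketch_t`, `active_set_member` and `active_set_contains` are hypothetical helpers that only mirror the three field names and are not part of UCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three active_set fields; not the UCC type. */
typedef struct {
    uint64_t size;   /* number of ranks in the set */
    uint64_t start;  /* first rank in the set */
    int64_t  stride; /* distance between consecutive members */
} active_set_sketch_t;

/* Assumed interpretation: member i is start + i * stride. */
static int64_t active_set_member(const active_set_sketch_t *set, uint64_t i)
{
    assert(set != NULL && i < set->size);
    return (int64_t)set->start + (int64_t)i * set->stride;
}

/* Returns 1 if rank is a member of the set, 0 otherwise. */
static int active_set_contains(const active_set_sketch_t *set, int64_t rank)
{
    for (uint64_t i = 0; i < set->size; i++) {
        if (active_set_member(set, i) == rank) {
            return 1;
        }
    }
    return 0;
}
```

Under that assumption, a set with size = 1 and start = my_rank contains only my_rank, and the stride value is never used, so it describes a self-send without ambiguity. Whether a given transport layer accepts such a set is a separate question that this sketch does not answer.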