valentin petrov
V1.1.x of #616
V1.1.x of #600
## What Unifies pipelining parameters. Adds ucc_pipeline_params_t and an interface for the user to set them + cfg var parser. ## Why ? Each time we add another pipelined alg we...
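To illustrate what a unified pipelining parameter set plus a config-variable parser could look like, here is a minimal sketch in Python. It is not the PR's code: the field names (`threshold`, `frag_size`, `n_frags`, `pdepth`), the `key=value:key=value` string format, and the helper names are all assumptions made for illustration, and the actual `ucc_pipeline_params_t` layout may differ.

```python
from dataclasses import dataclass

# Binary size suffixes accepted by the hypothetical parser.
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert strings such as '512K' or '1M' to a byte count."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


@dataclass
class PipelineParams:
    """Hypothetical stand-in for a shared pipelining parameter struct."""
    threshold: int = 0   # message size at which pipelining starts (assumed)
    frag_size: int = 0   # size of one pipeline fragment in bytes (assumed)
    n_frags: int = 1     # number of fragments allocated (assumed)
    pdepth: int = 1      # number of fragments in flight (assumed)


# Maps each assumed config key to (attribute name, converter).
_KEYS = {
    "thresh": ("threshold", parse_size),
    "fragsize": ("frag_size", parse_size),
    "nfrags": ("n_frags", int),
    "pdepth": ("pdepth", int),
}


def parse_pipeline_params(value):
    """Parse 'key=value' tokens separated by ':' into PipelineParams."""
    params = PipelineParams()
    for token in value.split(":"):
        if not token.strip():
            continue
        key, sep, raw = token.partition("=")
        key = key.strip()
        if not sep or key not in _KEYS:
            raise ValueError(f"unknown pipeline parameter: {token!r}")
        attr, convert = _KEYS[key]
        setattr(params, attr, convert(raw))
    return params
```

With one shared structure and one parser, every pipelined algorithm could read its settings the same way, e.g. `parse_pipeline_params("thresh=1M:fragsize=512K:nfrags=2:pdepth=2")`.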
## What Properly handle potential failures that happen during TL context_create_epilog call. ## Why ? Current behavior: if context_create_epilog fails -> ucc context creation fails -> job fails. Expected behavior:...
## What Adds new TL/MLX5: minimal necessary tl iface stubs w/o much actual implementation (added in next PRs). Adds an option to pass the negation sign "^" to --with-tls. Default list...
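The "^" negation semantics can be sketched as follows. This is a Python illustration only, not the build-system code from the PR: the function name `select_tls` and the TL names in the usage example are hypothetical, and the real default list is not reproduced here.

```python
def select_tls(default_tls, with_tls):
    """Resolve a --with-tls style value against a default TL list.

    An empty value keeps the defaults. A value starting with "^" means
    "all defaults except the listed names". Anything else is taken as
    the explicit list of TLs to build.
    """
    spec = with_tls.strip()
    if not spec:
        return list(default_tls)

    negate = spec.startswith("^")
    if negate:
        spec = spec[1:]

    names = [name.strip() for name in spec.split(",") if name.strip()]
    if negate:
        excluded = set(names)
        # Preserve the order of the default list while dropping exclusions.
        return [tl for tl in default_tls if tl not in excluded]
    return names
```

For example, with an illustrative default list `["ucp", "nccl", "mlx5"]`, the value `"^mlx5"` would select `["ucp", "nccl"]`, while `"ucp,mlx5"` would select exactly those two.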
## What Potential alternative for #596. This PR implements ALL the reductions (dt/ops) in the ec/cuda executor for persistent mode. It is done by making "device template" functions (common...
## What Adds pipelining support for RAB allreduce algorithm in CL/HIER ## Why ? Potential perf improvement. E.g. we can use TL/SHM for larger msg sizes with pipelining (when single...
## What Fixes config file parsing with respect to inherited variables ## Why ? When the variable in the config table is inherited from another parent table (e.g., TLS var...
## What Implements Shared PD initialization in TL/MLX5
## What Custom IB WQEs implementation: transpose, wait_on_data, umr, rdma