UCS: stats performance enhancements (contig. allocation)
Signed-off-by: Alex Margolin [email protected]
What
Make both the stats nodes and the counters more or less contiguous in (virtual) memory.
Why ?
This is an effort to reduce the average latency of accessing stats, and to put all the counters in a single buffer so that the buffer can be sent over RDMA instead of UDP.
How ?
Both filter and regular stats nodes are allocated from a memory pool (the first chunk should cover common UCX initialization), and the counters are allocated in a flexible array (so there's one buffer containing all of them).
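To illustrate the idea, here is a minimal sketch in C and not the code from this patch. All names (`sketch_stats_node_t`, `sketch_pool_t`, and the functions) are hypothetical, and the pool is simplified to a single fixed-size chunk with bump allocation, whereas a real memory pool would add chunks on demand. It shows a node whose counters sit in a C99 flexible array member, so nodes and counters carved from the same chunk end up back to back in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: the counters follow the header in the same allocation. */
typedef struct sketch_stats_node {
    unsigned num_counters;
    uint64_t counters[];          /* C99 flexible array member */
} sketch_stats_node_t;

/* Hypothetical pool: one chunk, nodes are carved out of it sequentially.
 * (Assumption: a single chunk; a real pool would grow chunk by chunk.) */
typedef struct sketch_pool {
    char   *chunk;
    size_t  size;
    size_t  used;
} sketch_pool_t;

static int sketch_pool_init(sketch_pool_t *pool, size_t size)
{
    pool->chunk = malloc(size);
    pool->size  = size;
    pool->used  = 0;
    return (pool->chunk != NULL) ? 0 : -1;
}

static sketch_stats_node_t *sketch_node_alloc(sketch_pool_t *pool,
                                              unsigned num_counters)
{
    size_t len = sizeof(sketch_stats_node_t) +
                 (num_counters * sizeof(uint64_t));
    sketch_stats_node_t *node;

    /* Round up to a multiple of 8 so the next node stays aligned */
    len = (len + 7) & ~(size_t)7;
    if ((pool->size - pool->used) < len) {
        return NULL;              /* chunk exhausted */
    }

    node        = (sketch_stats_node_t*)(pool->chunk + pool->used);
    pool->used += len;

    node->num_counters = num_counters;
    memset(node->counters, 0, num_counters * sizeof(uint64_t));
    return node;
}

static void sketch_pool_cleanup(sketch_pool_t *pool)
{
    free(pool->chunk);
    pool->chunk = NULL;
    pool->size  = 0;
    pool->used  = 0;
}
```

Because every node comes from the same chunk, the range `chunk .. chunk + used` covers all nodes and their counters, which is the property that would let the whole region be registered and read as one buffer.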
Can one of the admins verify this patch?
ok to test
ok to test
Hi @yosefe,
Can anyone review this PR?
Regards and thanks, Shuki