[Usage]: Does Mooncake support npu->gpu transport currently ?
Describe your usage question
Mooncake v0.3.7.post2
Refer to https://kvcache-ai.github.io/Mooncake/getting_started/quick-start.html#transfer-engine-quick-start
I changed the client_buffer to a tensor and the server_buffer to a GPU tensor, and the process finally crashed with a core dump.
So, does Mooncake currently support NPU->GPU transport?
Before submitting a new issue...
- [x] Make sure you already searched for relevant issues and read the documentation
Mooncake currently doesn't support heterogeneous transports (e.g., a sender with Ascend NPUs and a receiver with NVIDIA GPUs).
Try this: https://github.com/kvcache-ai/Mooncake/blob/main/doc/zh/heterogeneous_ascend.md
@ShangmingCai could you help figure out why `--device_name=mlx5_1` is needed in the command `./transfer_engine_heterogeneous_ascend_perf_initiator --mode=initiator ... --device_name=mlx5_1`? Is there a Mellanox NIC on the 910B server?
@Vikram111-pix I assume it could be the config of the H20 side.
@zuochunwei Can you help explain?
@ShangmingCai thanks for your reply. In the link you pasted, both the target and the initiator need the `--device_name=mlx5_1` argument. From the heterogeneous_ascend code, the data is first copied from HBM to DRAM, and then the RDMA transport writes it to the remote target, so `mlx5_1` is passed to the RDMA transport. But I am not sure whether there is a Mellanox NIC on the 910B server.
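To make the two-stage path easier to discuss, here is a minimal conceptual sketch of the data flow described above (HBM -> DRAM staging buffer -> RDMA write to the remote target). Plain `bytearray` objects stand in for the NPU HBM, host DRAM, and remote memory; the function names `stage_hbm_to_dram` and `rdma_write` are hypothetical placeholders, not Mooncake APIs. In the real implementation the staging buffer would be host memory registered with the RDMA NIC, which is why a `--device_name` such as `mlx5_1` has to be supplied.

```python
# Conceptual sketch only: bytearrays simulate HBM, DRAM, and remote memory.
# The real path uses a device-to-host copy plus Mooncake's RDMA transport.

def stage_hbm_to_dram(hbm: bytes, dram: bytearray) -> None:
    """Step 1: device-to-host copy (NPU HBM -> DRAM staging buffer)."""
    dram[: len(hbm)] = hbm

def rdma_write(dram: bytearray, remote: bytearray, length: int) -> None:
    """Step 2: RDMA write from the registered DRAM buffer to the remote target."""
    remote[:length] = dram[:length]

payload = b"kv-cache block"
dram = bytearray(64)    # host staging buffer (registered with the RDMA NIC)
remote = bytearray(64)  # remote target's memory region

stage_hbm_to_dram(payload, dram)
rdma_write(dram, remote, len(payload))
print(bytes(remote[: len(payload)]))
```

This also makes the cost model clear: every transfer pays for one extra host-side copy before the NIC is involved, so the staging buffer sits on the critical path.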
@zuochunwei could you please help clarify this? Thanks.