Stary
> I'm not sure if it's because this is still a work in progress, but in the current PR it looks like the doc files were simply moved into the...
> Another question: will we continue to maintain `doc/zh` in the future? I think retaining just a few necessary documents in Chinese is enough.
When IBV_ACCESS_RELAXED_ORDERING is set, RDMA write-after-write ordering is no longer guaranteed. I'm not sure whether this affects us.
> > When IBV_ACCESS_RELAXED_ORDERING is set, RDMA write-after-write ordering is no longer guaranteed. I'm not sure whether this affects us. > > @staryxchen Based on our...
Hi @alogfans, can we support multiple transports within a single transfer engine? This would let us select the optimal transport for each type of transfer task. For example, if a...
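To make the request above concrete, here is a minimal, purely hypothetical sketch of one engine holding several transports and choosing one per task. Nothing here is Mooncake's actual API: `TransferTask`, `MultiTransportEngine`, the transport names, and the same-host selection rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferTask:
    """Hypothetical description of one transfer request."""
    src_host: str
    dst_host: str
    length: int  # bytes


class MultiTransportEngine:
    """Toy engine that keeps several transports and picks one per task."""

    def __init__(self):
        self._transports = {}  # transport name -> callable that performs the transfer

    def install(self, name, send_fn):
        self._transports[name] = send_fn

    def select(self, task):
        # Assumed policy, for illustration only: use a local transport when
        # both endpoints are on the same host, otherwise fall back to "rdma".
        if task.src_host == task.dst_host and "local" in self._transports:
            return "local"
        return "rdma"

    def submit(self, task):
        return self._transports[self.select(task)](task)


engine = MultiTransportEngine()
# Stand-in transports that only report which path was taken.
engine.install("local", lambda t: f"local:{t.length}")
engine.install("rdma", lambda t: f"rdma:{t.length}")
```

With this shape, the selection policy lives in one place (`select`), so a different rule could be swapped in without touching the callers.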
> Transfer Engine does not perceive the upper-layer LLM request, so you should measure per-request network latency on the inference side. However, the latency can be recorded for each put/get...
> > > Transfer Engine does not perceive the upper-layer LLM request, so you should measure per-request network latency on the inference side. However, the latency can be recorded for...
@stmatengss I have implemented reporting of task completion delay distributions in PR #1130, PTAL.
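For readers unfamiliar with the idea, a delay distribution can be kept as a fixed-bucket histogram. The sketch below is a generic illustration and is not the code in the PR; the class name `LatencyHistogram`, the microsecond unit, and the bucket bounds are assumptions.

```python
import bisect


class LatencyHistogram:
    """Counts samples into buckets delimited by inclusive upper bounds."""

    def __init__(self, bounds_us):
        self.bounds_us = sorted(bounds_us)
        # One bucket per bound, plus an overflow bucket for larger samples.
        self.counts = [0] * (len(self.bounds_us) + 1)

    def record(self, latency_us):
        # bisect_left places a sample equal to a bound into that bound's bucket.
        idx = bisect.bisect_left(self.bounds_us, latency_us)
        self.counts[idx] += 1

    def report(self):
        labels = [f"<= {b}us" for b in self.bounds_us]
        labels.append(f"> {self.bounds_us[-1]}us")
        return dict(zip(labels, self.counts))


hist = LatencyHistogram([100, 1000, 10000])
for sample_us in (50, 100, 250, 5000, 20000):
    hist.record(sample_us)
```

Fixed buckets keep recording cheap on the completion path; exact percentiles would need the raw samples or a more elaborate sketch structure.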
Hi @JayFzh, I'm not very familiar with Ray, but the Mooncake Transfer Engine doesn't care whether vLLM uses MP or Ray, and it supports P2P transfer between any pair of nodes. I recommend confirming...
I have an idea: to reduce the effort of writing documents in both Chinese and English, we could introduce AI to assist with synchronization. After that, when updates are needed,...