ZelinTan
> Are there any plans to optimize long context latency? I am interested in contributing to P-D split inference architecture and I have machines that support me to develop the...
> > @lumiere-ml @tanzelin430 Are you in the Slack channel? We can follow up on that.
> > @lumiere-ml @tanzelin430 Welcome to join our Slack channel https://join.slack.com/t/sgl-fru7574/shared_invite/zt-2ngly9muu-t37XiH87qvD~6rVBTkTEHw
thanks for invitation,...
> I am also very interested in the scenario of PD disaggregation, and I hope to combine radix tree with PD disaggregation for some experiments. I saw that someone mentioned...
Hi Chenyang, I’d love to test Simon’s new branch! I now have 8 * 4090D GPUs, which don’t support P2P. Not sure if this disadvantage will have any significant impact...