Yiming

Results 4 issues of Yiming

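I implemented a custom protocol based on length + body. Load testing with rpc press shows that with only one downstream node the latency is stable and very low, but once more downstream nodes are added (using rr mode) the P99 latency rises noticeably. The downstream nodes have roughly the same hardware, performance, and network; under normal conditions their processing time is short, and the high-latency requests are evenly distributed across IPs. **Test results with 1 downstream node:** =========================================================================== 2022/07/04-15:38:47 sent:3001 success:3002 error:0 total_error:0 total_sent:261105 2022/07/04-15:38:48 sent:3001 success:3001 error:0 total_error:0 total_sent:264106 2022/07/04-15:38:49...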
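For readers unfamiliar with the term, a "length + body" protocol can be sketched roughly as follows. The issue does not show its wire format, so the 4-byte big-endian length header and the helper names `encode_frame` and `decode_frames` are assumptions made purely for illustration, not the author's implementation.

```python
import struct

# Assumption: the length field is a 4-byte big-endian unsigned integer.
# The issue does not specify the real header layout.
HEADER = struct.Struct("!I")


def encode_frame(body: bytes) -> bytes:
    """Prefix the body with its length (hypothetical helper)."""
    return HEADER.pack(len(body)) + body


def decode_frames(buffer: bytes):
    """Split a byte buffer into complete frames.

    Returns (frames, remaining), where remaining holds any trailing
    bytes that do not yet form a complete frame.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # body not fully received yet
        frames.append(buffer[start:end])
        offset = end
    return frames, buffer[offset:]
```

A receiver would typically append newly read bytes to `remaining` and call `decode_frames` again, so a frame split across reads is reassembled before being handed to the application.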

This doesn't seem to work: `/v2/models/ensemble/generate`

```
{
  "text_input": "...",
  "parameters": {
    "id": "123",
  },
  "sequence_id": "456"
}
```

verbose log: ![image](https://github.com/triton-inference-server/server/assets/10678668/cd186b4b-96b6-4d98-93a8-c14ea99629fc)
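One thing that can be checked locally: the request body as quoted has a trailing comma after `"id": "123"`, which strict JSON parsers reject. The snippet below is a minimal sketch that validates a payload with Python's standard `json` module before sending it. The helper name `check_json` is hypothetical, and the source does not establish whether the trailing comma is related to the reported behavior.

```python
import json

# Request body as quoted in the issue (trailing comma kept on purpose).
raw_body = '{ "text_input": "...", "parameters": { "id": "123", }, "sequence_id": "456" }'

# Same body with the trailing comma removed.
fixed_body = '{ "text_input": "...", "parameters": { "id": "123" }, "sequence_id": "456" }'


def check_json(text: str):
    """Return (parsed, error); exactly one of the two is None (hypothetical helper)."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, f"{exc.msg} at position {exc.pos}"


for label, body in (("as quoted", raw_body), ("comma removed", fixed_body)):
    parsed, error = check_json(body)
    print(label, "->", "valid JSON" if error is None else f"invalid: {error}")
```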

investigating


Currently, when NixlConnector is used inside MultiConnector, the NixlConnectorScheduler and NixlConnectorWorker on the same node are assigned different engine_ids, so the corresponding engine_id cannot be...