AndyZZt
Traced it to the fateflow service never starting; /data/logs/fate/fateflow/error.log repeatedly reports:

```
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/fate_flow_server.py", line 27, in <module>
    from fate_flow import set_env
  File "/data/projects/fate/fateflow/python/fate_flow/__init__.py", line 21, in <module>
    from backports.datetime_fromisoformat import MonkeyPatch
ModuleNotFoundError: No module named 'backports.datetime_fromisoformat'
...
```
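A minimal sketch for checking the fix, assuming the missing package is installable via `pip install backports-datetime-fromisoformat` and that you run it inside the same Python environment fateflow uses (the env path varies per deployment):

```python
# Sketch: verify the backport after `pip install backports-datetime-fromisoformat`.
# Run inside the Python env that fateflow uses; env activation path varies per deployment.
from backports.datetime_fromisoformat import MonkeyPatch

MonkeyPatch.patch_fromisoformat()  # adds datetime.fromisoformat() on older Pythons
from datetime import datetime
print(datetime.fromisoformat("2024-01-01T00:00:00"))  # should print without error
```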
@ShaoYULi12 If you are on a cloud server, the gRPC service may not have been started. File a support ticket to check, then re-run the provided deployment script from start to finish.
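Before re-running the whole script, a quick sketch to test whether the gRPC endpoint is even reachable; this assumes FATE Flow's default grpc_port 9360 on localhost, so adjust host and port to your deployment:

```python
# Sketch: probe a gRPC endpoint's reachability (host/port are assumptions).
import grpc

channel = grpc.insecure_channel("127.0.0.1:9360")
try:
    # Blocks until the channel is ready, or raises FutureTimeoutError after 5s.
    grpc.channel_ready_future(channel).result(timeout=5)
    print("gRPC endpoint reachable")
except grpc.FutureTimeoutError:
    print("gRPC endpoint not reachable: service may be down or the port blocked")
finally:
    channel.close()
```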
> IFB on LLaMA series models is supported. Did you encounter any issue?

When comparing the performance of TRT-LLM with other inference frameworks, I found that TRT-LLM's performance is poor...
Hi Kuntai, I have looked at your current code changes and want to confirm: there is no actual KV cache transfer happening yet, correct? That is, what roadmap item 1 describes as "KV cache transfer can be done immediately". By setting the output length to 1 on one side and skipping prefill on the other, you directly launch two servers so that each simulates its own latency after disaggregation (see the sketch below)? Also, it seems you intend to achieve PD disaggregation through model parallelism, but my understanding is that this would cause trouble when prefill and decode use different parallelism settings. Or am I misunderstanding? Thanks for your reply!
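For reference, a minimal sketch of the prefill-side simulation I am describing, not the actual code in the PR; the model name and prompt are placeholders. Capping generation at one token means end-to-end latency is dominated by the prefill pass:

```python
# Sketch only: approximating a prefill-only server by capping output at 1 token.
# Not the PR's implementation; model and prompt are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")  # placeholder model

# max_tokens=1 => a single decode step after prefill,
# so measured request time approximates prefill cost.
prefill_params = SamplingParams(max_tokens=1)

outputs = llm.generate(["Explain how KV cache transfer works."], prefill_params)
for out in outputs:
    print(out.outputs[0].text)
```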