Estella

5 comments by Estella

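> Problem solved. It was caused by the NTP service being out of sync; after syncing NTP, the problem went away.

I synced NTP and the problem still persists. Also on CentOS 7.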

@Oldpan I got internvl2-2B running, but inference always generates the full max_token number of tokens. Why is that?

end_id = 2 is correct. In my testing, running run.py with --stop_words "" resolves it.
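
For readers unfamiliar with the symptom: a decoding loop normally stops either when the model emits the end token or when the token budget runs out, so output that always reaches the maximum length means the stop condition never fired. The toy sketch below only illustrates that general behavior. It is not TensorRT-LLM code, and `generate` and `toy_next_token` are hypothetical names made up for this example.

```python
from typing import Callable, List


def generate(next_token: Callable[[int], int], end_id: int, max_new_tokens: int) -> List[int]:
    """Toy decoding loop: stop at end_id, otherwise run until the budget is spent."""
    output: List[int] = []
    for step in range(max_new_tokens):
        token = next_token(step)
        if token == end_id:
            break  # the stop condition fired, so generation ends early
        output.append(token)
    return output


def toy_next_token(step: int) -> int:
    """Hypothetical stand-in for a model: emits 5, 6, 7, then token 2, repeating."""
    return [5, 6, 7, 2][step % 4]


# The end token matches what the "model" emits: generation stops early.
stopped_early = generate(toy_next_token, end_id=2, max_new_tokens=8)

# The end token never matches: generation always uses the whole budget.
ran_to_limit = generate(toy_next_token, end_id=99, max_new_tokens=8)
```

In the second call the loop has no way to stop, so the output length always equals `max_new_tokens`, which is the same shape as the behavior reported above.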

> @Amrabdelhamed611 the first error was not reported by flashinfer, that might be related to flashattn package and flashinfer doesn't depend on that. > > Regarding the second issue, check...

I got the same error. Environment: nvcr.io/nvidia/tritonserver:24.03-py3, tensorrt-llm 0.10.0, Qwen2-1.5B-instruct.