John
Same problem here. Has it been solved yet?
> Same problem. Can you load vicuna-13b on dual GPUs? How was the problem finally resolved?

It runs on a single GPU on my side, so it should obviously run on dual GPUs as well.
> A lazy way to solve this is to add a line in fastchat/serve/openai_api_server.py at line 233 with `conv["messages"] = []` after `conv = await get_conv(model_name)`

This does not seem to solve the problem:
openai.error.APIError: Invalid response object from API: '{"object":"error","message":"This model\'s maximum context length is 2048 tokens. However, you requested 2228 tokens (1716 in the messages, 512 in the completion). Please reduce the...
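For reference, the "lazy" patch quoted above amounts to clearing the accumulated conversation history so that past turns do not count against the context window. A standalone sketch of that idea (the `Conversation` class here is a minimal stand-in, not FastChat's actual implementation):

```python
# Illustrative sketch of the quoted workaround: reset the conversation's
# message history before building the next prompt, so only the new turn
# contributes to the token count.
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # Minimal stand-in for the conversation object; real FastChat
    # conversations carry more state (roles, system prompt, etc.).
    messages: list = field(default_factory=list)


def reset_history(conv: Conversation) -> Conversation:
    # Equivalent to inserting `conv["messages"] = []` right after
    # `conv = await get_conv(model_name)` in openai_api_server.py.
    conv.messages = []
    return conv


conv = Conversation(messages=[("user", "hi"), ("assistant", "hello")])
conv = reset_history(conv)
print(len(conv.messages))  # 0
```

Note that this only removes *history*; if a single request already stuffs 1700+ tokens of retrieved context into the messages, as in the error above, clearing the history cannot help.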
Also, if the local knowledge base is the program's built-in samples, it works fine.
> > > When I start FastChat's vicuna-13b API service, set up the config (the API returns results when tested locally), and load the knowledge base (about 1000+ documents; inference works fine with ChatGLM), Q&A fails with a token-limit error even when I just ask "hello".
> > > The error is as follows: openai.error.APIError: Invalid response object from API: '{"object":"error","message":"This model's maximum context length is 2048 tokens. However, you requested 2359 tokens (1847 in the messages,...
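Since the overflow comes from the retrieved knowledge-base context rather than the question itself, one direction is to trim the retrieved chunks so that prompt tokens plus the completion budget stay within the 2048-token window. A hedged sketch, using a crude whitespace token count as a placeholder (a real setup would count tokens with the model's own tokenizer; `fit_context` and its parameters are illustrative, not part of FastChat or Langchain-ChatGLM):

```python
# Sketch: keep only as many retrieved chunks as fit in the context window,
# reserving room for the question and the completion (max_new_tokens).
def count_tokens(text: str) -> int:
    # Crude approximation; replace with the model tokenizer in practice.
    return len(text.split())


def fit_context(question: str, chunks: list,
                max_context: int = 2048, max_new_tokens: int = 512) -> list:
    budget = max_context - max_new_tokens - count_tokens(question)
    kept, used = [], 0
    for chunk in chunks:
        n = count_tokens(chunk)
        if used + n > budget:
            break  # dropping the rest instead of overflowing the window
        kept.append(chunk)
        used += n
    return kept


# Three 600-"token" chunks against a 2048-token window with 512 reserved:
chunks = [("alpha " * 600).strip(), ("beta " * 600).strip(), ("gamma " * 600).strip()]
kept = fit_context("hello", chunks)  # keeps the first two chunks
```

This explains why the built-in samples work (small chunks fit easily) while a 1000+-document knowledge base overflows: the retriever packs in more context than the 2048-token Vicuna window can hold.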
Note: this problem occurs in a K8S container, but does not appear when running directly on the physical machine.