codingl2k1
> OK, I tried to copy the data to each node and it worked, but at the same time two other problems occurred:
>
> 1. Can the read_json of...
This is a Ray error; the Ray actor has crashed. Possible root causes:

1. The worker ran out of memory (Ray's OOM monitor killed the actor when the free memory...
Please refer to OpenAI's Chat API (https://platform.openai.com/docs/api-reference/chat/streaming): the stream does not return token usage:

```json
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
....
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
```

Related...
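As a minimal sketch of consuming these chunks (using the example payloads from the API docs above), note that no chunk carries a `usage` field, so token counts cannot be recovered from a default stream:

```python
import json

# Example chunks copied from the OpenAI streaming response shown above.
chunks = [
    '{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}',
    '{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}',
    '{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}',
]

text = ""
for raw in chunks:
    chunk = json.loads(raw)
    # No streamed chunk includes a "usage" field, unlike a non-streamed
    # chat.completion response, so token usage is unavailable here.
    assert "usage" not in chunk
    delta = chunk["choices"][0]["delta"]
    text += delta.get("content", "")

print(text)  # -> Hello
```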
https://github.com/facebookresearch/llama/issues/380
I have created a PR to langchain adding streaming support for Xinference chat: https://github.com/langchain-ai/langchain/pull/12702.
Similar issue: https://github.com/moymix/TaskMatrix/issues/116
I am using the latest master. Qwen2 7B and Qwen1.5 7B are the latest versions from Hugging Face. The streamed result can't be parsed as a tool call result, so...
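For illustration (the fragments below are hypothetical, not actual Qwen output), a sketch of why a partially streamed chunk cannot be parsed as tool-call arguments: the JSON arguments arrive split across deltas, and only the fully accumulated string is valid JSON.

```python
import json

# Hypothetical tool-call argument fragments as they might arrive over a
# stream: the arguments string is split mid-token across delta chunks.
fragments = ['{"location": "Par', 'is", "unit": "celsius"}']

# Parsing a single fragment fails: it is not complete JSON yet.
try:
    json.loads(fragments[0])
    partial_ok = True
except json.JSONDecodeError:
    partial_ok = False

# Only the concatenation of all fragments parses as the tool-call arguments.
args = json.loads("".join(fragments))
print(partial_ok, args)  # -> False {'location': 'Paris', 'unit': 'celsius'}
```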
> > The models on ModelScope were not uploaded by the official team
>
> Yes. You should always follow the official model at HuggingFace.

Thanks. Is HuggingFace the only official model source?
Yes, tool calls for the glm4 vllm backend have not been implemented yet. Would you be interested in contributing a fix?