
[chatglm] Server Unavailable Error, Internal Server Error

Open zps-509 opened this issue 1 year ago • 9 comments

Self Checks

  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.5.11

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Connected chatglm3 to Dify; the basic assistant can chat normally.
  2. Set up stable diffusion; it can generate images normally.
  3. Connected stable diffusion to a Dify agent assistant; it generates an image, but the image disappears immediately afterwards and chatglm raises the error "[chatglm] Server Unavailable Error, Internal Server Error".

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

zps-509 avatar Apr 01 '24 11:04 zps-509

Hey @zps-509, good to see you back! Running into a bit of a quirky issue, huh? Let’s see what we can do about that.

Based on the information you've provided, it seems the issue might be related to how the response from the stable diffusion process is being handled. The error message "[chatglm] Server Unavailable Error, Internal Server Error" suggests that there might be an issue with the server hosting the ChatGLM service, but it could also be a symptom of how the response is being processed.

A similar issue was discussed in the Completion Request Error thread. The solution suggested there was to differentiate between simple and streaming responses in your client code. For simple JSON responses, you can access the response body directly. However, for streaming responses, you should consume the stream properly instead of trying to access it as plain text. Here's how you can handle both types of responses correctly:

  1. For Simple JSON Responses: If you know the response is a simple JSON object (dict in Python), you can access it as follows:

    response = compact_response(some_dict)
    json_data = response.get_json()  # Assuming Flask's `Response` object
    
  2. For Streaming Responses: If the response is a stream (for example, when compact_response is called with a generator), you should consume the stream:

    response = compact_response(some_generator)
    if response.mimetype == 'text/event-stream':
        for line in response.response:
            # Process each line from the stream
            print(line)
    

    Note: The actual method to consume the stream may vary depending on how you're making the request (e.g., using requests in Python or JavaScript's fetch API). The key point is that you should not attempt to access .text or its equivalent directly on a streaming response; see the sketch below.
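As an illustration, here is a minimal, self-contained sketch of consuming a server-sent-event stream with Python's requests library. The URL, payload, and authorization header are placeholders, not Dify's actual API:

    import requests

    # Hypothetical streaming endpoint -- substitute your own URL and payload.
    resp = requests.post(
        "http://localhost:5001/v1/chat-messages",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        json={"query": "hello", "response_mode": "streaming"},
        stream=True,  # tell requests not to buffer the whole body up front
    )
    for line in resp.iter_lines(decode_unicode=True):
        if line:  # SSE separates events with blank lines; skip those
            print(line)  # each non-empty line is one event, e.g. 'data: {...}'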

If the issue persists, it might be worth checking the logs of each container to identify any errors. You can use the command docker compose logs <service_name> for each service, especially the api and worker services, as they are directly involved in processing requests. Also, ensure that all services are running correctly using docker compose ps and that Docker has enough resources allocated.

As for the role of the chatglm3 model in the Dify setup and its interaction with the stable diffusion process, I wasn't able to find specific information within the repository. However, it's likely that the chatglm3 model is used to process and generate the conversational responses, while the stable diffusion process is used for image generation. The issue seems to arise when these two processes interact, possibly due to how the response from the stable diffusion process is being handled.

I hope this helps! If you have any more information or if there's anything else you'd like me to look into, please let me know.


About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Apr 01 '24 11:04 dosubot[bot]

I just configured the same chatglm and the same stable diffusion at https://udify.app/chat/VfbjY8kvFH76I3CA and it worked: images are generated with no errors. But in my self-hosted Docker deployment, using exactly the same method, it's still the same: an image is generated, then the error appears right away and the image disappears.

zps-509 avatar Apr 01 '24 13:04 zps-509

INFO:httpx:HTTP Request: GET http://192.168.133.98:20009/sdapi/v1/sd-models "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://192.168.133.98:55555/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://192.168.133.98:20009/sdapi/v1/sd-models "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://192.168.133.98:55555/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
INFO:openai._base_client:Retrying request to /chat/completions in 0.957732 seconds
INFO:httpx:HTTP Request: POST http://192.168.133.98:55555/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
INFO:openai._base_client:Retrying request to /chat/completions in 1.993984 seconds
INFO:httpx:HTTP Request: POST http://192.168.133.98:55555/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
ERROR:core.application_manager:Unknown Error when generating
Traceback (most recent call last):
  File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 96, in invoke
    result = self._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
  File "/app/api/core/model_runtime/model_providers/chatglm/llm/llm.py", line 66, in _invoke
    return self._generate(
  File "/app/api/core/model_runtime/model_providers/chatglm/llm/llm.py", line 176, in _generate
    result = client.chat.completions.create(
  File "/usr/local/lib/python3.10/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 667, in create
    return self._post(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1208, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 897, in request
    return self._request(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 973, in _request
    return self._retry_request(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1021, in _retry_request
    return self._request(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 973, in _request
    return self._retry_request(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1021, in _retry_request
    return self._request(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 988, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Internal Server Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/api/core/application_manager.py", line 173, in _generate_worker
    runner.run(
  File "/app/api/core/app_runner/assistant_app_runner.py", line 248, in run
    self._handle_invoke_result(
  File "/app/api/core/app_runner/app_runner.py", line 230, in _handle_invoke_result
    self._handle_invoke_result_stream(
  File "/app/api/core/app_runner/app_runner.py", line 263, in _handle_invoke_result_stream
    for result in invoke_result:
  File "/app/api/core/features/assistant_fc_runner.py", line 111, in run
    chunks: Union[Generator[LLMResultChunk, None, None], LLMResult] = model_instance.invoke_llm(
  File "/app/api/core/model_manager.py", line 70, in invoke_llm
    return self.model_type_instance.invoke(
  File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 111, in invoke
    raise self._transform_invoke_error(e)
core.model_runtime.errors.invoke.InvokeServerUnavailableError: [chatglm] Server Unavailable Error, Internal Server Error

These are the docker_api logs; please take a look and help me figure out how to handle this.

zps-509 avatar Apr 01 '24 13:04 zps-509

[root@localhost docker]$ curl -X POST "http://192.168.133.98:55555/v1/chat/completions" -H "Content-Type: application/json" -d '{"model": "chatglm3-6b", "messages": [{"role": "system", "content": ""}, {"role": "user", "content": "hello Word"}], "stream": false, "max_tokens": 100, "temperature": 0.8, "top_p": 0.8}'

{"model":"chatglm3-6b","object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I assist you today?","name":null,"function_call":null},"finish_reason":"stop"}],"created":1711979906,"usage":{"prompt_tokens":11,"total_tokens":23,"completion_tokens":12}}

zps-509 avatar Apr 01 '24 13:04 zps-509

This is most likely because the official OpenAI-compatible demo server in the ChatGLM repo is not a production-ready solution and frequently raises errors. For a better experience, we recommend XInference as the inference service.
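Since XInference exposes an OpenAI-compatible endpoint, switching over mostly means changing the base URL. A rough sketch, assuming XInference runs on its default port 9997 and a chatglm3 model has already been launched there (host and model name are assumptions):

    from openai import OpenAI

    # Assumes an XInference server on its default port with chatglm3 launched.
    client = OpenAI(base_url="http://192.168.133.98:9997/v1", api_key="not-needed")
    resp = client.chat.completions.create(
        model="chatglm3",  # the model name/UID as registered in XInference
        messages=[{"role": "user", "content": "hello"}],
    )
    print(resp.choices[0].message.content)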

Yeuoly avatar Apr 02 '24 10:04 Yeuoly

I switched to XInference, and tools are now invoked normally with no errors. But my prompt seems to have stopped working. What could be going on here?

zps-509 avatar Apr 05 '24 03:04 zps-509

Sounds weird, could you please provide a bit more detail about that?

Yeuoly avatar Apr 05 '24 11:04 Yeuoly

  1. Using chatglm3 directly as the LLM, with a prompt saying "You are a customer-service agent named XX", I ask it "Who are you?" and it replies "I am customer-service agent XX, how can I help you?"
  2. Using the chatglm3 served by XInference as the LLM, with the same prompt, I ask "Who are you?" and it answers "I am an AI assistant called ChatGLM3-6B".

Comparing the two, it feels like the prompt stopped working after switching to XInference.
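One way to narrow this down is to bypass Dify and send the same system prompt straight to the XInference endpoint; if the model still ignores it there, the issue lies in how the serving side applies the system role rather than in Dify. A minimal sketch (host, port, and model name are assumptions carried over from above):

    from openai import OpenAI

    client = OpenAI(base_url="http://192.168.133.98:9997/v1", api_key="not-needed")
    resp = client.chat.completions.create(
        model="chatglm3",
        messages=[
            {"role": "system", "content": "You are a customer-service agent named XX."},
            {"role": "user", "content": "Who are you?"},
        ],
    )
    print(resp.choices[0].message.content)  # does the system prompt take effect?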

zps-509 avatar Apr 07 '24 07:04 zps-509

Hi, @zps-509,

I'm helping the Dify team manage their backlog and am marking this issue as stale. It looks like you encountered a "Server Unavailable Error, Internal Server Error" when using DIFY with chatglm3 and stable diffusion. There were some detailed responses from user dosubot suggesting potential solutions and troubleshooting steps, as well as a recommendation from user Yeuoly to use XInference as the inferencing service for a better experience. However, it seems that the issue remains unresolved despite the suggested actions.

Could you please confirm if this issue is still relevant to the latest version of the Dify repository? If it is, please let the Dify team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

dosubot[bot] avatar May 08 '24 16:05 dosubot[bot]

Please upgrade to the latest version to see if this issue persists. :)

crazywoola avatar May 28 '24 07:05 crazywoola