
Applications of the agent type couldn't stream return results when xinference was used as the model provider

Open liuzhenghua opened this issue 1 year ago • 3 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

  1. Create an application with the type "agent."
  2. Set the model to be provided by xinference.
  3. Engage in a conversation using the application.

✔️ Expected Behavior

The agent-type application streams return results when xinference is used as the model provider.

❌ Actual Behavior

The agent-type application does not stream return results when xinference is used as the model provider.

liuzhenghua avatar Jul 04 '24 06:07 liuzhenghua

This happens because the current xinference implementation does not support streaming when tool calls are involved. This is confirmed by the comments in the test file and the code in llm.py:

"""
    Function calling of xinference does not support stream mode currently
"""
if stream:
    if tools and len(tools) > 0:
        raise InvokeBadRequestError('xinference tool calls does not support stream mode')

Additionally, the BaseAgentRunner class includes a check to determine if the model supports streaming tool calls, which further confirms that xinference does not support this feature:

# check if model supports stream tool call
llm_model = cast(LargeLanguageModel, model_instance.model_type_instance)
model_schema = llm_model.get_model_schema(model_instance.model, model_instance.credentials)
if model_schema and ModelFeature.STREAM_TOOL_CALL in (model_schema.features or []):
    self.stream_tool_call = True
else:
    self.stream_tool_call = False

Therefore, the inability to stream return results when using xinference as the model provider is a known limitation [1][2][3].

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Jul 04 '24 06:07 dosubot[bot]

I tested the LLM via xinference; it can stream return results.

EnzoLiang avatar Jul 05 '24 05:07 EnzoLiang

I tested the LLM via xinference; it can stream return results.

Did you create an app of the agent type?

liuzhenghua avatar Jul 05 '24 06:07 liuzhenghua