
The answers to the same question are confusing

Open EnzoLiang opened this issue 1 year ago • 2 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

I created an agent and added a math tool to it, then selected a Qwen LLM deployed by Xinference for testing. When I first asked the following question, the bot hallucinated, and upon clicking the Agent log, a "provider type not found" prompt appeared in the top right corner. However, when I asked the same question again, it responded normally.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

EnzoLiang avatar Jul 02 '24 06:07 EnzoLiang

@EnzoLiang Hello there! I'm here to help with bugs, answer questions, and assist you while waiting for a human maintainer. Let's squash those bugs together! 😉

The "provider type not found" error on the first attempt but normal response on the second attempt could be due to an issue with the model configuration or the implementation of the model invocation code. Specifically, it might be related to how the model type is defined and initialized in the YAML configuration and the corresponding Python code.

Ensure that the model type is correctly specified in the YAML configuration file and that the model invocation code properly handles the initialization and invocation of the model. Here is a checklist to verify:

  1. YAML Configuration: Ensure that the model type and other properties are correctly defined in the YAML file.

    model: qwen-llm  # Model identifier
    label:
      en_US: Qwen LLM
    model_type: llm  # Model type
    features:
    - agent-thought
    model_properties:
      mode: chat
      context_size: 200000
    parameter_rules:
    - name: temperature
      use_template: temperature
    - name: top_p
      use_template: top_p
    - name: top_k
      type: int
      required: false
    - name: max_tokens_to_sample
      use_template: max_tokens
      default: 4096
      min: 1
      max: 4096
    pricing:
      input: '8.00'
      output: '24.00'
      unit: '0.000001'
      currency: USD
    
  2. Model Invocation Code: Ensure that the model invocation code in llm.py is correctly implemented and handles both synchronous and streaming responses properly.

    def _invoke(self, model: str, credentials: dict,
                prompt_messages: list[PromptMessage], model_parameters: dict,
                tools: Optional[list[PromptMessageTool]] = None, stop: Optional[list[str]] = None,
                stream: bool = True, user: Optional[str] = None) \
            -> Union[LLMResult, Generator]:
        # Dispatch to the streaming or synchronous handler.
        if stream:
            return self._handle_stream_response(model, credentials, prompt_messages, model_parameters, tools, stop, user)
        return self._handle_sync_response(model, credentials, prompt_messages, model_parameters, tools, stop, user)
    
    def _handle_stream_response(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                                model_parameters: dict, tools: Optional[list[PromptMessageTool]] = None,
                                stop: Optional[list[str]] = None, user: Optional[str] = None) -> Generator:
        # `response` is the streaming response from the provider API call
        # (omitted here); yield each chunk as it arrives.
        for chunk in response:
            yield chunk
    
    def _handle_sync_response(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                              model_parameters: dict, tools: Optional[list[PromptMessageTool]] = None,
                              stop: Optional[list[str]] = None, user: Optional[str] = None) -> LLMResult:
        # `response` is the complete response from the provider API call
        # (omitted here), unpacked into an LLMResult.
        return LLMResult(**response)
    
  3. Error Mapping: Ensure that the error mapping is correctly implemented to handle different types of invocation errors.

    @property
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        return {
            InvokeConnectionError: [ConnectionError],
            InvokeServerUnavailableError: [ServerUnavailableError],
            InvokeRateLimitError: [RateLimitError],
            InvokeAuthorizationError: [AuthorizationError],
            InvokeBadRequestError: [BadRequestError]
        }
    

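As a minimal sketch, the error-mapping table in step 3 is typically applied by walking the dictionary and translating a raw provider exception into the matching invoke error. All class names below are simplified stand-ins for illustration, not Dify's actual classes:

```python
# Simplified stand-in hierarchy; Dify's real classes live elsewhere.
class InvokeError(Exception): pass
class InvokeConnectionError(InvokeError): pass
class InvokeRateLimitError(InvokeError): pass

class RateLimitError(Exception):  # hypothetical provider-side error
    pass

# Same shape as the mapping returned by `_invoke_error_mapping`.
_INVOKE_ERROR_MAPPING = {
    InvokeConnectionError: [ConnectionError],
    InvokeRateLimitError: [RateLimitError],
}

def map_error(exc: Exception) -> InvokeError:
    """Translate a raw exception into the matching InvokeError subclass."""
    for invoke_error, raw_errors in _INVOKE_ERROR_MAPPING.items():
        if isinstance(exc, tuple(raw_errors)):
            return invoke_error(str(exc))
    return InvokeError(str(exc))  # unmapped errors fall through
```

If a raw exception type is missing from the mapping, it surfaces as a generic `InvokeError`, which can produce confusing top-level messages like the one reported here.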
Additionally, ensure that the Xinference service is correctly deployed and accessible. Here are the steps to verify:

  1. Deploy Xinference: Ensure that Xinference is installed and started correctly.

    $ pip install "xinference[all]"
    $ xinference-local
    
  2. Access Endpoint: Verify that the Xinference service is running and accessible at the specified endpoint (e.g., http://127.0.0.1:9997).
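A quick way to verify reachability is to hit the model-listing route. This sketch assumes the default local endpoint `http://127.0.0.1:9997` and Xinference's OpenAI-compatible `/v1/models` path:

```python
import json
import urllib.error
import urllib.request

def service_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers the model-listing request with JSON."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            json.load(resp)  # a valid JSON body confirms the API is serving
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False

if __name__ == "__main__":
    print("Xinference reachable:", service_is_up("http://127.0.0.1:9997"))
```

If this returns False, fix the deployment or the URL before debugging anything inside Dify.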

  3. Network Configuration: If using Docker, ensure that the network configuration allows the Dify container to access the Xinference endpoint. Inside the Dify container, localhost refers to the container itself, not the host machine; use the host's IP address instead.
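As an illustration of that networking point, a loopback URL can be rewritten to an address a container can reach. `host.docker.internal` resolves to the host on Docker Desktop (macOS/Windows); on Linux, substitute the host's LAN IP. The helper name is hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit

def rewrite_for_container(url: str, host_alias: str = "host.docker.internal") -> str:
    """Replace a loopback hostname with an address reachable from a container."""
    parts = urlsplit(url)
    if parts.hostname in ("localhost", "127.0.0.1"):
        # Preserve the original port while swapping the host.
        netloc = host_alias if parts.port is None else f"{host_alias}:{parts.port}"
        return urlunsplit(parts._replace(netloc=netloc))
    return url  # non-loopback hosts are already reachable

print(rewrite_for_container("http://127.0.0.1:9997"))
# → http://host.docker.internal:9997
```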

  4. Model Deployment: Ensure that the model is correctly deployed in Xinference and that you have the correct model UID.

  5. Dify Configuration: In Dify, configure the Xinference model provider with the correct model name, server URL, and model UID.

By ensuring that the model type is correctly defined, the invocation code is properly implemented, and the Xinference service is correctly deployed and accessible, you can address the "provider type not found" error and the hallucination issue [1][2].

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Jul 02 '24 06:07 dosubot[bot]

@dosu Thank you for your reply, but I don't think that's the reason. It answers other questions normally. Here is a screenshot of the specific error:

EnzoLiang avatar Jul 02 '24 06:07 EnzoLiang