
bug: max_tokens limit exceeded when changing LLM

Open · Weaxs opened this issue 1 year ago • 1 comment

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.7.3

Cloud or Self Hosted

Cloud

Steps to reproduce

  1. Choose GPT-4o-mini.
  2. Set max_tokens to 16384 and save.
  3. Change the model to SiliconFlow's meta-llama/Meta-Llama-3.1-405B-Instruct (the saved max_tokens is carried over, as sketched below).
  4. Run the workflow.
  5. The LLM node reports an error.
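A hedged illustration of why step 5 fails: the max_tokens saved in step 2 persists in the node configuration after the model switch. The excerpt below sketches what the LLM node might look like in an exported workflow DSL; the field names follow Dify's DSL conventions, but the layout and values are illustrative, not copied from a real export.

```yaml
# Hypothetical excerpt of the LLM node from an exported workflow DSL.
# The max_tokens saved for GPT-4o-mini survives the model switch and
# now exceeds the 4096-token cap of the newly selected model.
- data:
    type: llm
    model:
      provider: siliconflow
      name: meta-llama/Meta-Llama-3.1-405B-Instruct
      mode: chat
      completion_params:
        max_tokens: 16384   # stale value carried over from GPT-4o-mini
```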


✔️ Expected Behavior

No response

❌ Actual Behavior

No response

Weaxs · Sep 05 '24 08:09

I found two similar discussions that might be relevant to your issue:

  1. Max Token Limits Incorrect? - This discussion is still open and was last updated on August 14, 2024.
  2. Error when I modify the max_tokens of the custom Maximum chunk length from 1000 to 2000 - This discussion is also open and was last updated on May 4, 2024 [1][2].

Regarding your specific issue: the configuration interface for the meta-llama/Meta-Llama-3.1-405B-Instruct model accepts a high max_tokens value, but the model's actual operational limit is 4096 tokens. Such discrepancies can arise from implementation constraints, API limitations, or practical performance considerations that the configuration interface does not reflect [3][4].

The default value of the max_tokens parameter in the Dify configuration for the meta-llama/Meta-Llama-3.1-405B-Instruct model is 512, with a maximum allowed value of 4096 [5].

The validation for the max_tokens parameter is implemented in the api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-8b-instruct.yaml file, where it is restricted to a maximum of 4096 tokens [6][7].
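For context, parameter limits in these provider YAML files are declared under parameter_rules. The snippet below is a minimal sketch of such a rule, in the style of the file cited above; the values mirror the default and maximum mentioned in this thread, but it is not a verbatim copy of the repository file.

```yaml
# Hypothetical sketch of a max_tokens parameter rule, in the style of
# api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-405b-instruct.yaml.
# Check the actual file for the authoritative limits.
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    default: 512   # default reported above [5]
    min: 1
    max: 4096      # values above this fail validation
```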

If you need further assistance, you might want to check the configuration file api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-405b-instruct.yaml for detailed settings and limitations [8][9].
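Until the UI clamps carried-over values automatically, one possible workaround on the caller's side is to clamp max_tokens to the new model's declared maximum before running the node. The helper below is a minimal sketch under that assumption; clamp_max_tokens and the dict-shaped rule are hypothetical, not part of Dify's API.

```python
# Hypothetical sketch: clamp a max_tokens value carried over from a
# previous model to the limits declared by the newly selected model.
# `rule` mirrors a parameter_rules entry from the provider YAML.

def clamp_max_tokens(requested: int, rule: dict) -> int:
    """Return a max_tokens value that satisfies the model's rule."""
    lo = rule.get("min", 1)
    hi = rule.get("max", rule.get("default", 4096))
    return max(lo, min(requested, hi))

# Example: the value saved for GPT-4o-mini (16384) exceeds the
# 4096-token cap of the newly selected model and gets clamped.
rule = {"name": "max_tokens", "default": 512, "min": 1, "max": 4096}
print(clamp_max_tokens(16384, rule))  # -> 4096
```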

To continue talking to Dosu, mention @dosu.

dosubot[bot] · Sep 05 '24 08:09