
Wrong max tokens limit for OpenRouter Sonnet 3.5 in Chat/Agent apps

Open infinitnet opened this issue 1 year ago • 1 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.9.1

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Set up a chat or agent app with Sonnet 3.5 through OpenRouter. The max tokens limit will be 4096, even though the config in api/core/model_runtime/model_providers/openrouter/llm/claude-3-5-sonnet.yaml was adjusted in pull request #9068 to address bug report #9067. For some reason, that change fixed max_tokens only for LLM nodes, not for agent and chat apps.

✔️ Expected Behavior

max_tokens should be 8192

❌ Actual Behavior

max_tokens is 4096
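
To make the symptom concrete, here is a minimal sketch of how a requested value gets clamped to a parameter rule's upper bound. This is not Dify's actual code: the function `clamp_max_tokens` and the rule dictionaries are hypothetical, and the `min`/`max` keys are only assumed to resemble the fields in the provider YAML. It shows that if an app still reads a rule with the old limit, a request for 8192 comes back as 4096.

```python
def clamp_max_tokens(requested: int, rule: dict) -> int:
    """Clamp a requested max_tokens value to the [min, max] range of a rule.

    Hypothetical helper for illustration only; not part of Dify.
    """
    low = rule.get("min", 1)
    high = rule.get("max", requested)
    return max(low, min(requested, high))


# Assumed shape of a parameter rule before and after the config fix.
stale_rule = {"name": "max_tokens", "min": 1, "max": 4096}
fixed_rule = {"name": "max_tokens", "min": 1, "max": 8192}

print(clamp_max_tokens(8192, stale_rule))  # 4096: the behavior reported here
print(clamp_max_tokens(8192, fixed_rule))  # 8192: the expected behavior
```

If chat and agent apps read the rule from a different place than LLM nodes do (an assumption, not something confirmed in this report), the two code paths could disagree in exactly this way.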

infinitnet avatar Oct 08 '24 09:10 infinitnet

A similar problem was reported in issue #9067, which was closed by pull request #9068 on October 08, 2024. However, it seems the fix applied only to LLM nodes, not to Chat/Agent apps [1].

The configuration for OpenRouter Sonnet 3.5 specifies a max token limit of 4096, which might be overriding other settings [2]. You may need to check if there are any specific configurations for Chat/Agent apps that could be affecting this, such as those in chatglm3-6b-32k.yaml, which allows a range up to 32000 tokens [3].

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Oct 08 '24 09:10 dosubot[bot]

For some reason my pull request for #9067 did not work as expected either. Sonnet 3.5 through OpenRouter is still limited to 4096 tokens, although the config file has been corrected. It's as if it's pulling the max_tokens limit from elsewhere. Does anyone have an idea?

infinitnet avatar Oct 11 '24 13:10 infinitnet

Works in 0.9.2 - not sure why it didn't work before with locally applied changes.

infinitnet avatar Oct 14 '24 09:10 infinitnet