Wrong max tokens limit for OpenRouter Sonnet 3.5 in Chat/Agent apps
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.9.1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
Set up a chat or agent app that uses Sonnet 3.5 through OpenRouter. Max tokens is capped at 4096, even though the config was adjusted in pull request #9068 (for bug report #9067) in api/core/model_runtime/model_providers/openrouter/llm/claude-3-5-sonnet.yaml ... However, for some reason that change only fixed max_tokens for LLM nodes, not for agent and chat apps.
✔️ Expected Behavior
max_tokens should be 8192
❌ Actual Behavior
max_tokens is 4096
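To illustrate the symptom, here is a minimal sketch of how a stale limit would cap a request. It assumes the model YAML parses into a dict with a `parameter_rules` list whose `max_tokens` entry carries a `max` field. The function names and sample dicts are hypothetical and are not taken from the Dify codebase.

```python
from typing import Any, Optional


def max_tokens_limit(model_config: dict[str, Any]) -> Optional[int]:
    """Return the `max` of the max_tokens rule, or None if no such rule exists."""
    # Assumed layout: {"parameter_rules": [{"name": "max_tokens", "max": ...}, ...]}
    for rule in model_config.get("parameter_rules", []):
        if rule.get("name") == "max_tokens":
            return rule.get("max")
    return None


def effective_max_tokens(requested: int, model_config: dict[str, Any]) -> int:
    """Clamp the requested value to the configured limit, if one is set."""
    limit = max_tokens_limit(model_config)
    return requested if limit is None else min(requested, limit)


# Hypothetical configs: one as loaded before the fix, one after.
stale_config = {"parameter_rules": [{"name": "max_tokens", "max": 4096}]}
fixed_config = {"parameter_rules": [{"name": "max_tokens", "max": 8192}]}

print(effective_max_tokens(8192, stale_config))
print(effective_max_tokens(8192, fixed_config))
```

If the app still reports 4096 after the file has been edited, the running service may be reading a different or cached copy of the config. That is only a guess and has not been confirmed.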
A similar issue was reported in issue #9067, which was closed by pull request #9068 on October 08, 2024. However, it seems the fix did not apply to Chat/Agent apps, only to LLM nodes [1].
The configuration for OpenRouter Sonnet 3.5 specifies a max token limit of 4096, which might be overriding other settings [2]. You may need to check if there are any specific configurations for Chat/Agent apps that could be affecting this, such as those in chatglm3-6b-32k.yaml, which allows a range up to 32000 tokens [3].
For some reason my pull request for #9067 did not work as expected either. Sonnet 3.5 through OpenRouter is still limited to 4096 tokens, although the config file has been corrected. It's as if it's pulling the max_tokens limit from elsewhere. Does anyone have an idea?
Works in 0.9.2 - not sure why it didn't work before with locally applied changes.