
optional max_tokens

Open Algorithm5838 opened this issue 1 year ago • 6 comments

Add a checkbox to optionally enable max_tokens instead of leaving it disabled outright. This would be useful for OpenAI models, as well as models from OpenRouter and other platforms. I've set the default to 2048 for smaller-context models (4k); however, 4096 is the preferred setting for newer models from OpenAI and Anthropic: despite supporting much larger contexts, their output is capped at 4096 tokens.
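
Roughly what I have in mind (a minimal sketch; the field names are illustrative, not NextChat's actual config):

    // Hypothetical model config -- field names are illustrative only.
    interface ModelConfig {
      model: string;
      enableMaxTokens: boolean; // the proposed checkbox
      maxTokens: number; // e.g. 2048 for 4k-context models, 4096 for newer ones
    }

    // Only send max_tokens when the checkbox is enabled; omit it otherwise.
    function buildPayload(config: ModelConfig, messages: unknown[]) {
      return {
        model: config.model,
        messages,
        ...(config.enableMaxTokens ? { max_tokens: config.maxTokens } : {}),
      };
    }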

Algorithm5838 avatar Mar 27 '24 05:03 Algorithm5838

@Algorithm5838 is attempting to deploy a commit to the NextChat Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Mar 27 '24 05:03 vercel[bot]

Your build has completed!

Preview deployment

github-actions[bot] avatar Mar 27 '24 05:03 github-actions[bot]


@Algorithm5838 Just letting you know, there is a bug related to the attach messages feature due to the max_tokens setting in this chat.ts file. The logic needs to be refactored because the way attach messages work is not consistent, depending on the max_tokens value.

H0llyW00dzZ avatar Mar 28 '24 19:03 H0llyW00dzZ

related issue:

  • #4303

H0llyW00dzZ avatar Mar 28 '24 19:03 H0llyW00dzZ

You are correct. I encountered it before and solved it by commenting out this part:

          // loop condition in chat.ts; the token-threshold check is disabled:
          i >= contextStartIndex; // && tokenCount < maxTokenThreshold;

The issue with the logic is that it assumes max_tokens covers input + output, when it actually applies to output only. The right way is to track the context (input) token limit per model instead.
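
A sketch of the corrected budgeting, assuming a hypothetical per-model context-window table (the values here are examples):

    // Hypothetical per-model context windows -- example values only.
    const CONTEXT_WINDOW: Record<string, number> = {
      "gpt-3.5-turbo": 4096,
      "gpt-4-turbo": 128000,
    };

    // max_tokens caps output only, so the input budget is the model's
    // context window minus the tokens reserved for the reply.
    function inputBudget(model: string, maxTokens: number): number {
      const window = CONTEXT_WINDOW[model] ?? 4096; // conservative fallback
      return Math.max(0, window - maxTokens);
    }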

Algorithm5838 avatar Mar 28 '24 19:03 Algorithm5838


I figured that out a few weeks ago when trying to implement support for Anthropic with my friends.

H0llyW00dzZ avatar Mar 28 '24 20:03 H0llyW00dzZ