Increase the output token window for known-good enterprise models
This is a workaround until we have proper support for https://github.com/sourcegraph/sourcegraph/pull/61885
We recently increased our output token lengths for PLG, and Enterprise is now lagging behind. Certain Enterprise setups prevent us from applying the same defaults (see this thread for context).
The workaround is to allow only a handful of recommended models to use the extended token windows. Some entries use the Cody Gateway format (with the provider prefix) and only match Cody Gateway configs, while others use the Bedrock format (which contains the prefix with a `.` and a special version identifier at the end) and should thus only affect Bedrock users. I also intentionally included only recent flagship models, not older ones like Claude 2, to avoid issues with older configurations.
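A minimal sketch of the allowlist approach described above. The model IDs, token limits, and function names here are illustrative assumptions, not the exact values from this PR:

```typescript
// Hypothetical allowlist of models that get the extended output window.
// IDs shown are examples of the two formats described above.
const EXTENDED_OUTPUT_MODELS = new Set<string>([
    // Cody Gateway format: provider prefix + model name
    'anthropic/claude-3-5-sonnet-20240620',
    // Bedrock format: dotted prefix + version identifier suffix
    'anthropic.claude-3-sonnet-20240229-v1:0',
])

// Illustrative limits: the old conservative default vs. the extended window.
const DEFAULT_OUTPUT_TOKENS = 1000
const EXTENDED_OUTPUT_TOKENS = 4000

// Any model not on the allowlist keeps the old default, so unknown
// Enterprise configurations are unaffected.
function maxOutputTokens(modelID: string): number {
    return EXTENDED_OUTPUT_MODELS.has(modelID)
        ? EXTENDED_OUTPUT_TOKENS
        : DEFAULT_OUTPUT_TOKENS
}
```

Gating on exact model IDs rather than providers means an Enterprise instance pointing at an unrecognized model keeps the old behavior by default.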
Test plan
- Connect to the demo instance and see the longer result window in the output tab
- Got a response of ~11k chars, which is ~2.2k GPT-4 tokens, so probably over 1k Anthropic tokens
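As a rough sanity check on the estimate above, the common ~4-characters-per-token heuristic lands in the same ballpark (the exact count depends on the tokenizer, which is why the measured GPT-4 figure of ~2.2k differs slightly):

```typescript
// Back-of-the-envelope token estimate using the ~4 chars/token heuristic.
const responseChars = 11_000
const approxTokens = Math.round(responseChars / 4)
// Either way, the response is well above the old 1000-token cap.
const exceedsOldCap = approxTokens > 1000
```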
Test plan: worked for me on s2.
- On the latest build, signed into s2, the output channel shows the request has `"maxTokensToSample":1000,`
- On this branch, signed into s2 (after I changed s2's model to Sonnet), the output channel shows `"maxTokensToSample":1000,` and I got a response in chat with 3839 chars, which is probably over 1000 tokens