Support for stream: false via extra-openai-models.yaml
It seems that `register_model()` in `openai_models.py` doesn't currently expect a `stream` key in the .yaml file, so `can_stream` is set to `True` by default.
Many organizations use an internal OpenAI-compatible API proxy or gateway to access OpenAI and to control the keys. For this, `extra-openai-models.yaml` is the easy way to make proxies just work. However, there currently doesn't seem to be a mechanism to pass the equivalent of `--no-stream` via this .yaml: every OpenAI-compatible gateway is assumed to be capable of streaming.
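For illustration, here's roughly what I'd like to be able to write. The `model_id`, `model_name` and `api_base` keys already exist; the `can_stream` key is the proposed addition (hypothetical, not currently supported), and the URL is a made-up placeholder:

```yaml
- model_id: my-custom-gpt-4o
  model_name: gpt-4o
  api_base: https://gateway.example.internal/v1
  # Proposed addition, not currently supported:
  can_stream: false
```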
This makes `--no-stream` an obligatory option at CLI runtime (e.g. `llm -m my-custom-gpt-4o --no-stream "Hello"`) with quick PoC API gateways that, for example, implement key sharing or team-level usage limits for the org.
I've tried writing a streaming API proxy myself, and it turns out streaming is not that trivial to implement. Non-streaming proxies are easy, so I suspect many teams in a hurry start with one of those. There are some streaming-capable proxy projects on GitHub, but I think it would be logical if `stream: false` or `can_stream: false` in `extra-openai-models.yaml` were passed downstream to `execute()`.
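Conceptually the change looks small. Here is a minimal sketch of what I mean; the names (`extra_model_definitions`, `Chat`, `register`) are guesses for illustration, not the actual code in `openai_models.py`:

```python
# Hypothetical sketch, not llm's real internals: honour an optional
# can_stream key from each extra-openai-models.yaml entry.
for extra_model in extra_model_definitions:  # one dict per .yaml entry
    model = Chat(
        model_id=extra_model["model_id"],
        model_name=extra_model["model_name"],
        api_base=extra_model.get("api_base"),
    )
    # Proposed: let the .yaml disable streaming, so the CLI never
    # calls execute() with stream=True for this model.
    model.can_stream = extra_model.get("can_stream", True)
    register(model)
```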
Or is there a better, already-supported way to do this, such as per-model default options saved by the user, a bit like aliases are? I do admit that "how to call a model" and "what its default options are" aren't quite the same level of question to define in the .yaml, design-wise.
While waiting for this configuration option to be implemented, I've been using this wrapper script as a workaround:
```python
#!/usr/bin/env python3
"""A wrapper script for the ``llm`` command line interface.

The script makes sure that streaming is disabled for the custom model,
then runs the ``llm`` command line interface as usual.

You can copy and rename this script, or even alias it as ``llm`` in your
shell. Make sure your custom model is defined in
``extra-openai-models.yaml``. You can look up the correct location for
the file with::

    llm logs path
"""
import llm.cli

MY_MODEL_ID = "my-custom-gpt-4o"


def main():
    """Run the ``llm`` command line interface, enforcing no streaming."""
    models = llm.get_model_aliases()
    models[MY_MODEL_ID].can_stream = False
    llm.cli.get_model_aliases = lambda: models
    llm.cli.cli()


if __name__ == "__main__":
    main()
```
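As far as I can tell this works because cli.py resolves model names through `get_model_aliases()` and only streams when the resolved model's `can_stream` attribute is true. Forcing the attribute to `False` therefore makes the CLI behave as if `--no-stream` were always passed for that one model, while every other model keeps streaming as usual.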