May I ask: when I want to wrap an LLM with `BaseAPIModel`, where should I pass in the url_base?
I cannot find where to supply the URL of my LLM deployed in the cloud. I should build a wrapper just like you do in GPTAPI, shouldn't I?
So why is this class called BaseAPIModel?
class BaseAPIModel(BaseModel):
    """Base class for API model wrapper.

    Args:
        model_type (str): The type of model.
        query_per_second (int): The maximum queries allowed per second
            between two consecutive calls of the API. Defaults to 1.
        retry (int): Number of retries if the API call fails. Defaults to 2.
        meta_template (Dict, optional): The model's meta prompt
            template if needed, in case meta instructions have to be
            injected or wrapped around the input.
    """
    pass
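For illustration, this is roughly what I had in mind; `MyCloudLLM`, the `url_base` argument, and the request format are just my own sketch, not part of lagent:

```python
import requests

from lagent.llms.base_api import BaseAPIModel  # import path may vary by lagent version


class MyCloudLLM(BaseAPIModel):
    """Hypothetical wrapper for an OpenAI-style LLM served in the cloud."""

    def __init__(self, url_base: str, **kwargs):
        super().__init__(model_type='my-cloud-llm', **kwargs)
        # This is where I would like to pass the base URL of my deployed model,
        # similar to how GPTAPI points at the OpenAI endpoint.
        self.url_base = url_base

    def chat(self, messages, **gen_params):
        # Plain HTTP call to the chat completions route of my server.
        resp = requests.post(
            f'{self.url_base}/v1/chat/completions',
            json={'messages': messages, **gen_params},
            timeout=60,
        )
        return resp.json()
```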
Which model are you using? BaseAPIModel is mostly meant for closed-source LLMs that can only be accessed through an API. If you use LMDeploy to serve your own model, you can refer to LMDeployClient, used roughly as shown below.
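As a rough sketch (the exact import path and constructor arguments may differ between lagent versions; the URL and model name are placeholders):

```python
from lagent.llms import LMDeployClient  # assumes the client is exported here

# Point the client at an api_server started with `lmdeploy serve api_server ...`.
llm = LMDeployClient(model_name='internlm2-chat-7b',
                     url='http://localhost:23333')
```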
Thanks for your reply. But I want to point out that when we access a model through an API, the device in our hands may not have enough compute to run an LLM locally, e.g. no NVIDIA GPU. And LMDeployClient depends on lmdeploy, which cannot be installed on such devices. Clearly this class only uses a small part of lmdeploy; could you move that part into lagent? I would really appreciate a new class for OpenAI-style APIs.
Yeah, that's exactly what I'm facing. When I deploy a model with lmdeploy's api_server on a GPU machine, I can't access it from lagent without installing lmdeploy on my client machine; that's why I want to build a wrapper on top of BaseAPIModel.
Actually, even if your local device does not have a GPU, you can still install lmdeploy.
If you truly do not want to install lmdeploy on your local device, you can implement a class
like the following:
import json

import requests

from lagent.llms.base_llm import BaseModel  # import path may vary by lagent version


class RequestPostModel(BaseModel):
    """Access an OpenAI-style api_server using nothing but the requests library."""

    def __init__(self, api_url: str, headers: dict = None):
        # Depending on your lagent version, you may also want to forward
        # arguments to BaseModel.__init__ here.
        self.chat_completions_v1_url = api_url
        self.headers = headers or {'content-type': 'application/json'}

    def chat(self,
             model: str,
             messages,
             temperature: float = 0.7,
             stream: bool = False,
             **kwargs):
        # Pack every keyword argument of this call into the request payload.
        pload = {
            k: v
            for k, v in locals().copy().items()
            if k[:2] != '__' and k not in ['self', 'kwargs']
        }
        pload.update(kwargs)
        response = requests.post(self.chat_completions_v1_url,
                                 headers=self.headers,
                                 json=pload,
                                 stream=stream)
        for chunk in response.iter_lines(chunk_size=8192,
                                         decode_unicode=False,
                                         delimiter=b'\n'):
            if chunk:
                decoded = chunk.decode('utf-8')
                if stream:
                    # Server-sent events: skip [DONE] and strip the 'data: ' prefix.
                    if decoded == 'data: [DONE]':
                        continue
                    if decoded[:6] == 'data: ':
                        decoded = decoded[6:]
                    yield json.loads(decoded)
                else:
                    yield json.loads(decoded)
which is copied from lmdeploy. The class only uses the requests library to access your API model.
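As a rough usage sketch (the URL and model name below are placeholders for your own api_server):

```python
# Suppose an OpenAI-style api_server is already running on the GPU machine,
# e.g. started with `lmdeploy serve api_server <model> --server-port 23333`.
model = RequestPostModel('http://<server-ip>:23333/v1/chat/completions')
for output in model.chat(model='internlm2-chat-7b',
                         messages=[{'role': 'user', 'content': 'Hello'}],
                         stream=False):
    print(output)
```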
I just found that you opened a PR for this, thanks for your contribution.
Thanks for your advice. But lmdeploy relies on torch, which means a lot of unnecessary downloading and disk usage, especially on ARM devices 😂. That's it.