feature: identify model file by SHA
If the API supported model aliases, it would be easier to plug into existing web UIs without having to support each one of them directly. It seems all the UIs out there filter by known models, even if the API returns the list of supported models.
Maybe we could pass the model via the token? However, that won't spare us from UIs filtering what /models actually returns.
A config file could be:
- model: my-alpaca
  # file: alpaca.bin
  checksum:
    - sha256: ....
    - sha1: ....
  top_p: ..
  temperature: ...
  url:
    - url-to-fetch-model
  stop_words:
    - <SYSTEM>
  alias:
    - gpt3
  template:
    completion: file.tmpl
    chat: file.tmpl
- model: my-cerebras
  # file: cerebras.bin
  checksum:
    - sha256: ....
    - sha1: ....
  top_p: ..
  temperature: ...
  stop_words:
    - <SYSTEM>
  alias:
    - gpt4
  template:
    completion: file.tmpl
    chat: file.tmpl
Another example:
models:
  - name: ggml-gpt4all-j
    checksum:
      - sha256: ...
    path: /models/ggml-gpt4all-j.bin
    backend: gpt4all-j
    defaults:
      top_p: ...
      stop_words: ...
    prompts:
      - completion: ...
      - chat: ...
    aliases:
      - gpt-3.5-turbo
      - gpt4
  - name: cerebras-2.7b-q4_0
    checksum:
      - sha256: ...
    path: /models/cerebras-2.7b-q4_0.bin
    backend: gpt2
    defaults:
      top_p: ...
      stop_words: ...
    prompts:
      - completion: ...
      - chat: ...
    aliases:
      - gpt2
You could even point to the same model sha/path, give it a different name, and use different prompts. This allows maximum flexibility for each user's needs.
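For illustration, here is a minimal sketch of how such a config could be parsed in Go, assuming gopkg.in/yaml.v3; the type and field names are made up for this example and are not the actual LocalAI schema:

package config

import (
    "os"

    "gopkg.in/yaml.v3"
)

// ModelConfig mirrors one entry of the "models:" list sketched above.
type ModelConfig struct {
    Name     string              `yaml:"name"`
    Checksum []map[string]string `yaml:"checksum"` // e.g. [{sha256: ...}, {sha1: ...}]
    Path     string              `yaml:"path"`
    Backend  string              `yaml:"backend"`
    Defaults Defaults            `yaml:"defaults"`
    Prompts  []map[string]string `yaml:"prompts"`
    Aliases  []string            `yaml:"aliases"`
}

type Defaults struct {
    TopP      float64  `yaml:"top_p"`
    StopWords []string `yaml:"stop_words"`
}

type Config struct {
    Models []ModelConfig `yaml:"models"`
}

// Load reads the YAML file and decodes it into the structures above.
func Load(path string) (*Config, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    var c Config
    if err := yaml.Unmarshal(data, &c); err != nil {
        return nil, err
    }
    return &c, nil
}

Note that two entries can simply reuse the same path and checksum while differing in name, prompts, and aliases, which is what gives the flexibility mentioned above.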
As discussed on Discord, my proposal is below. It takes into consideration:
- Separating models from configurations
  - use one configuration for many models
  - use one model for multiple things (per API path or model name)
  - use different models for different things (same as above)
- The fact that we emulate the OpenAI API and have limited info at the server endpoint
  - only "model" and the API path are known
- Reducing the boilerplate of repeating prompts etc. for each model file
Folder structure:
│
├─ models
│  └─ mymodel.bin
│
└─ configs
   ├─ generic_completion.yaml
   └─ stableml.yaml
Content of one config yaml:
name: generic_completion
shas:
  - 81237981273
  - 1283128937
paths:
  - v1/chat
  - v1/completion
model_names:
  - gpt-3.5-turbo
  - gpt-4
# (backend: gpt4all-j)?
top_p: ...
stop_words:
  - <|USER|>
  - <|SYSTEM|>
completion_prompt: Complete the following text.
chat_prompt: This is the discussion of an AI and a Human...
chat_system: <|SYSTEM|>
chat_user: <|USER|>
chat_assistant: <|ASSISTANT|>
How it works

When a request comes in (a rough sketch follows the list):
- Model name and API path are known (from the request path + JSON body)
- Configs are scanned for that combination of path and name
- If a combination is found, its list of SHAs is checked
- We look for a fitting model in memory or on disk
- If none is found -> go back two steps
- We apply the config and run the inference
- Of course, all of this should be buffered (cached)
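A minimal sketch of that lookup in Go, just to make the flow concrete; every type and function name here is hypothetical, and the SHAs of the files in the models folder are assumed to have been computed up front:

package server

// RouteConfig mirrors one config yaml from above (shas, paths, model_names).
type RouteConfig struct {
    Name       string
    SHAs       []string
    Paths      []string
    ModelNames []string
}

// findConfig scans the configs for one matching the request's API path and model name.
func findConfig(configs []RouteConfig, apiPath, modelName string) (*RouteConfig, bool) {
    for i := range configs {
        c := &configs[i]
        if contains(c.Paths, apiPath) && contains(c.ModelNames, modelName) {
            return c, true
        }
    }
    return nil, false
}

// resolveModel checks the config's SHA list first against models already loaded
// in memory, then against files found on disk (both indexed by SHA).
// If nothing matches, the caller goes back and tries the next matching config.
func resolveModel(c *RouteConfig, inMemory map[string]bool, onDisk map[string]string) (string, bool) {
    for _, sha := range c.SHAs {
        if inMemory[sha] {
            return sha, true
        }
        if _, found := onDisk[sha]; found {
            return sha, true
        }
    }
    return "", false
}

func contains(list []string, v string) bool {
    for _, s := range list {
        if s == v {
            return true
        }
    }
    return false
}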
User Side
- There could be default config files with known-to-work shas.
- These would auto-configure any known bin files in the models folder
- If a user-created config.yaml file overrides a path/name/sha combination, it is used instead.
- The default configs could be updated from the interwebs every now and then
The user can, for example:
- Copy a default config.yaml and adapt it for their needs
- Change the prompt or use a different model (add a sha)
- Add a different model name so the configuration can be used via the OpenAI API request
Edit: Alternatively, model_name and the config.yaml name could be conflated to make it a bit simpler but less flexible (e.g. rename generic_completion to gpt-4).
Edit2: Actually, the backend entry doesn't really fit in there, as one might be able to use the config with a different backend.
I suggest that the backend should instead be determined when the SHAs of the existing files are scanned.
Edit3: As the user might find it difficult to get a SHA for their model file, we might add the option for the sha entry to be a path instead?
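For the default case, the server can also compute the SHA itself while scanning the models folder; a minimal sketch (the fileSHA256 name is made up):

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
)

// fileSHA256 streams the file through SHA-256 and returns the hex digest.
func fileSHA256(path string) (string, error) {
    f, err := os.Open(path)
    if err != nil {
        return "", err
    }
    defer f.Close()

    h := sha256.New()
    if _, err := io.Copy(h, f); err != nil {
        return "", err
    }
    return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
    sum, err := fileSHA256("models/mymodel.bin")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    fmt.Println(sum)
}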
Pivoted this to a specific action item since we now have a "base" configuration mechanism, and a wide refactor was needed before jumping into it. Now we can just build up from the current implementation. What I did is simply expose the parameters available from the OpenAI endpoint as parameters in the model block. I've kept the model block very simple for now - just specifying the name and the model file it will be associated with, but no heavy business logic.
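To illustrate the idea (this is not the actual LocalAI schema), request-level parameters can simply override the defaults attached to the model block; a rough sketch with hypothetical names:

package api

// OpenAIRequest holds the parameters a client may send on the OpenAI-compatible endpoint.
// Pointer fields distinguish "not set" from an explicit zero value.
type OpenAIRequest struct {
    Model       string
    TopP        *float64
    Temperature *float64
}

// ModelDefaults holds the defaults declared in the model block.
type ModelDefaults struct {
    TopP        float64
    Temperature float64
}

// merge applies request overrides on top of the model block's defaults.
func merge(defaults ModelDefaults, req OpenAIRequest) ModelDefaults {
    out := defaults
    if req.TopP != nil {
        out.TopP = *req.TopP
    }
    if req.Temperature != nil {
        out.Temperature = *req.Temperature
    }
    return out
}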
:warning::warning::warning::warning::warning:
Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!
but... I can also be funny or helpful :smile_cat: and, generally speaking, I can provide good tips or point to places to look in the documentation or in the code based on what you wrote in the issue.
Don't engage in conversation with me, I don't support replying (yet)!
:warning::warning::warning::warning::warning:
To add model aliases, you can create a file in the /models directory called "aliases.json" with the following content:

{
  "bert": {
    "alias": "whisper",
    "description": "A model for audio transcription"
  }
}

This will allow users to install the "whisper" model by passing its alias instead of the original name. The UI can then use this file to display the correct model name and description in the installation prompt.
Sources:
- https://localai.io/advanced/index.html
- https://github.com/go-skynet/LocalAI/tree/master/main.go
- https://localai.io/models/index.html
This lost interest and no one is actually looking at it. Closing.