Support traditional models

Open kerthcet opened this issue 1 year ago • 1 comments

What would you like to be added:

Right now, llmaz is mostly designed for large language models. However, some users may also need to serve traditional models, so that a single solution covers both. Let's wait for more feedback.

References:

  • KServe: https://kserve.github.io/website/latest/modelserving/v1beta1/serving_runtime/
  • Seldon: https://docs.seldon.io/projects/seldon-core/en/latest/nav/config/servers.html

The solution is quite similar: we either implement a server runtime for each kind of model, just as we do with vLLM, or reuse official ones such as TorchServe.
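
To make the idea concrete, here is a minimal sketch of a per-model-kind runtime registry. It is not llmaz's actual API: `RuntimeSpec`, `RUNTIMES`, `resolve_command`, the model-kind keys, and the `{model_path}` placeholder are all hypothetical names chosen for illustration. Only the vLLM and TorchServe command lines are taken from those projects.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RuntimeSpec:
    """Hypothetical description of one serving runtime (not llmaz's real API)."""
    name: str
    command: Tuple[str, ...]
    args: Tuple[str, ...]  # "{model_path}" is substituted when resolving


# Hypothetical registry keyed by model kind (assumed names, for illustration).
RUNTIMES: Dict[str, RuntimeSpec] = {
    # Existing LLM path: vLLM's OpenAI-compatible server.
    "llm": RuntimeSpec(
        name="vllm",
        command=("python3", "-m", "vllm.entrypoints.openai.api_server"),
        args=("--model", "{model_path}"),
    ),
    # Traditional model path: reuse an official server such as TorchServe.
    "pytorch": RuntimeSpec(
        name="torchserve",
        command=("torchserve",),
        args=("--start", "--model-store", "{model_path}"),
    ),
}


def resolve_command(kind: str, model_path: str) -> List[str]:
    """Return the full argv used to launch the runtime for a model kind."""
    try:
        spec = RUNTIMES[kind]
    except KeyError:
        raise ValueError(f"no runtime registered for model kind {kind!r}") from None
    # Plain string replacement avoids surprises with other braces in arguments.
    args = [arg.replace("{model_path}", model_path) for arg in spec.args]
    return list(spec.command) + args


if __name__ == "__main__":
    print(resolve_command("pytorch", "/models"))
```

With a registry like this, supporting another kind of traditional model would mostly mean adding one more entry instead of changing the resolution logic.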

Why is this needed:

Completion requirements:

This enhancement requires the following artifacts:

  • [x] Design doc
  • [x] API change
  • [x] Docs update

The artifacts should be linked in subsequent comments.

kerthcet, Sep 09 '24 02:09

/kind feature

kerthcet, Sep 09 '24 02:09