
Is there any early proposal or document about integrating with Gateway API?

Open caozhuozi opened this issue 1 year ago • 2 comments

I came across the roadmap and am particularly interested in the Gateway API section. Will llmaz support advanced traffic management features, such as shadow and canary deployments between different model services? If so, could you share how you plan to implement this?

Thanks in advance!

caozhuozi avatar Sep 15 '24 15:09 caozhuozi

shadow and canary deployments between different model services

Thanks for your interest. The TL;DR is: yes, but there's no concrete idea yet.

I think it's a vital feature for production. "Gateway API" here covers a bunch of things, such as token-, LoRA-, and model-related services; canary deployments can also be part of that (maybe later we'll sort them out more clearly). What llmaz usually does is ship a minimal implementation for out-of-the-box support while also providing integrations with other projects: people usually have lots of projects in their cluster already, and we don't want to increase their maintenance burden. For canary deployments, candidates may include Argo Workflows and Istio, so I think they're all in the plan.
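For readers unfamiliar with the two terms in the question, here is a minimal Python sketch of what they mean at the routing level. It is not llmaz's design (which, as noted above, is undecided): the `Backend` class, the `route` function, and the model names are all hypothetical. A canary split sends a weighted share of live requests to the new model service, while a shadow deployment receives a mirrored copy of each request and its response is discarded. In Gateway API terms this corresponds roughly to weighted `backendRefs` and the `RequestMirror` filter on an `HTTPRoute`.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Backend:
    """A hypothetical model service; name and weight are illustrative only."""
    name: str
    weight: int  # relative share of live traffic
    received: list = field(default_factory=list)

    def handle(self, prompt: str) -> str:
        # Record the request so the traffic split can be inspected afterwards.
        self.received.append(prompt)
        return f"{self.name}: {prompt}"


def route(prompt, backends, shadow=None, rng=random):
    """Pick one backend by weight (canary split); optionally mirror to a shadow."""
    chosen = rng.choices(backends, weights=[b.weight for b in backends], k=1)[0]
    if shadow is not None:
        shadow.handle(prompt)  # mirrored copy; its response is discarded
    return chosen.handle(prompt)


# Assumed setup: 90% of traffic to the stable model, 10% to the canary,
# and every request mirrored to a shadow model that never answers users.
stable = Backend("model-v1", weight=90)
canary = Backend("model-v2", weight=10)
shadow = Backend("model-v3-shadow", weight=0)

rng = random.Random(0)  # seeded so the run is repeatable
for i in range(1000):
    route(f"prompt-{i}", [stable, canary], shadow=shadow, rng=rng)

print(len(stable.received), len(canary.received), len(shadow.received))
```

The shadow backend sees every request regardless of which live backend was chosen, which is what distinguishes mirroring from a weighted split.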

As for the minimal implementation, I haven't thought much about it yet, and we have a bunch of higher-priority tasks on hand.

/kind feature

kerthcet avatar Sep 18 '24 03:09 kerthcet

Hi @kerthcet! Thanks a lot for your great patience and detailed reply! ❤

caozhuozi avatar Sep 18 '24 15:09 caozhuozi

Closing in favor of https://github.com/InftyAI/llmaz/issues/339. /close

kerthcet avatar Apr 21 '25 02:04 kerthcet