feat(controller): support serverless serving with KEDA via the Kubernetes scale subresource

Open X1aoZEOuO opened this issue 4 months ago • 10 comments

What this PR does / why we need it

Detailed Explanation of Commit

This commit introduces a guide for configuring serverless environments on Kubernetes, focusing on integrating Prometheus for monitoring and KEDA for autoscaling. The guide aims to optimize resource efficiency through event-driven scaling while maintaining observability for AI/ML workloads.

  • Prometheus Integration: Configured with namespaceSelector for cross-namespace monitoring
  • KEDA Autoscaling: Custom metric scaling with Prometheus triggers
  • Scale-to-Zero: Activator pattern with request buffering and CloudEvents
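As a rough illustration of the scaling behaviour the bullets above describe, here is a minimal Python sketch of how a Prometheus-driven replica count might be derived. It is a simplified model, not code from this PR: the function name `desired_replicas` and its parameters are hypothetical, and it assumes the common "metric value divided by per-replica threshold, rounded up" rule with a separate activation threshold for leaving zero. Cooldown periods, stabilization windows, and request buffering during cold start are deliberately ignored.

```python
import math


def desired_replicas(metric_value, threshold, activation_threshold=0.0,
                     min_replicas=0, max_replicas=10):
    """Simplified model of event-driven scaling with scale-to-zero.

    All names here are hypothetical and only illustrate the idea:
    - metric_value: current value of the Prometheus query (e.g. requests in flight)
    - threshold: target metric value per replica
    - activation_threshold: at or below this value the workload counts as idle
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    # Idle workload: fall back to the configured minimum (0 means scale to zero).
    if metric_value <= activation_threshold:
        return min_replicas

    # Active workload: enough replicas so each handles at most `threshold`.
    wanted = math.ceil(metric_value / threshold)

    # An active workload needs at least one replica, bounded by min/max.
    return max(1, min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    for value in (0, 3, 25, 1000):
        print(value, "->", desired_replicas(value, threshold=10))
```

The point of keeping activation separate from the per-replica threshold is that a workload can stay at zero replicas until traffic actually arrives, then scale proportionally once it is active.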

Which issue(s) this PR fixes

Fixes #

Special notes for your reviewer

Does this PR introduce a user-facing change?


cc @pacoxu @kerthcet

X1aoZEOuO avatar Sep 28 '25 11:09 X1aoZEOuO

/kind feature

X1aoZEOuO avatar Sep 28 '25 12:09 X1aoZEOuO

@pacoxu @googs1025 @carlory @kerthcet Hello all! Could you spare a few minutes to review my PRs when you have a chance?

Other ref PRs:

  • https://github.com/InftyAI/llmaz/pull/499
  • https://github.com/InftyAI/llmaz/pull/498

X1aoZEOuO avatar Sep 29 '25 13:09 X1aoZEOuO

/assign I will take a look this week or early next week.

pacoxu avatar Oct 09 '25 05:10 pacoxu

@pacoxu @kenwoodjw Friendly ping, do you have some time to take a look at my PRs? Thanks a lot for your assistance!

X1aoZEOuO avatar Oct 15 '25 10:10 X1aoZEOuO

/assign

kerthcet avatar Oct 27 '25 09:10 kerthcet

It seems some docs duplicate those in https://github.com/InftyAI/llmaz/pull/499/files. Can we keep one copy here and refer to it from the other PR?

@kerthcet Thank you for catching this! I've refactored the documentation structure to eliminate the duplication. The docs in this PR are now more focused and link to the main serverless documentation (PR #499).

X1aoZEOuO avatar Oct 29 '25 17:10 X1aoZEOuO

The test is always failing ...

kerthcet avatar Oct 30 '25 18:10 kerthcet

/retest

pacoxu avatar Oct 31 '25 03:10 pacoxu

@pacoxu I have resolved the conflict. :)

X1aoZEOuO avatar Oct 31 '25 04:10 X1aoZEOuO

/retest

X1aoZEOuO avatar Oct 31 '25 04:10 X1aoZEOuO