llmaz
Liveness & Readiness support
Add liveness and readiness probe support for inference services.
/kind feature
/milestone v0.1.0
Should we also support a StartupProbe? See https://github.com/triton-inference-server/server/pull/5257/.
Yes, something like that. The core reason is that we need to know the server's condition: is it ready or not? Maybe this can be part of the backendRuntime, since the probes depend on the backend itself.
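To make the idea concrete, here is a rough sketch of what per-backend probes could look like. It is not llmaz's actual API: `ProbeSpec`, `BackendProbes`, `TritonProbes`, `Healthy`, and `Check` are all hypothetical names, and the Triton paths and port (`/v2/health/live`, `/v2/health/ready`, 8000) are assumptions that should be verified against the backend's documentation. The only rule taken as given is the Kubernetes HTTP probe convention that a status code of at least 200 and below 400 counts as success.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// ProbeSpec is a hypothetical, simplified stand-in for a Kubernetes HTTP probe.
type ProbeSpec struct {
	Path             string
	Port             int
	Period           time.Duration
	FailureThreshold int
}

// BackendProbes groups the three probes a backend runtime could declare.
type BackendProbes struct {
	Startup   ProbeSpec
	Liveness  ProbeSpec
	Readiness ProbeSpec
}

// TritonProbes returns illustrative defaults for a Triton-style backend.
// Paths, port, and thresholds are assumptions, not values taken from llmaz.
func TritonProbes() BackendProbes {
	return BackendProbes{
		// A generous startup budget (30 x 10s) leaves time for model loading.
		Startup:   ProbeSpec{Path: "/v2/health/ready", Port: 8000, Period: 10 * time.Second, FailureThreshold: 30},
		Liveness:  ProbeSpec{Path: "/v2/health/live", Port: 8000, Period: 10 * time.Second, FailureThreshold: 3},
		Readiness: ProbeSpec{Path: "/v2/health/ready", Port: 8000, Period: 5 * time.Second, FailureThreshold: 3},
	}
}

// Healthy applies the Kubernetes HTTP probe rule: 200 <= status < 400 is success.
func Healthy(status int) bool {
	return status >= 200 && status < 400
}

// Check performs a single HTTP GET against the probe endpoint on host.
func Check(client *http.Client, host string, p ProbeSpec) (bool, error) {
	url := fmt.Sprintf("http://%s:%d%s", host, p.Port, p.Path)
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return Healthy(resp.StatusCode), nil
}

func main() {
	p := TritonProbes()
	fmt.Printf("startup:   GET :%d%s every %s, up to %d failures\n",
		p.Startup.Port, p.Startup.Path, p.Startup.Period, p.Startup.FailureThreshold)
	fmt.Printf("liveness:  GET :%d%s every %s\n", p.Liveness.Port, p.Liveness.Path, p.Liveness.Period)
	fmt.Printf("readiness: GET :%d%s every %s\n", p.Readiness.Port, p.Readiness.Path, p.Readiness.Period)
}
```

Keeping the probe definitions next to the backend means each runtime can point at its own health endpoints, while the controller only has to copy them into the Pod spec.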