omer-dayan
I agree with @klueska that DRA is the right way. However, @elgalu, I do not agree that it's not feasible under the current circumstances. You're welcome to watch the following...
My use case is KPA. I want to be able to scale a StatefulSet, which is not a Knative Service, using KPA.
Hey @gilljon! From the temporary directory, I understand you store the model files in S3, right? We pull all the metadata files from S3 to a tmp...
Hey, at RunAI we published an open-source tool, RunAI Model Streamer (https://github.com/run-ai/runai-model-streamer), that streams model weights from an object store like S3 to GPU memory. The...