
Model Management

Open · N-Kingsley opened this issue 1 year ago • 1 comment

Can I specify a specific version to load or upload when using triton-inference-server for model management?

I only found the following two APIs:
Load model: v2/repository/models/{model-name}/load
Upload model: v2/repository/models/{model-name}/upload

N-Kingsley, May 16 '24 06:05

@N-Kingsley, I'm not a maintainer, but at least for the /load endpoint you can address this by listing the model versions you need in the version policy of your config.pbtxt: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html#version-policy. For the unload use case, I'm not sure how this should be handled.

juanma9613, May 16 '24 19:05
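
To make the version-policy suggestion concrete, here is a minimal sketch, not an official recipe. It renders a `specific` version policy as a config.pbtxt fragment and builds a /load request body that passes the same policy through the load API's `config` parameter. The helper names, the model name `my_model`, and the `http://localhost:8000` address are hypothetical, and the sketch assumes the server runs with explicit model control. The `config` override may also need to carry the rest of the model configuration, not only the version policy, so check this against your Triton version.

```python
import json


def version_policy_pbtxt(versions):
    """Render a 'specific' version policy as a config.pbtxt fragment."""
    joined = ", ".join(str(v) for v in versions)
    return f"version_policy: {{ specific: {{ versions: [{joined}] }} }}"


def build_load_request(model_name, versions, base_url="http://localhost:8000"):
    """Build the URL and JSON body for a /load call (hypothetical helper).

    Assumption: the 'config' load parameter takes the model configuration
    as a JSON string, and a partial config like this one is accepted.
    """
    url = f"{base_url}/v2/repository/models/{model_name}/load"
    config = {"version_policy": {"specific": {"versions": list(versions)}}}
    body = {"parameters": {"config": json.dumps(config)}}
    return url, body


if __name__ == "__main__":
    # Fragment to paste into config.pbtxt for versions 1 and 3.
    print(version_policy_pbtxt([1, 3]))

    # Request to send as an HTTP POST with a JSON body, e.g. via
    # urllib.request or any other HTTP client.
    url, body = build_load_request("my_model", [1, 3])
    print(url)
    print(json.dumps(body))
```

Editing config.pbtxt is the route described in the reply above; the request-body variant is only an alternative for cases where the repository files cannot be changed.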