Model Management
Can I specify a particular model version to load or unload when using triton-inference-server for model management?
I only found the following two APIs:
- Load model: `POST v2/repository/models/{model-name}/load`
- Unload model: `POST v2/repository/models/{model-name}/unload`
@N-Kingsley, I'm not a maintainer, but the way to address this, at least for the `/load` endpoint, is to list the model versions you need in the version policy of your `config.pbtxt`: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html#version-policy. I'm not sure how the unload case should be handled.
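For reference, a minimal sketch of what that version policy looks like in `config.pbtxt` (the version numbers here are illustrative, not from the original thread):

```
# config.pbtxt fragment -- serve only versions 1 and 3 of this model.
# Version subdirectories 1/ and 3/ must exist in the model repository.
version_policy: { specific: { versions: [1, 3] } }
```

With this in place, a request to `v2/repository/models/{model-name}/load` makes only the listed versions available; other policies (`all`, `latest`) are also supported per the documentation linked above.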