
I got the same problem on 21.11 and 21.12: it works with a single model or a couple of models, but Triton never releases them.

wangzz313 opened this issue 2 years ago · 4 comments


Ensemble model: Python backend (CPU) + ONNX model (GPU)

Python model config:

```
instance_group [ { kind: KIND_CPU } ]
model_warmup [ {} ]
response_cache { enable: true }
```

ONNX model config:

```
instance_group [ { kind: KIND_GPU } ]
model_warmup [ {} ]
response_cache { enable: true }
```
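For reference, one way to check whether Triton releases a model is to unload it explicitly through the model-repository HTTP API (available when the server is started with `--model-control-mode=explicit`) and then watch GPU memory, e.g. with `nvidia-smi`. Below is a minimal stdlib-only sketch; the endpoint paths follow Triton's repository extension (`POST /v2/repository/models/{name}/unload`), while the base URL `localhost:8000` and the model names `python_model` / `onnx_model` are placeholder assumptions, not names from this issue:

```python
import urllib.error
import urllib.request

# Assumed server address; Triton's HTTP endpoint defaults to port 8000.
TRITON_URL = "http://localhost:8000"


def repository_action_url(base: str, model_name: str, action: str) -> str:
    """Build the Triton model-repository endpoint for 'load' or 'unload'."""
    return f"{base}/v2/repository/models/{model_name}/{action}"


def unload_model(base: str, model_name: str) -> int:
    """POST to the unload endpoint; returns the HTTP status code.

    After a successful unload, GPU memory held by the model's instances
    should drop (the behavior this issue reports as broken).
    """
    req = urllib.request.Request(
        repository_action_url(base, model_name, "unload"),
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder model names for the ensemble's two steps.
    for name in ("python_model", "onnx_model"):
        try:
            print(name, "unload status:", unload_model(TRITON_URL, name))
        except urllib.error.URLError as exc:
            # No server running; the sketch degrades gracefully.
            print(name, "server not reachable:", exc)
```

Comparing `nvidia-smi` output before and after the unload calls shows whether the memory is actually returned.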

Originally posted by @alicimertcan in https://github.com/triton-inference-server/server/issues/3761#issuecomment-1018443038

wangzz313 · Mar 18 '24

cc @GuanLuo @rmccorm4 @jbkyang-nvi

lkomali · Mar 18 '24

Can you provide more information? Is this the latest version of Triton? If not, can you try with the latest version, 24.02?

indrajit96 · Mar 18 '24

Hi @wangzz313, as @indrajit96 suggested, have you tried a newer version of Triton? 21.11 and 21.12 are quite old. Unfortunately, the 24.02 release does not come with the onnxruntime backend, so please try 24.01.

oandreeva-nv · Mar 20 '24

@oandreeva-nv we're facing the same issue with 24.01 and 24.08 (cc: @susnato)

rishabhmehrotra · Sep 11 '24