
Results 10 comments of tricky

> An update, since this issue has been open for a long time. My model learned to attend, kinda. It still has issues during inference and I've been playing around...

> Which GPU are you using? By the way, there are runtime warnings below; is the error related to those warnings? I cannot find this file in my system....

> The recently released quantized training of LightSeq layers only supports the A100 or GPUs with compute capability 8.0 (sm_80) or higher, because it uses real int8 GEMM instead of fake quantization. If you...
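
For reference, a quick way to check whether the current GPU meets that requirement (a minimal sketch using PyTorch, not LightSeq's own check):

```python
import torch

# Compute capability as (major, minor), e.g. (7, 0) for V100, (8, 0) for A100.
major, minor = torch.cuda.get_device_capability()

if major >= 8:
    print(f"sm_{major}{minor}: real int8 GEMM quant training should be supported")
else:
    print(f"sm_{major}{minor}: below sm_80, expect the int8 GEMM path to be unavailable")
```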

No, I copied several directories and built them separately.

Or how do different model repositories do parallel computing on one GPU?

> No, the Python backend should be the same. Does the 24.05-py3 container include the ONNX, TensorRT, and TorchScript backends?

> @tricky61 The `nvcr.io/nvidia/tritonserver:24.05-py3` container contains the ONNX, TRT and PyTorch backends. The `nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3` container only has the TRTLLM and Python backends.

OK. I am using `nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3`. I also use tritonserver:23.11 and add...
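
If in doubt, one way to confirm which backends a given image actually ships is to list the backend directory inside the container (a sketch assuming the standard `/opt/tritonserver/backends` layout):

```python
import os

# Run inside the Triton container; each backend is a subdirectory here.
backends_dir = "/opt/tritonserver/backends"
print(sorted(os.listdir(backends_dir)))
# 24.05-py3 is expected to show e.g. onnxruntime, tensorrt, pytorch, python, ...
```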

> You can upload the `trtexec --verbose` build log.

Thanks for your reply. Since it is difficult to send files out from my company's computer, I will try to upload...

> As suggested by [Jokeren](https://github.com/Jokeren), storing the temporary values to global memory and then reloading them works on V100 with the latest Triton version.

Which version will work?...
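
For anyone hitting the same problem, the workaround pattern looks roughly like this (a hypothetical Triton kernel, not the original code from the issue): instead of keeping the temporary value live in registers, it is stored to a global scratch buffer and immediately reloaded.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def spill_kernel(x_ptr, scratch_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements

    x = tl.load(x_ptr + offsets, mask=mask)
    tmp = x * 2.0  # the temporary value that would otherwise stay in registers

    # Workaround: round-trip the temporary through global memory, then reuse the reloaded copy.
    tl.store(scratch_ptr + offsets, tmp, mask=mask)
    tmp = tl.load(scratch_ptr + offsets, mask=mask)

    tl.store(out_ptr + offsets, tmp + 1.0, mask=mask)

x = torch.randn(4096, device="cuda")
scratch = torch.empty_like(x)
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
spill_kernel[grid](x, scratch, out, x.numel(), BLOCK=1024)
```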