AnnaYue
+1, it would be better if CUDA 11.8+ were supported.
I have hit the same error; it happens when I set `batch_size` > 8.
Have you tried running nvidia-smi in the container to make sure nvidia-container-runtime is working as expected? `docker run xxx --entrypoint nvidia-smi nvcr.io/nvidia/tritonserver:24.04-trtllm-python-py3`
The nvcc in the container is the prebuilt binary that ships with the image, so that's OK. It seems like another process is running and using the 1/2 GPU cards; maybe you could stop it and...
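To see which processes are holding the GPUs, something like the sketch below might help. It is only a rough example, not an official tool: the helper names `parse_compute_apps` and `list_gpu_processes` are made up for illustration, and it assumes `nvidia-smi` is on the PATH and accepts the `--query-compute-apps` fields `pid`, `process_name` and `used_memory`.

```python
import csv
import io
import subprocess

# Assumption: these query fields are available in the installed nvidia-smi.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse nvidia-smi CSV output (no header) into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            # Skip blank or unexpected lines instead of failing.
            continue
        pid, name, mem = (f.strip() for f in fields)
        # Memory is kept as a string because it may not always be numeric.
        rows.append({"pid": int(pid), "name": name, "used_memory_mib": mem})
    return rows


def list_gpu_processes():
    """Run nvidia-smi and return the compute processes it reports."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_compute_apps(out)

# Example (needs a machine with an NVIDIA driver):
#   for proc in list_gpu_processes():
#       print(proc["pid"], proc["name"], proc["used_memory_mib"])
```

If a PID shows up there that is not your own job, that is probably the process to stop before retrying. Note that when run inside a container, processes from the host or other containers may not be listed, so it can be worth running it on the host as well.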