fix: segfault in cudaDriverWrapper

Open hypdeb opened this issue 1 year ago • 9 comments

The symbol cuGetErrorMessage does not exist in the CUDA driver API, so the dynamic lookup for it cannot succeed, and trying to call through the result causes a segfault. As a result, kernel launch failures are hidden behind segfaults instead of being reported as CUDA errors.
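
To illustrate the failure mode described above, here is a minimal sketch of a guarded symbol lookup. It is not the code from this PR: the names SymbolResolver, loadRequiredSymbol, stubGetErrorString and fakeResolve are hypothetical, and the resolver stands in for a dlsym-style lookup so the example runs without a CUDA driver installed. The idea is to check the lookup result once at load time and fail with a readable error, rather than calling through a null function pointer later.

```cpp
#include <cassert>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical resolver type. In a real wrapper this would forward to
// dlsym(handle, name), which returns nullptr when the symbol is missing.
using SymbolResolver = std::function<void*(char const*)>;

// Resolve `name` and fail loudly instead of handing back a null function pointer.
template <typename Fn>
Fn loadRequiredSymbol(SymbolResolver const& resolve, char const* name)
{
    assert(name != nullptr);
    void* sym = resolve(name);
    if (sym == nullptr)
    {
        throw std::runtime_error(std::string("driver symbol not found: ") + name);
    }
    // Converting void* to a function pointer is only conditionally supported
    // in C++, but it is the conventional way to use dlsym results on POSIX.
    return reinterpret_cast<Fn>(sym);
}

// Stand-in for a driver entry point, used only by the fake resolver below.
inline int stubGetErrorString(int /*error*/, char const** str)
{
    *str = "stub error";
    return 0;
}

// Fake resolver for illustration: knows one symbol, returns nullptr otherwise.
inline void* fakeResolve(char const* name)
{
    if (std::strcmp(name, "cuGetErrorString") == 0)
    {
        return reinterpret_cast<void*>(&stubGetErrorString);
    }
    return nullptr;
}
```

With this shape, asking for a name that exists (cuGetErrorString in the fake resolver) returns a callable pointer, while asking for cuGetErrorMessage throws std::runtime_error at load time, so the problem shows up as a clear message rather than a crash on the error-reporting path.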

hypdeb avatar Mar 24 '25 09:03 hypdeb

/bot run

hypdeb avatar Mar 24 '25 09:03 hypdeb

PR_Github #277 [ run ] triggered by Bot

niukuo avatar Mar 24 '25 09:03 niukuo

PR_Github #277 [ run ] completed with state FAILURE /LLM/main/L0_MergeRequest_PR pipeline #267 completed with status: 'FAILURE'

niukuo avatar Mar 24 '25 10:03 niukuo

/bot run

hypdeb avatar Mar 25 '25 09:03 hypdeb

PR_Github #408 [ ] completed with state FAILURE

tensorrt-cicd avatar Mar 25 '25 09:03 tensorrt-cicd

PR_Github #411 [ ] completed with state FAILURE

tensorrt-cicd avatar Mar 25 '25 09:03 tensorrt-cicd

PR_Github #415 [ ] completed with state FAILURE

tensorrt-cicd avatar Mar 25 '25 09:03 tensorrt-cicd

PR_Github #419 [ run ] triggered by Bot

niukuo avatar Mar 25 '25 09:03 niukuo

PR_Github #419 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #361 completed with status: 'SUCCESS'

niukuo avatar Mar 25 '25 17:03 niukuo

/bot run

hypdeb avatar Apr 01 '25 11:04 hypdeb

PR_Github #898 [ run ] triggered by Bot

tensorrt-cicd avatar Apr 01 '25 11:04 tensorrt-cicd

PR_Github #898 [ run ] completed with state SUCCESS /LLM/main/L0_MergeRequest_PR pipeline #706 completed with status: 'SUCCESS'

tensorrt-cicd avatar Apr 01 '25 18:04 tensorrt-cicd

/bot reuse-pipeline

hypdeb avatar Apr 02 '25 06:04 hypdeb

PR_Github #976 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd avatar Apr 02 '25 06:04 tensorrt-cicd

PR_Github #976 [ reuse-pipeline ] completed with state SUCCESS Reusing PR_Github #898 for commit b9b847a

tensorrt-cicd avatar Apr 02 '25 06:04 tensorrt-cicd