Object tracking error: cudaErrorLaunchFailure (719)

Open XuLei0 opened this issue 3 years ago • 3 comments

Search before asking

  • [X] I have searched the question and found no related answer.

Please ask your question

Prediction with every object tracking model fails with: OSError: (External) CUDA error(719), unspecified launch failure. [Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases can be found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.] (at ..\paddle\phi\backends\gpu\gpu_context.cc:435) [operator < multiclass_nms3 > error]

XuLei0 avatar Jul 28 '22 07:07 XuLei0

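Please first check that your paddle version is installed correctly, then try whether the prediction command for a detection model such as ppyoloe runs without problems.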
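
The installation check suggested here could be done along these lines. This is a minimal sketch, not an official procedure: `paddle_install_report` is a hypothetical helper name introduced for illustration, and it only reads the installed version, whether the build was compiled with CUDA, and the currently selected device.

```python
import importlib


def paddle_install_report():
    """Collect basic facts about the local Paddle install without raising.

    Hypothetical helper for illustration; returns a plain dict.
    """
    report = {"installed": False}
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        # Paddle is not importable in this environment.
        return report

    report["installed"] = True
    report["version"] = paddle.__version__
    # False here means a CPU-only wheel was installed.
    report["compiled_with_cuda"] = paddle.device.is_compiled_with_cuda()
    # For example "gpu:0" or "cpu", depending on the build and machine.
    report["device"] = paddle.device.get_device()
    return report


if __name__ == "__main__":
    for key, value in paddle_install_report().items():
        print(f"{key}: {value}")
```

If the report looks reasonable, calling `paddle.utils.run_check()` afterwards runs Paddle's own installation self-test on the available device.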

nemonameless avatar Jul 28 '22 09:07 nemonameless

ppyoloe, pedestrian detection, and vehicle detection all run fine. Only object tracking fails on GPU, with the error above, but it runs on CPU.

XuLei0 avatar Jul 28 '22 13:07 XuLei0

Command: CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file=E:/PaddleDetection/dataset/my_test_data/videos/MOT16_small.mp4 --save_videos

W0728 21:37:06.386767 18188 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.6
W0728 21:37:06.401937 18188 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
[07/28 21:37:08] ppdet.utils.checkpoint INFO: Finish resuming model weights: C:\Users\xulei/.cache/paddle/weights\jde_darknet53_30e_1088x608.pdparams
[07/28 21:37:08] ppdet.data.source.mot INFO: Length of the video: 160 frames.
[07/28 21:37:08] ppdet.engine.tracker INFO: Starting tracking video E:/PaddleDetection/dataset/my_test_data/videos/MOT16_small.mp4
  0%|          | 0/160 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "tools/infer_mot.py", line 150, in <module>
    main()
  File "tools/infer_mot.py", line 146, in main
    run(FLAGS, cfg)
  File "tools/infer_mot.py", line 100, in run
    tracker.mot_predict_seq(
  File "E:\PaddleDetection\ppdet\engine\tracker.py", line 544, in mot_predict_seq
    results, nf, ta, tc = self._eval_seq_jde(
  File "E:\PaddleDetection\ppdet\engine\tracker.py", line 147, in _eval_seq_jde
    pred_dets, pred_embs = self.model(data)
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\PaddleDetection\ppdet\modeling\architectures\meta_arch.py", line 75, in forward
    outs.append(self.get_pred())
  File "E:\PaddleDetection\ppdet\modeling\architectures\jde.py", line 110, in get_pred
    return self._forward()
  File "E:\PaddleDetection\ppdet\modeling\architectures\jde.py", line 68, in _forward
    det_outs = self.detector(self.inputs)
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\PaddleDetection\ppdet\modeling\architectures\meta_arch.py", line 75, in forward
    outs.append(self.get_pred())
  File "E:\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 128, in get_pred
    return self._forward()
  File "E:\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 99, in _forward
    boxes_idx, bbox, bbox_num, nms_keep_idx = self.post_process(
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\PaddleDetection\ppdet\modeling\post_process.py", line 459, in forward
    bbox_pred, bbox_num, nms_keep_idx = self.nms(
  File "E:\PaddleDetection\ppdet\modeling\layers.py", line 488, in __call__
    return ops.multiclass_nms(bboxes, score, **kwargs)
  File "E:\PaddleDetection\ppdet\modeling\ops.py", line 725, in multiclass_nms
    helper.append_op(
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\layer_helper.py", line 44, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\framework.py", line 3599, in append_op
    _dygraph_tracer().trace_op(type,
  File "D:\Miniconda\envs\Paddle\lib\site-packages\paddle\fluid\dygraph\tracer.py", line 307, in trace_op
    self.trace(type, inputs, outputs, attrs,
OSError: (External) CUDA error(719), unspecified launch failure.
  [Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases can be found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.] (at ..\paddle\phi\backends\gpu\gpu_context.cc:435)
  [operator < multiclass_nms3 > error]
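
CUDA kernel launches are asynchronous, so the operator named in an error like this is not always the kernel that actually faulted. One common diagnostic step is to rerun the same command with the CUDA environment variable `CUDA_LAUNCH_BLOCKING=1`, which makes launches synchronous so the failure surfaces at the faulting call. The sketch below is an illustration only, not a fix: `build_debug_env`, `build_command`, and `run_debug` are hypothetical helper names, and the command arguments are copied from the command above. Passing the variables through `env` also avoids the `VAR=value command` prefix, which is POSIX shell syntax.

```python
import os
import subprocess
import sys

# Config and weights copied from the command in this thread.
CONFIG = "configs/mot/jde/jde_darknet53_30e_1088x608.yml"
WEIGHTS = (
    "https://paddledet.bj.bcebos.com/models/mot/"
    "jde_darknet53_30e_1088x608.pdparams"
)


def build_debug_env(base_env=None, device="0"):
    """Return a copy of the environment with synchronous CUDA launches."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = device
    # Standard CUDA variable: kernel launches block until they complete.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def build_command(video_file):
    """Rebuild the infer_mot.py command line as an argument list."""
    return [
        sys.executable,
        "tools/infer_mot.py",
        "-c", CONFIG,
        "-o", f"weights={WEIGHTS}",
        f"--video_file={video_file}",
        "--save_videos",
    ]


def run_debug(video_file, device="0"):
    """Run the tracking command from the PaddleDetection repo root."""
    return subprocess.run(
        build_command(video_file),
        env=build_debug_env(device=device),
        check=False,
    )


if __name__ == "__main__":
    # Only print what would be run; call run_debug(...) to execute it.
    print(" ".join(build_command("path/to/video.mp4")))
```

Synchronous launches slow inference down considerably, so this is meant for a single diagnostic run rather than regular use.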

XuLei0 avatar Jul 28 '22 13:07 XuLei0