CUDA ERROR
Environment
- OS (e.g., Linux): Ubuntu Server 22.04
- FunASR Version (e.g., 1.0.0): 1.1.12
- ModelScope Version (e.g., 1.11.0): 1.19.2
- PyTorch Version (e.g., 2.0.0): 2.5
- How you installed funasr (pip, source): pip install funasr
- Python version: 3.9
- GPU (e.g., V100M32): 2080Ti
- CUDA/cuDNN version (e.g., cuda11.7): cuda 12.8
- Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
- Any other relevant information:
Additional context
When running FunASR model inference on Ubuntu, the error below is reported.
CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Traceback (most recent call last):
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/celery/app/trace.py", line 453, in trace_task
R = retval = fun(*args, **kwargs)
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/celery/app/trace.py", line 736, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/local/src/new_asr_spk_api/top/lukeewin/asr/service/asr_service.py", line 56, in asr_task
raise self.retry(exc=e, countdown=5, max_retries=2)
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/celery/app/task.py", line 743, in retry
raise_with_context(exc)
File "/usr/local/src/new_asr_spk_api/top/lukeewin/asr/service/asr_service.py", line 32, in asr_task
result = asr.model.generate(audio_path, is_final=True, batch_size_s=300)
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/funasr/auto/auto_model.py", line 303, in generate
return self.inference_with_vad(input, input_len=input_len, **cfg)
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/funasr/auto/auto_model.py", line 376, in inference_with_vad
res = self.inference(
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/funasr/auto/auto_model.py", line 342, in inference
res = model.inference(**batch, **kwargs)
File "/root/miniconda3/envs/asr/lib/python3.9/site-packages/funasr/models/fsmn_vad_streaming/model.py", line 712, in inference
speech = speech.to(device=kwargs["device"])
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Has anyone run into this? How can it be solved? Judging from the traceback, the error seems to come from the VAD model inference.
The GPU id is set incorrectly.
I have multiple GPUs and select which card to use via CUDA_VISIBLE_DEVICES. Is that the wrong way to set the GPU id? For example, to use the card at index 1, I prefix the python command with CUDA_VISIBLE_DEVICES=1. The CUDA kernel error above then occurs sporadically: when the audio to transcribe is short and there are few files, it is rarely triggered, but with long audio (say, over 5 minutes) and many files to transcribe, the CUDA kernel error appears.
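One thing worth verifying: CUDA_VISIBLE_DEVICES re-indexes the cards the process can see, so after `CUDA_VISIBLE_DEVICES=1` the selected card is `cuda:0` inside the process, not `cuda:1`. A minimal sketch of a load path consistent with that (the AutoModel arguments are illustrative, not your exact configuration):

```python
import os

# Must be set before torch initializes CUDA, otherwise it is ignored.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch
from funasr import AutoModel

# With CUDA_VISIBLE_DEVICES=1, the chosen card is re-indexed to cuda:0
# inside this process; passing device="cuda:1" here would reference a
# device the process cannot see and can fail in confusing ways.
model = AutoModel(
    model="paraformer-zh",  # illustrative model id; substitute your own
    vad_model="fsmn-vad",
    device="cuda:0",
)

print(torch.cuda.device_count())  # expect 1 with a single visible card
```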
This "CUDA error: an illegal memory access was encountered" typically indicates GPU memory corruption or invalid pointer access. Based on your environment (V100M32, CUDA 12.8, PyTorch 2.5), here are diagnostic steps:
Immediate Troubleshooting:
- Verify GPU health: Run `nvidia-smi` to check for GPU or ECC errors
- Check CUDA memory: Use `torch.cuda.memory_summary()` before the crash to identify memory leaks (see the sketch after this list)
- Reduce batch size: A 2080Ti has 11 GB rather than a V100's 32 GB, so try lowering `batch_size_s` to isolate whether the failure is memory-related
- Test with different precision: Try FP16 or mixed precision to reduce the memory footprint
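A short sketch of the first two checks, assuming a single visible GPU at index 0:

```python
import subprocess
import torch

# Driver-level health: ECC errors show up in the nvidia-smi query output.
report = subprocess.run(
    ["nvidia-smi", "-q", "-d", "ECC"], capture_output=True, text=True
)
print(report.stdout)

# Allocator state just before the crash-prone generate() call, to spot
# leaks or fragmentation across repeated long transcriptions.
if torch.cuda.is_available():
    print(torch.cuda.memory_summary(device=0, abbreviated=True))
```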
Common Causes for FunASR:
- Audio tensor size mismatch: Ensure input audio tensors are correctly shaped and on the right device (a sanity-check sketch follows this list)
- Model checkpoint corruption: Re-download the model weights
- CUDA/PyTorch version mismatch: PyTorch 2.5 with CUDA 12.8 should work, but verify compatibility
- Docker memory constraints: Check if Docker has GPU memory limits that conflict with CUDA allocation
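For the tensor-shape point, a hypothetical sanity check you could run on each audio tensor before calling `generate()` (`check_input` is not a FunASR API, just an illustration):

```python
import torch

def check_input(speech: torch.Tensor, expected_device: str = "cuda:0") -> None:
    # Hypothetical helper, not part of FunASR: the traceback shows the VAD
    # model moving `speech` to kwargs["device"], so a bad shape, non-finite
    # samples, or a tensor already on the wrong device predates that line.
    assert speech.dim() in (1, 2), f"unexpected shape {tuple(speech.shape)}"
    assert torch.isfinite(speech).all(), "non-finite samples in input audio"
    if speech.is_cuda:
        assert str(speech.device) == expected_device, f"on {speech.device}"
```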
Debug Command:
CUDA_LAUNCH_BLOCKING=1 python your_script.py
This forces kernel launches to report errors synchronously, so the stack trace points at the call that actually triggered the illegal memory access rather than a later API call.
If the error persists, share the output of `torch.cuda.get_device_properties(0)` and the specific FunASR model you're using.
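For reference, a one-liner to capture that (index 0 assumes a single visible GPU):

```python
import torch

props = torch.cuda.get_device_properties(0)
print(f"{props.name}, {props.total_memory / 2**30:.1f} GiB, sm_{props.major}{props.minor}")
```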