RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDevice: invalid device ordinal
Hello! Nice work!
I'm trying to get this running on an RTX 3090. During installation I get warnings recommending that I build with -std=c++14.
Other than that I'm not seeing anything out of the ordinary. Has anyone else managed to get this running on newer RTX cards?
File "/opt/conda/envs/pytorch_venv/lib/python3.7/site-packages/torch/_tensor.py", line 255, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/opt/conda/envs/pytorch_venv/lib/python3.7/site-packages/torch/autograd/__init__.py", line 149, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
File "/opt/conda/envs/pytorch_venv/lib/python3.7/site-packages/torch/autograd/function.py", line 87, in apply
return self._forward_cls.backward(self, *args) # type: ignore[attr-defined]
File "/opt/conda/envs/pytorch_venv/lib/python3.7/site-packages/diffvg-0.0.1-py3.7-linux-x86_64.egg/pydiffvg/render_pytorch.py", line 709, in backward
eval_positions.shape[0])
RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDevice: invalid device ordinal
Seeing a similar/related error on a Tesla K80:
/usr/local/lib/python3.7/dist-packages/diffvg-0.0.1-py3.7-linux-x86_64.egg/pydiffvg/render_pytorch.py in backward(ctx, grad_img)
707 use_prefiltering,
708 diffvg.float_ptr(eval_positions.data_ptr()),
--> 709 eval_positions.shape[0])
710 time_elapsed = time.time() - start
711 global print_timing
RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDeviceFunction: invalid device function
Having the same issue as @josephrocca with a Tesla K80.
Same here. I also see it come up more often with newer GPUs (RTX 30x0).
Please provide the code leading up to the error. I need more context.
Hi! I also ran into this problem. As I understand it, this is a compatibility issue: the CUDA code has to be compiled for the compute architecture of the GPU it runs on. Changing this line
https://github.com/BachiLi/diffvg/blob/e5955dbdcb4715ff3fc6cd7d74848a3aad87ec99/CMakeLists.txt#L23
to this:
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++14 -gencode=arch=compute_37,code=sm_37")
helped me on a Tesla K80 on Google Colab. For other cards, swap in the matching architecture:
-gencode=arch=compute_86,code=sm_86 for RTX 3090
-gencode=arch=compute_75,code=sm_75 for Tesla T4
I found the info about matching CUDA architectures here: https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
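The three flags in this post follow one pattern: the digits after compute_ and sm_ are the GPU's compute capability (3.7 for the K80, 7.5 for the T4, 8.6 for the RTX 3090). Here is a minimal sketch of building the flag from that, assuming PyTorch is already installed so that torch.cuda.get_device_capability() can report the capability. The helper names gencode_flag and detect_gencode_flag are made up for illustration and are not part of diffvg.

```python
def gencode_flag(major: int, minor: int) -> str:
    """Build the nvcc -gencode flag for one compute capability, e.g. (8, 6)."""
    sm = f"{major}{minor}"
    return f"-gencode=arch=compute_{sm},code=sm_{sm}"


def detect_gencode_flag(device_index: int = 0) -> str:
    """Query the visible GPU through PyTorch (assumed installed) and build the flag."""
    import torch  # imported lazily so gencode_flag works without PyTorch

    major, minor = torch.cuda.get_device_capability(device_index)
    return gencode_flag(major, minor)


# The three cards mentioned in this thread, with their compute capabilities.
for name, cap in {"Tesla K80": (3, 7), "Tesla T4": (7, 5), "RTX 3090": (8, 6)}.items():
    print(name, gencode_flag(*cap))
```

On a machine with a GPU, detect_gencode_flag() should give the flag to paste into CMakeLists.txt. The installed CUDA toolkit also has to support that architecture, which this sketch does not check.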
@IzhanVarsky Thank you so much!! I updated the install section of my notebooks that use diffvg with the following code, and now they work when Colab assigns me K80 machines.
%cd /content/
!git clone https://github.com/BachiLi/diffvg
%cd diffvg
import subprocess
if 'K80' in str(subprocess.check_output(['nvidia-smi', '-L'])):
    !sed -i 's/set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++11")/set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++14 -gencode=arch=compute_37,code=sm_37")/' /content/diffvg/CMakeLists.txt
!git submodule update --init --recursive
!python setup.py install
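The sed one-liner only handles the K80. Here is a sketch of the same patch in plain Python that takes the architecture as a parameter. It assumes CMakeLists.txt still contains the exact -std=c++11 line that the sed command targets. patch_cmake_flags is a hypothetical helper, not something diffvg ships.

```python
# The flags line the sed command in this thread looks for.
OLD_LINE = 'set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++11")'


def patch_cmake_flags(cmake_text: str, sm: str) -> str:
    """Return cmake_text with the nvcc flags line retargeted to architecture sm (e.g. "86")."""
    if OLD_LINE not in cmake_text:
        # Fail loudly instead of silently building with the wrong flags.
        raise ValueError("expected CUDA_NVCC_FLAGS line not found; CMakeLists.txt may have changed")
    new_line = (
        'set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++14 '
        f'-gencode=arch=compute_{sm},code=sm_{sm}")'
    )
    return cmake_text.replace(OLD_LINE, new_line)


# Small in-memory example; in the notebook you would read and rewrite
# /content/diffvg/CMakeLists.txt before running setup.py.
sample = "project(diffvg)\n" + OLD_LINE + "\n"
print(patch_cmake_flags(sample, "86"))
```

Unlike sed, which leaves the file untouched when the pattern does not match, this version raises an error, so a changed CMakeLists.txt is noticed before the build.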