RuntimeError: CUDA error: an illegal instruction was encountered when running test.py
Hello,
When running python test.py I get the following errors:
======================================================================
ERROR: test_groups (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 155, in test_groups
    self.run_problem(m, n, k, *thread_shape, groupsize)
  File "/fsx/mohamed/dev/marlin/test.py", line 66, in run_problem
    torch.cuda.synchronize()
  File "/admin/home/mohamed_mekkouri/miniconda3/envs/exp/lib/python3.10/site-packages/torch/cuda/__init__.py", line 792, in synchronize
    return torch._C._cuda_synchronize()
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
======================================================================
ERROR: test_k_stages_divisibility (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 80, in test_k_stages_divisibility
    self.run_problem(16, 2 * 256, k, 64, 256)
  File "/fsx/mohamed/dev/marlin/test.py", line 60, in run_problem
    A = torch.randn((m, k), dtype=torch.half, device=DEV)
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
======================================================================
ERROR: test_tiles (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 75, in test_tiles
    self.run_problem(m, 2 * 256, 1024, thread_k, thread_n)
  File "/fsx/mohamed/dev/marlin/test.py", line 60, in run_problem
    A = torch.randn((m, k), dtype=torch.half, device=DEV)
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
======================================================================
ERROR: test_very_few_stages (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 85, in test_very_few_stages
    self.run_problem(16, 2 * 256, k, 64, 256)
  File "/fsx/mohamed/dev/marlin/test.py", line 60, in run_problem
    A = torch.randn((m, k), dtype=torch.half, device=DEV)
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
Ran 6 tests in 0.794s
FAILED (errors=4)
The stack I am using: Python 3.10.14, torch 2.3.1, cuda_12.1.r12.1, compute_cap 9.0
Judging by compute_cap 9.0, it looks like you are on Hopper. There is a known issue with Marlin on Hopper GPUs.
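If it helps, here is a minimal sketch for confirming the architecture from Python before running the tests. It assumes the check is done through torch.cuda.get_device_capability(), and that a major version of 9 identifies Hopper. The helper names is_hopper and current_capability are hypothetical and not part of Marlin or test.py.

```python
def is_hopper(capability):
    """Return True if a (major, minor) compute capability is 9.x.

    Assumption: major version 9 corresponds to Hopper (e.g. 9.0).
    """
    major, _minor = capability
    return major == 9


def current_capability():
    """Return (major, minor) for the current CUDA device, or None.

    None is returned when torch is not installed or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = current_capability()
    if cap is None:
        print("No CUDA device visible")
    elif is_hopper(cap):
        print(f"compute capability {cap[0]}.{cap[1]}: Hopper, Marlin kernels may fail here")
    else:
        print(f"compute capability {cap[0]}.{cap[1]}")
```

A check like this could also be used to skip the Marlin tests on affected machines instead of letting them fail with an illegal instruction error.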
Yes, it's Hopper, thank you!