Tests fail after a successful "pip install ." command
I have a GPU:
nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40                     On  | 00000000:04:00.0 Off |                    0 |
| N/A   29C    P8              33W / 300W |      4MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                             GPU Memory |
+---------------------------------------------------------------------------------------+
I have executed:
git clone <extension...>
cd extension_cpp
pip install .
python test/test_extension.py
and I get all 8 tests failing:
Fail to import hypothesis in common_utils, tests are not derandomized
EEEEEEEE
======================================================================
ERROR: test_opcheck_cpu (__main__.TestMyAddOut)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/testing/_internal/optests/generate_tests.py", line 660, in opcheck
tester(op, args, kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/testing/_internal/optests/generate_tests.py", line 60, in safe_schema_check
result = op(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 667, in __call__
return self_._op(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_subclasses/schema_check_mode.py", line 156, in __torch_dispatch__
out = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 667, in __call__
return self_._op(*args, **kwargs)
NotImplementedError: Could not run 'extension_cpp::myadd_out' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::myadd_out' is only available for these backends: [HIP, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
HIP: registered at extension_cpp/csrc/cuda/muladd.cu:82 [kernel]
Meta: registered at ../aten/src/ATen/core/MetaFallbackKernel.cpp:23 [backend fallback]
BackendSelect: fallthrough registered at ../aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:153 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at ../aten/src/ATen/functorch/DynamicLayer.cpp:497 [backend fallback]
Functionalize: registered at ../aten/src/ATen/FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at ../aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at ../aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at ../aten/src/ATen/native/NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at ../aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:86 [backend fallback]
AutogradOther: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:53 [backend fallback]
AutogradCPU: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:57 [backend fallback]
AutogradCUDA: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:65 [backend fallback]
AutogradXLA: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:69 [backend fallback]
AutogradMPS: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:77 [backend fallback]
AutogradXPU: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:61 [backend fallback]
AutogradHPU: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:90 [backend fallback]
AutogradLazy: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:73 [backend fallback]
AutogradMeta: registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:81 [backend fallback]
Tracer: registered at ../torch/csrc/autograd/TraceTypeManual.cpp:297 [backend fallback]
AutocastCPU: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:209 [backend fallback]
AutocastXPU: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:351 [backend fallback]
AutocastCUDA: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at ../aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at ../aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at ../aten/src/ATen/functorch/VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at ../aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at ../aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at ../aten/src/ATen/functorch/TensorWrapper.cpp:207 [backend fallback]
PythonTLSSnapshot: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:161 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at ../aten/src/ATen/functorch/DynamicLayer.cpp:493 [backend fallback]
PreDispatch: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:165 [backend fallback]
PythonDispatcher: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:157 [backend fallback]
More errors...
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
----------------------------------------------------------------------
Ran 8 tests in 0.161s
FAILED (errors=8)
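A minimal way to reproduce the failure outside the test harness (just a sketch; the op name and signature follow the extension_cpp tutorial repo, so adjust if your checkout differs):

# Calling the op directly shows which backend kernels were actually built and registered.
import torch
import extension_cpp  # importing this registers the extension_cpp ops

a, b = torch.randn(8), torch.randn(8)
out = torch.empty(8)
torch.ops.extension_cpp.myadd_out(a, b, out)  # raises NotImplementedError if the CPU kernel is missing
print(out)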
I have the same problem
I have the same problem
I also have the same problem on Windows, but on Ubuntu it works.
Hey, did anyone find a solution to this?
I solved the 'ERROR: test_opcheck_cpu (__main__.TestMyAddOut)' issue by creating a new environment, upgrading the NVIDIA driver to the newest version, and using PyTorch 2.5.0 with CUDA 12.4 and Python 3.12.
But now I get a new error:
return self._op(*args, **kwargs)
       ^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'extension_cpp::myadd_out' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'extension_cpp::myadd_out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastMPS, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
My GPU is a 3090 Ti.
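For anyone hitting the same CUDA-backend error, a quick sanity check (just a sketch, using only standard torch calls) is to confirm that the CUDA version your PyTorch wheel was built against matches the toolkit used to compile the extension, and that the driver is visible:

# The versions printed here should line up with the toolkit used when building extension_cpp.
import torch

print(torch.__version__)          # e.g. 2.5.0
print(torch.version.cuda)         # e.g. 12.4; None means a CPU-only wheel
print(torch.cuda.is_available())  # should be True with a working driver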
@jia-heng I also ran into this issue; in my case it was because PyTorch could not automatically find where CUDA is installed. In setup.py you can see that if CUDA_HOME from torch.utils.cpp_extension is None, CUDA is not used. This can happen when CUDA is installed in an unusual location, as it was for me. You can work around it by setting the environment variable CUDA_HOME to the location of your CUDA install before running pip install . again.
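A minimal way to check this (a sketch; the path in the comments is only an example, adjust it to your system):

# If this prints None, setup.py in extension_cpp will build only the CPU kernels.
from torch.utils.cpp_extension import CUDA_HOME

print("CUDA_HOME =", CUDA_HOME)
# If it is None, point CUDA_HOME at your toolkit before rebuilding,
# e.g. export CUDA_HOME=/usr/local/cuda-12.4 (example path), then rerun pip install .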
For me, there were some problems with torch 2.4.0 built from source, so I decided to upgrade by reinstalling libtorch and the latest PyTorch, 2.5.1. That solved the problem.
I had the issue with torch 2.5.0 and tried it with different CUDA versions, none of which worked, but torch 2.4.1+cu124 with CUDA 12.4.0, built from source, works on Ubuntu 24.04.