[BUG] Visual Studio fails to compile unit tests/examples
Describe the bug
CMake generates a solution that fails to compile 11 of 20 projects on VS2022.
From #147 :
The problem is not that nvcc is being passed an incorrect flag, but rather that -fvisibility is not valid on VS. We use the option -forward-unknown-to-host-compiler, so nvcc automatically forwards any parameter it doesn't recognize (this flag being one of them) to VS.
To Reproduce
Steps to reproduce the behavior:
cmake -DMATX_BUILD_TESTS=ON -DMATX_BUILD_BENCHMARKS=ON -DMATX_BUILD_EXAMPLES=ON -DMATX_BUILD_DOCS=OFF -DCMAKE_CUDA_ARCHITECTURES=52 -DCMAKE_BUILD_TYPE=Debug ..
Expected behavior
Expect to successfully compile all unit tests & examples.
Code snippets
Output log attached.
System details (please complete the following information):
- Windows 10 Pro
- CMake 3.22.1
- VS2022 (MSVC 19.30.30706.0)
- CUDA 11.6
- pybind11 2.6.2
@akfite I've made a quick attempt at this and fixed a few things along the way, but it's still showing lots of compiler errors. I'll keep looking at it, but it could take a while.
@akfite, I think the errors are so numerous that it's going to take a significant amount of time to get this compiling on VS. MSVC has a lot of problems with SFINAE and some C++17 features, which we rely on heavily. I will leave this open, but until we have the time or more use cases, it's likely going to be lower priority than other features.
If anyone more familiar with the MSVC bugs would like to take a shot at it, we're open to PRs as well.
@cliffburdick All good. If you can figure it out someday that would be awesome, but I think I can at least use your code as reference in the meantime. I wish I could fix it myself but I'm way out of my depth here.
Thank you for looking into it!
Is WSL supported?
Hi @sahuang, I was able to compile and run on WSL 2 with CUDA installed.
Hi @cliffburdick, when I compile on WSL2 I get a lot of Error 255 messages like below.
Environment:
- Windows 11
- CMake 3.27.3
- CUDA 12.2.1
- pybind11 v2.6.2
cmake -DMATX_BUILD_TESTS=ON -DMATX_BUILD_BENCHMARKS=ON -DMATX_BUILD_EXAMPLES=ON -DMATX_BUILD_DOCS=OFF .. -DCMAKE_CUDA_ARCHITECTURES="80;86"
-- Using GPU architectures 80;86
-- Need libcuda++. Finding...
-- CPM: adding package [email protected] (2.1.0)
-- Enabling pybind11 support
-- CPM: adding package [email protected] (v2.6.2)
CMake Deprecation Warning at build/_deps/pybind11-src/CMakeLists.txt:8 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
-- pybind11 v2.6.2
CMake Warning (dev) at build/_deps/pybind11-src/tools/FindPythonLibsNew.cmake:98 (find_package): Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules are removed. Run "cmake --help-policy CMP0148" for policy details. Use the cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first): build/_deps/pybind11-src/tools/pybind11Tools.cmake:45 (find_package) build/_deps/pybind11-src/tools/pybind11Common.cmake:201 (include) build/_deps/pybind11-src/CMakeLists.txt:188 (include) This warning is for project developers. Use -Wno-dev to suppress it.
-- checking python import module numpy
-- checking python import module cupy
Traceback (most recent call last):
File "
-- NVBench CUDA architectures: 80;86
-- CPM: adding package [email protected] (release-1.11.0)
CMake Deprecation Warning at build/_deps/gtest-src/CMakeLists.txt:4 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
CMake Deprecation Warning at build/_deps/gtest-src/googlemock/CMakeLists.txt:45 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
CMake Deprecation Warning at build/_deps/gtest-src/googletest/CMakeLists.txt:56 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
-- Configuring done (8.8s)
CMake Warning at build/_deps/nvbench-src/exec/CMakeLists.txt:1 (add_executable): Cannot generate a safe runtime search path for target nvbench.ctl because files in some directories may conflict with libraries in implicit directories:
runtime library [libnvidia-ml.so.1] in /usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs may be hidden by files in:
/usr/lib/wsl/lib
Some of these libraries may not be found correctly.
CMake Warning at bench/CMakeLists.txt:22 (add_executable): Cannot generate a safe runtime search path for target matx_bench because files in some directories may conflict with libraries in implicit directories:
runtime library [libnvidia-ml.so.1] in /usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs may be hidden by files in:
/usr/lib/wsl/lib
Some of these libraries may not be found correctly.
-- Generating done (0.3s) -- Build files have been written to: /home/ysj/MatX-master/MatX/build ysj@wz-20230626HPPT:~/MatX-master/MatX/build$ make -j [ 1%] Generate git revision file for nvbench_git_revision [ 5%] Built target gtest [ 5%] Built target fmt [ 5%] Built target nvbench_git_revision_compute_git_info [ 7%] Built target gtest_main [ 9%] Built target gmock [ 10%] Building CXX object _deps/gtest-build/googlemock/CMakeFiles/gmock_main.dir/src/gmock_main.cc.o [ 11%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/axes_metadata.cxx.o [ 12%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/benchmark_base.cxx.o [ 12%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/benchmark_manager.cxx.o [ 13%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/axis_base.cxx.o [ 13%] Building CUDA object examples/CMakeFiles/black_scholes.dir/black_scholes.cu.o [ 16%] Building CUDA object examples/CMakeFiles/resample_poly_bench.dir/resample_poly_bench.cu.o [ 15%] Building CUDA object examples/CMakeFiles/convolution.dir/convolution.cu.o [ 16%] Building CUDA object examples/CMakeFiles/resample.dir/resample.cu.o [ 16%] Building CUDA object examples/CMakeFiles/simple_radar_pipeline.dir/simple_radar_pipeline.cu.o [ 17%] Building CUDA object examples/CMakeFiles/cgsolve.dir/cgsolve.cu.o [ 19%] Building CUDA object examples/CMakeFiles/channelize_poly_bench.dir/channelize_poly_bench.cu.o [ 19%] Building CUDA object examples/CMakeFiles/mvdr_beamformer.dir/mvdr_beamformer.cu.o [ 20%] Building CUDA object examples/CMakeFiles/spherical_harmonics.dir/spherical_harmonics.cu.o [ 21%] Building CUDA object examples/CMakeFiles/fft_conv.dir/fft_conv.cu.o [ 21%] Building CUDA object examples/CMakeFiles/conv2d.dir/conv2d.cu.o [ 22%] Building CUDA object examples/CMakeFiles/spectrogram.dir/spectrogram.cu.o [ 22%] Building CUDA object examples/CMakeFiles/qr.dir/qr.cu.o [ 22%] Building CUDA object examples/CMakeFiles/spectrogram_graph.dir/spectrogram_graph.cu.o [ 24%] Building CUDA object examples/CMakeFiles/svd_power.dir/svd_power.cu.o [ 24%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/blocking_kernel.cu.o [ 25%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/cuda_call.cu.o [ 26%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/csv_printer.cu.o [ 27%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/device_manager.cu.o [ 28%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/device_info.cu.o [ 28%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/float64_axis.cxx.o [ 29%] Building CUDA object examples/CMakeFiles/recursive_filter.dir/recursive_filter.cu.o [ 30%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/int64_axis.cxx.o [ 31%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/markdown_printer.cu.o [ 32%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/named_values.cxx.o [ 33%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/option_parser.cu.o [ 34%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/printer_base.cxx.o [ 34%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/printer_multiplex.cxx.o [ 35%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/runner.cxx.o [ 36%] Building CXX object 
_deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/state.cxx.o [ 38%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/type_axis.cxx.o [ 38%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/string_axis.cxx.o [ 39%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/type_strings.cxx.o [ 39%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/measure_cold.cu.o [ 40%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/measure_hot.cu.o [ 41%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/state_generator.cxx.o [ 42%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/cupti_profiler.cxx.o [ 43%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/measure_cupti.cu.o [ 44%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/internal/nvml.cxx.o [ 44%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/json_printer.cu.o [ 45%] Building CUDA object test/CMakeFiles/matx_test.dir/main.cu.o [ 46%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/BasicTensorTests.cu.o [ 46%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/CUBTests.cu.o [ 47%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/ViewTests.cu.o [ 48%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/VizTests.cu.o [ 49%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/EinsumTests.cu.o [ 50%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/TensorCreationTests.cu.o [ 51%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/OperatorTests.cu.o [ 51%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/GeneratorTests.cu.o [ 52%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/ReductionTests.cu.o [ 53%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/ConvCorr.cu.o [ 54%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/MatMul.cu.o [ 55%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/Copy.cu.o [ 56%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/ChannelizePoly.cu.o [ 56%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/Cov.cu.o [ 57%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/FFT.cu.o [ 58%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/ResamplePoly.cu.o [ 59%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/Solve.cu.o [ 60%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Cholesky.cu.o [ 61%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/LU.cu.o [ 62%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/QR2.cu.o [ 62%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/QR.cu.o [ 63%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/SVD.cu.o [ 64%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Eigen.cu.o [ 65%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Det.cu.o [ 66%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Inverse.cu.o [ 66%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/PythonEmbed.cu.o [ 67%] Building CUDA object test/CMakeFiles/matx_test.dir/00_io/FileIOTests.cu.o [ 68%] Building CUDA object test/CMakeFiles/matx_test.dir/01_radar/MultiChannelRadarPipeline.cu.o [ 69%] Building CUDA object 
test/CMakeFiles/matx_test.dir/01_radar/MVDRBeamformer.cu.o [ 70%] Building CUDA object test/CMakeFiles/matx_test.dir/01_radar/ambgfun.cu.o [ 71%] Building CUDA object test/CMakeFiles/matx_test.dir/01_radar/dct.cu.o Killed make[2]: *** [_deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/build.make:465: _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/json_printer.cu.o] Error 137 make[2]: *** Waiting for unfinished jobs.... Killed Killed make[2]: *** [examples/CMakeFiles/qr.dir/build.make:77: examples/CMakeFiles/qr.dir/qr.cu.o] Error 255 Killed make[1]: *** [CMakeFiles/Makefile2:694: examples/CMakeFiles/qr.dir/all] Error 2 make[2]: *** [examples/CMakeFiles/conv2d.dir/build.make:77: examples/CMakeFiles/conv2d.dir/conv2d.cu.o] Error 255 make[1]: *** Waiting for unfinished jobs.... make[1]: *** [CMakeFiles/Makefile2:434: examples/CMakeFiles/conv2d.dir/all] Error 2 make[2]: *** [examples/CMakeFiles/fft_conv.dir/build.make:77: examples/CMakeFiles/fft_conv.dir/fft_conv.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:486: examples/CMakeFiles/fft_conv.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/cgsolve.dir/build.make:77: examples/CMakeFiles/cgsolve.dir/cgsolve.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:460: examples/CMakeFiles/cgsolve.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/spectrogram_graph.dir/build.make:77: examples/CMakeFiles/spectrogram_graph.dir/spectrogram_graph.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:616: examples/CMakeFiles/spectrogram_graph.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/channelize_poly_bench.dir/build.make:77: examples/CMakeFiles/channelize_poly_bench.dir/channelize_poly_bench.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:382: examples/CMakeFiles/channelize_poly_bench.dir/all] Error 2 Killed Killed make[2]: *** [examples/CMakeFiles/simple_radar_pipeline.dir/build.make:77: examples/CMakeFiles/simple_radar_pipeline.dir/simple_radar_pipeline.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:330: examples/CMakeFiles/simple_radar_pipeline.dir/all] Error 2 make[2]: *** [examples/CMakeFiles/resample_poly_bench.dir/build.make:77: examples/CMakeFiles/resample_poly_bench.dir/resample_poly_bench.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:564: examples/CMakeFiles/resample_poly_bench.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/black_scholes.dir/build.make:77: examples/CMakeFiles/black_scholes.dir/black_scholes.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:720: examples/CMakeFiles/black_scholes.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/spherical_harmonics.dir/build.make:77: examples/CMakeFiles/spherical_harmonics.dir/spherical_harmonics.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:642: examples/CMakeFiles/spherical_harmonics.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/spectrogram.dir/build.make:77: examples/CMakeFiles/spectrogram.dir/spectrogram.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:590: examples/CMakeFiles/spectrogram.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/svd_power.dir/build.make:77: examples/CMakeFiles/svd_power.dir/svd_power.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:668: examples/CMakeFiles/svd_power.dir/all] Error 2 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:182: test/CMakeFiles/matx_test.dir/00_operators/OperatorTests.cu.o] Error 255 make[2]: *** Waiting for unfinished jobs.... 
Killed make[2]: *** [examples/CMakeFiles/recursive_filter.dir/build.make:77: examples/CMakeFiles/recursive_filter.dir/recursive_filter.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:356: examples/CMakeFiles/recursive_filter.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/resample.dir/build.make:77: examples/CMakeFiles/resample.dir/resample.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:512: examples/CMakeFiles/resample.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/convolution.dir/build.make:77: examples/CMakeFiles/convolution.dir/convolution.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:408: examples/CMakeFiles/convolution.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/mvdr_beamformer.dir/build.make:77: examples/CMakeFiles/mvdr_beamformer.dir/mvdr_beamformer.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:538: examples/CMakeFiles/mvdr_beamformer.dir/all] Error 2 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:407: test/CMakeFiles/matx_test.dir/00_solver/SVD.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:197: test/CMakeFiles/matx_test.dir/00_operators/GeneratorTests.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:272: test/CMakeFiles/matx_test.dir/00_transform/Copy.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:347: test/CMakeFiles/matx_test.dir/00_solver/Cholesky.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:542: test/CMakeFiles/matx_test.dir/01_radar/dct.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:227: test/CMakeFiles/matx_test.dir/00_transform/ConvCorr.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:497: test/CMakeFiles/matx_test.dir/01_radar/MultiChannelRadarPipeline.cu.o] Error 255 [ 71%] Linking CXX static library ../../../lib/libgmock_maind.a [ 71%] Built target gmock_main Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:167: test/CMakeFiles/matx_test.dir/00_tensor/EinsumTests.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:302: test/CMakeFiles/matx_test.dir/00_transform/FFT.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:152: test/CMakeFiles/matx_test.dir/00_tensor/TensorCreationTests.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:212: test/CMakeFiles/matx_test.dir/00_operators/ReductionTests.cu.o] Error 255 Killed make[1]: *** [CMakeFiles/Makefile2:827: _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/all] Error 2 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:467: test/CMakeFiles/matx_test.dir/00_operators/PythonEmbed.cu.o] Error 255 Killed Killed Killed Killed Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:437: test/CMakeFiles/matx_test.dir/00_solver/Det.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:392: test/CMakeFiles/matx_test.dir/00_solver/QR2.cu.o] Error 255 Killed Killed Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:362: test/CMakeFiles/matx_test.dir/00_solver/LU.cu.o] Error 255 Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:452: test/CMakeFiles/matx_test.dir/00_solver/Inverse.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:242: test/CMakeFiles/matx_test.dir/00_transform/MatMul.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:317: 
test/CMakeFiles/matx_test.dir/00_transform/ResamplePoly.cu.o] Error 255 Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:92: test/CMakeFiles/matx_test.dir/00_tensor/BasicTensorTests.cu.o] Error 255 make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:512: test/CMakeFiles/matx_test.dir/01_radar/MVDRBeamformer.cu.o] Error 255 make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:287: test/CMakeFiles/matx_test.dir/00_transform/Cov.cu.o] Error 255 make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:107: test/CMakeFiles/matx_test.dir/00_tensor/CUBTests.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:1063: test/CMakeFiles/matx_test.dir/all] Error 2 make: *** [Makefile:136: all] Error 2
Hi @yuanskinner, that doesn't look like an error, but rather your make parallelism is too high. Try make -j4 so the OOM killer doesn't stop it.
The build succeeded! Thanks so much! But I get an error when I run examples/resample.cu:
GPU Name: NVIDIA GeForce RTX 3080
GPU Global Memory: 9.999512 GB
CUDA Error: invalid device ordinal
matxException (matxCudaError: invalid device ordinal) - /home/MatX-master/MatX/examples/resample.cu:68
(sigViewComplex = fft(sigView)).run(stream);
Stack Trace:
./resample : ()+0xdd93
./resample : ()+0x81d8
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xf3
./resample : ()+0x5f2e
Hi @yuanskinner, do you have multiple GPUs in the system?
No, only one RTX 3080. I tried cudaGetDeviceCount and it returned 1, then I tried cudaSetDevice(0) and it returned cudaSuccess.
I also tried the other examples; please check the attached detailed log.
Only the examples below passed:
black_scholes
cgsolve
conv2d
qr
resample_poly_bench
spherical_harmonics
svd_power
detail log.md
@yuanskinner how much system memory do you have?
system memory 32.0 GB
I think the problem is by default we use managed memory. Managed memory does not work well under WSL2 yet:
Unified Memory - Full Managed Memory Support is not available on Windows native and therefore WSL 2 will not support it for the foreseeable future
So you're likely hitting a WSL2+CUDA bug. This is not necessarily a problem; if you would like this to work in WSL2 with your own application, you can just avoid using managed memory and declare all your tensors with host or device memory. The examples will not work as-is, however. Would that work for you?
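For illustration, here is a minimal sketch of that workaround, assuming make_tensor accepts an explicit memory-space (and optional stream) argument as in the snippet later in this thread; the exact enum names and signatures may differ by MatX version:

#include <matx.h>

int main() {
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Instead of the default managed allocation, e.g.
  //   auto a = matx::make_tensor<float, 1>({1024});
  // allocate explicitly in device memory:
  auto a = matx::make_tensor<float, 1>({1024}, matx::MATX_DEVICE_MEMORY, stream);
  auto b = matx::make_tensor<float, 1>({1024}, matx::MATX_DEVICE_MEMORY, stream);

  // Expressions run on the device as usual; nothing here touches the
  // tensors from host code, so no managed-memory page migration is needed.
  // (In real code, `a` would be filled by an earlier operator or copy.)
  (b = a * 2.0f).run(stream);

  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  return 0;
}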
I just tried the examples on a Geforce 3070, and they all work on Linux, so it appears this is related to the WSL2 issue above.
Thank you for your patient answer. I just built my application in WSL on my home computer; the actual production environment is on a physical machine. Actually, I plan to use MatX for our DDC program. Why doesn't the channelize_poly algorithm downsample the input signal into the channels? Will the decimation_factor parameter be supported in the near future?
I changed the demo as below, and get the same error at the CUDA_CHECK_LAST_ERROR() line:
auto input = matx::make_tensor<InType, 2>({num_batches, input_len}, MATX_DEVICE_MEMORY, stream);
auto filter = matx::make_tensor<InType, 1>({filter_len}, MATX_DEVICE_MEMORY, stream);
auto output = matx::make_tensor<OutType, 3>({num_batches, output_len_per_channel, num_channels}, MATX_DEVICE_MEMORY, stream);
const matx::index_t decimation_factor = num_channels;
for (int k = 0; k < NUM_WARMUP_ITERATIONS; k++) {
  (output = channelize_poly(input, filter, num_channels, decimation_factor)).run(stream);
}
cudaStreamSynchronize(stream);
float elapsed_ms = 0.0f;
cudaEventRecord(start, stream);
for (int k = 0; k < NUM_ITERATIONS; k++) {
  (output = channelize_poly(input, filter, num_channels, decimation_factor)).run(stream);
}
cudaEventRecord(stop, stream);
cudaStreamSynchronize(stream);
CUDA_CHECK_LAST_ERROR();
CUDA Error: invalid device ordinal
matxException (matxCudaError: invalid device ordinal) - /home/ysj/MatX-master/MatX/examples/channelize_poly_bench.cu:103
Stack Trace:
./channelize_poly_bench : ()+0x8e2f
./channelize_poly_bench : ()+0xd636
./channelize_poly_bench : ()+0x581f
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xf3
./channelize_poly_bench : ()+0x4e6e
Hi @yuanskinner, I don't have a WSL2 system I can test on easily at the moment. Is it possible for you to try on a Linux machine? In the meantime I can try to get WSL2 working again.
I don't have a Linux machine at the moment. If possible, you could write test code for me to build and run.
Why doesn't the channelize_poly algorithm downsample the input signal into the channels? Will the decimation_factor parameter be supported in the near future?
Hi @yuanskinner - the current channelize_poly implementation only supports maximally decimated channelization (i.e., decimation_factor == num_channels). This downsamples the input signal into the channels by a factor of M for M channels such that each channel has a sample rate of fs/M (where fs is the sampling rate of the input signal). It does not yet support oversampling, which would result in sample rates higher than fs/M for each channel. Is oversampling a feature that would be needed for your use case? If so, can you share rough dimensions of interest (input signal length, decimation factor, channel count)?
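To make the sizing concrete, here is a rough worked example of the maximally decimated case; the numbers are illustrative and not taken from this thread:

#include <cstdio>

int main() {
  const double fs = 72.0e6;        // input sample rate in Hz (example value)
  const int    num_channels = 8;   // M channels, with decimation_factor == M
  const long   input_len = 256000; // input samples (example value)

  // Maximally decimated: each channel runs at fs / M ...
  const double fs_per_channel = fs / num_channels;
  // ... and produces roughly input_len / M samples (ignoring filter edge effects).
  const long   output_len_per_channel = input_len / num_channels;

  std::printf("per-channel rate: %.2f MHz, per-channel length: %ld\n",
              fs_per_channel / 1.0e6, output_len_per_channel);
  return 0;
}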
Hi @yuanskinner, I don't have a WSL2 system I can test on easily at the moment. Is it possible for you to try on a Linux machine? In the meantime I can try to get WSL2 working again.

I changed the order of the cases. The test case { 42, 17, 256000 } with the first 4 channels runs successfully.
@tbensonatl Our typical scenario is: input signal bandwidth of 72 MHz, channel count of 1000, with different channel bandwidths (10 kHz~20 MHz). Channels may not necessarily be equally spaced. This is a DDC service that can meet the diverse DDC needs of users.
@yuanskinner @tbensonatl put in a fix for WSL2. Can you please try the latest commit?
It worked!!! Thank you so so so so much!
In reduce.h at line 1647,
if(++axis_ptr == dims.size()) {
should be
if(++axis_ptr == (int)dims.size()) {
otherwise you get:
error: comparison of integer expressions of different signedness: ‘int’ and ‘std::array<int, 1>::size_type’ {aka ‘long unsigned int’} [-Werror=sign-compare]
Which host compiler are you using?
I just followed the README.md doc to compile the code step by step. The C and C++ compiler version is gcc 9.4.0.
@yuanskinner can you share your cmake command? I can't reproduce those warnings on g++ 9.4.
@yuanskinner another option is you can paste your entire error output and we can fix it in a branch and have you test.
Jumping on since I am also interested in running MatX on WSL2. I built successfully by staying on the v0.5.0 tag and cherry-picking the commit @cliffburdick mentioned here.
That being said, the fft_conv example still gives the invalid device ordinal exception. Before the cherry-pick, all the examples using fft were failing for me, so that is a good improvement in my view. I can poke at it, as I imagine this is somehow still related to the WSL+CUDA bug mentioned earlier.
Ah, it gets back to managed memory, since fft_conv relies on managed memory for the () operator. Outlined here is why I get a seg fault.
Hi @mfzmullen. To be clear, some of the examples use operator() to set up data before the example runs. This is purely to show how to do it, but as I mentioned above you can avoid managed memory altogether and everything should work under WSL2. There shouldn't be anywhere in the internals of MatX where we require managed memory.
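A minimal sketch of that kind of setup, assuming a host memory space and a matx::copy helper (both assumptions on my part; exact enum names and signatures may vary by MatX version): fill a host tensor with operator(), then copy it into a device tensor before running operators.

#include <matx.h>

int main() {
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Host tensor used only for setup; calling operator() from host code is fine here.
  auto h = matx::make_tensor<float, 1>({1024}, matx::MATX_HOST_MEMORY);
  // Device tensor that the operators actually run against.
  auto d = matx::make_tensor<float, 1>({1024}, matx::MATX_DEVICE_MEMORY, stream);

  for (matx::index_t i = 0; i < h.Size(0); i++) {
    h(i) = static_cast<float>(i);
  }

  // Explicit host-to-device copy replaces the implicit page migration
  // that managed memory would otherwise provide.
  matx::copy(d, h, stream);

  (d = d * 2.0f).run(stream);
  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  return 0;
}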