[BUG] Visual Studio fails to compile unit tests/examples
Describe the bug
CMake generates a solution that fails to compile 11 of 20 projects on VS2022.
From #147 :
The problem is not that nvcc is being passed an incorrect flag, but rather that -fvisibility is not valid on VS. We use the option -forward-unknown-to-host-compiler, so nvcc automatically forwards any parameter it doesn't recognize (this flag being one of them) to VS.
To Reproduce
Steps to reproduce the behavior:
cmake -DMATX_BUILD_TESTS=ON -DMATX_BUILD_BENCHMARKS=ON -DMATX_BUILD_EXAMPLES=ON -DMATX_BUILD_DOCS=OFF -DCMAKE_CUDA_ARCHITECTURES=52 -DCMAKE_BUILD_TYPE=Debug ..
Expected behavior
Expect to successfully compile all unit tests & examples.
Code snippets
Output log attached.
System details (please complete the following information):
- Windows 10 Pro
- CMake 3.22.1
- VS2022 (MSVC 19.30.30706.0)
- CUDA 11.6
- pybind11 2.6.2
@akfite I've made a quick attempt at this and fixed a few things along the way, but it's still showing lots of compiler errors. I'll keep looking at it, but it could take a while.
@akfite, I think the errors are so numerous that it's going to take a significant amount of time to get this compiling on VS. MSVC has a lot of problems with SFINAE and some C++17 features, which we rely on heavily. I will leave this open, but until we have the time or more use cases, it's likely going to be lower priority than other features.
If anyone more familiar with the MSVC bugs would like to take a shot at it, we're open to PRs as well.
@cliffburdick All good. If you can figure it out someday that would be awesome, but I think I can at least use your code as reference in the meantime. I wish I could fix it myself but I'm way out of my depth here.
Thank you for looking into it!
Is WSL supported?
Hi @sahuang, I was able to compile and run on WSL 2 with CUDA installed.
Hi @cliffburdick, when I compile on WSL2 I get a lot of Error 255 messages like below.
Environment:
- Windows 11
- CMake 3.27.3
- CUDA 12.2.1
- pybind11 v2.6.2
cmake -DMATX_BUILD_TESTS=ON -DMATX_BUILD_BENCHMARKS=ON -DMATX_BUILD_EXAMPLES=ON -DMATX_BUILD_DOCS=OFF .. -DCMAKE_CUDA_ARCHITECTURES="80;86"
-- Using GPU architectures 80;86
-- Need libcuda++. Finding...
-- CPM: adding package [email protected] (2.1.0)
-- Enabling pybind11 support
-- CPM: adding package [email protected] (v2.6.2)
CMake Deprecation Warning at build/_deps/pybind11-src/CMakeLists.txt:8 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
-- pybind11 v2.6.2
CMake Warning (dev) at build/_deps/pybind11-src/tools/FindPythonLibsNew.cmake:98 (find_package): Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules are removed. Run "cmake --help-policy CMP0148" for policy details. Use the cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first): build/_deps/pybind11-src/tools/pybind11Tools.cmake:45 (find_package) build/_deps/pybind11-src/tools/pybind11Common.cmake:201 (include) build/_deps/pybind11-src/CMakeLists.txt:188 (include) This warning is for project developers. Use -Wno-dev to suppress it.
-- checking python import module numpy
-- checking python import module cupy
Traceback (most recent call last):
File "
-- NVBench CUDA architectures: 80;86
-- CPM: adding package [email protected] (release-1.11.0)
CMake Deprecation Warning at build/_deps/gtest-src/CMakeLists.txt:4 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
CMake Deprecation Warning at build/_deps/gtest-src/googlemock/CMakeLists.txt:45 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
CMake Deprecation Warning at build/_deps/gtest-src/googletest/CMakeLists.txt:56 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.
Update the VERSION argument
-- Configuring done (8.8s)
CMake Warning at build/_deps/nvbench-src/exec/CMakeLists.txt:1 (add_executable): Cannot generate a safe runtime search path for target nvbench.ctl because files in some directories may conflict with libraries in implicit directories:
runtime library [libnvidia-ml.so.1] in /usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs may be hidden by files in:
/usr/lib/wsl/lib
Some of these libraries may not be found correctly.
CMake Warning at bench/CMakeLists.txt:22 (add_executable): Cannot generate a safe runtime search path for target matx_bench because files in some directories may conflict with libraries in implicit directories:
runtime library [libnvidia-ml.so.1] in /usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs may be hidden by files in:
/usr/lib/wsl/lib
Some of these libraries may not be found correctly.
-- Generating done (0.3s) -- Build files have been written to: /home/ysj/MatX-master/MatX/build ysj@wz-20230626HPPT:~/MatX-master/MatX/build$ make -j [ 1%] Generate git revision file for nvbench_git_revision [ 5%] Built target gtest [ 5%] Built target fmt [ 5%] Built target nvbench_git_revision_compute_git_info [ 7%] Built target gtest_main [ 9%] Built target gmock [ 10%] Building CXX object _deps/gtest-build/googlemock/CMakeFiles/gmock_main.dir/src/gmock_main.cc.o [ 11%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/axes_metadata.cxx.o [ 12%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/benchmark_base.cxx.o [ 12%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/benchmark_manager.cxx.o [ 13%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/axis_base.cxx.o [ 13%] Building CUDA object examples/CMakeFiles/black_scholes.dir/black_scholes.cu.o [ 16%] Building CUDA object examples/CMakeFiles/resample_poly_bench.dir/resample_poly_bench.cu.o [ 15%] Building CUDA object examples/CMakeFiles/convolution.dir/convolution.cu.o [ 16%] Building CUDA object examples/CMakeFiles/resample.dir/resample.cu.o [ 16%] Building CUDA object examples/CMakeFiles/simple_radar_pipeline.dir/simple_radar_pipeline.cu.o [ 17%] Building CUDA object examples/CMakeFiles/cgsolve.dir/cgsolve.cu.o [ 19%] Building CUDA object examples/CMakeFiles/channelize_poly_bench.dir/channelize_poly_bench.cu.o [ 19%] Building CUDA object examples/CMakeFiles/mvdr_beamformer.dir/mvdr_beamformer.cu.o [ 20%] Building CUDA object examples/CMakeFiles/spherical_harmonics.dir/spherical_harmonics.cu.o [ 21%] Building CUDA object examples/CMakeFiles/fft_conv.dir/fft_conv.cu.o [ 21%] Building CUDA object examples/CMakeFiles/conv2d.dir/conv2d.cu.o [ 22%] Building CUDA object examples/CMakeFiles/spectrogram.dir/spectrogram.cu.o [ 22%] Building CUDA object examples/CMakeFiles/qr.dir/qr.cu.o [ 22%] Building CUDA object examples/CMakeFiles/spectrogram_graph.dir/spectrogram_graph.cu.o [ 24%] Building CUDA object examples/CMakeFiles/svd_power.dir/svd_power.cu.o [ 24%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/blocking_kernel.cu.o [ 25%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/cuda_call.cu.o [ 26%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/csv_printer.cu.o [ 27%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/device_manager.cu.o [ 28%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/device_info.cu.o [ 28%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/float64_axis.cxx.o [ 29%] Building CUDA object examples/CMakeFiles/recursive_filter.dir/recursive_filter.cu.o [ 30%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/int64_axis.cxx.o [ 31%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/markdown_printer.cu.o [ 32%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/named_values.cxx.o [ 33%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/option_parser.cu.o [ 34%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/printer_base.cxx.o [ 34%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/printer_multiplex.cxx.o [ 35%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/runner.cxx.o [ 36%] Building CXX object 
_deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/state.cxx.o [ 38%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/type_axis.cxx.o [ 38%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/string_axis.cxx.o [ 39%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/type_strings.cxx.o [ 39%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/measure_cold.cu.o [ 40%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/measure_hot.cu.o [ 41%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/state_generator.cxx.o [ 42%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/cupti_profiler.cxx.o [ 43%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/detail/measure_cupti.cu.o [ 44%] Building CXX object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/internal/nvml.cxx.o [ 44%] Building CUDA object _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/json_printer.cu.o [ 45%] Building CUDA object test/CMakeFiles/matx_test.dir/main.cu.o [ 46%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/BasicTensorTests.cu.o [ 46%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/CUBTests.cu.o [ 47%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/ViewTests.cu.o [ 48%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/VizTests.cu.o [ 49%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/EinsumTests.cu.o [ 50%] Building CUDA object test/CMakeFiles/matx_test.dir/00_tensor/TensorCreationTests.cu.o [ 51%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/OperatorTests.cu.o [ 51%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/GeneratorTests.cu.o [ 52%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/ReductionTests.cu.o [ 53%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/ConvCorr.cu.o [ 54%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/MatMul.cu.o [ 55%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/Copy.cu.o [ 56%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/ChannelizePoly.cu.o [ 56%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/Cov.cu.o [ 57%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/FFT.cu.o [ 58%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/ResamplePoly.cu.o [ 59%] Building CUDA object test/CMakeFiles/matx_test.dir/00_transform/Solve.cu.o [ 60%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Cholesky.cu.o [ 61%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/LU.cu.o [ 62%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/QR2.cu.o [ 62%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/QR.cu.o [ 63%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/SVD.cu.o [ 64%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Eigen.cu.o [ 65%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Det.cu.o [ 66%] Building CUDA object test/CMakeFiles/matx_test.dir/00_solver/Inverse.cu.o [ 66%] Building CUDA object test/CMakeFiles/matx_test.dir/00_operators/PythonEmbed.cu.o [ 67%] Building CUDA object test/CMakeFiles/matx_test.dir/00_io/FileIOTests.cu.o [ 68%] Building CUDA object test/CMakeFiles/matx_test.dir/01_radar/MultiChannelRadarPipeline.cu.o [ 69%] Building CUDA object 
test/CMakeFiles/matx_test.dir/01_radar/MVDRBeamformer.cu.o [ 70%] Building CUDA object test/CMakeFiles/matx_test.dir/01_radar/ambgfun.cu.o [ 71%] Building CUDA object test/CMakeFiles/matx_test.dir/01_radar/dct.cu.o Killed make[2]: *** [_deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/build.make:465: _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/json_printer.cu.o] Error 137 make[2]: *** Waiting for unfinished jobs.... Killed Killed make[2]: *** [examples/CMakeFiles/qr.dir/build.make:77: examples/CMakeFiles/qr.dir/qr.cu.o] Error 255 Killed make[1]: *** [CMakeFiles/Makefile2:694: examples/CMakeFiles/qr.dir/all] Error 2 make[2]: *** [examples/CMakeFiles/conv2d.dir/build.make:77: examples/CMakeFiles/conv2d.dir/conv2d.cu.o] Error 255 make[1]: *** Waiting for unfinished jobs.... make[1]: *** [CMakeFiles/Makefile2:434: examples/CMakeFiles/conv2d.dir/all] Error 2 make[2]: *** [examples/CMakeFiles/fft_conv.dir/build.make:77: examples/CMakeFiles/fft_conv.dir/fft_conv.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:486: examples/CMakeFiles/fft_conv.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/cgsolve.dir/build.make:77: examples/CMakeFiles/cgsolve.dir/cgsolve.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:460: examples/CMakeFiles/cgsolve.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/spectrogram_graph.dir/build.make:77: examples/CMakeFiles/spectrogram_graph.dir/spectrogram_graph.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:616: examples/CMakeFiles/spectrogram_graph.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/channelize_poly_bench.dir/build.make:77: examples/CMakeFiles/channelize_poly_bench.dir/channelize_poly_bench.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:382: examples/CMakeFiles/channelize_poly_bench.dir/all] Error 2 Killed Killed make[2]: *** [examples/CMakeFiles/simple_radar_pipeline.dir/build.make:77: examples/CMakeFiles/simple_radar_pipeline.dir/simple_radar_pipeline.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:330: examples/CMakeFiles/simple_radar_pipeline.dir/all] Error 2 make[2]: *** [examples/CMakeFiles/resample_poly_bench.dir/build.make:77: examples/CMakeFiles/resample_poly_bench.dir/resample_poly_bench.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:564: examples/CMakeFiles/resample_poly_bench.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/black_scholes.dir/build.make:77: examples/CMakeFiles/black_scholes.dir/black_scholes.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:720: examples/CMakeFiles/black_scholes.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/spherical_harmonics.dir/build.make:77: examples/CMakeFiles/spherical_harmonics.dir/spherical_harmonics.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:642: examples/CMakeFiles/spherical_harmonics.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/spectrogram.dir/build.make:77: examples/CMakeFiles/spectrogram.dir/spectrogram.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:590: examples/CMakeFiles/spectrogram.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/svd_power.dir/build.make:77: examples/CMakeFiles/svd_power.dir/svd_power.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:668: examples/CMakeFiles/svd_power.dir/all] Error 2 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:182: test/CMakeFiles/matx_test.dir/00_operators/OperatorTests.cu.o] Error 255 make[2]: *** Waiting for unfinished jobs.... 
Killed make[2]: *** [examples/CMakeFiles/recursive_filter.dir/build.make:77: examples/CMakeFiles/recursive_filter.dir/recursive_filter.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:356: examples/CMakeFiles/recursive_filter.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/resample.dir/build.make:77: examples/CMakeFiles/resample.dir/resample.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:512: examples/CMakeFiles/resample.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/convolution.dir/build.make:77: examples/CMakeFiles/convolution.dir/convolution.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:408: examples/CMakeFiles/convolution.dir/all] Error 2 Killed make[2]: *** [examples/CMakeFiles/mvdr_beamformer.dir/build.make:77: examples/CMakeFiles/mvdr_beamformer.dir/mvdr_beamformer.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:538: examples/CMakeFiles/mvdr_beamformer.dir/all] Error 2 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:407: test/CMakeFiles/matx_test.dir/00_solver/SVD.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:197: test/CMakeFiles/matx_test.dir/00_operators/GeneratorTests.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:272: test/CMakeFiles/matx_test.dir/00_transform/Copy.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:347: test/CMakeFiles/matx_test.dir/00_solver/Cholesky.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:542: test/CMakeFiles/matx_test.dir/01_radar/dct.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:227: test/CMakeFiles/matx_test.dir/00_transform/ConvCorr.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:497: test/CMakeFiles/matx_test.dir/01_radar/MultiChannelRadarPipeline.cu.o] Error 255 [ 71%] Linking CXX static library ../../../lib/libgmock_maind.a [ 71%] Built target gmock_main Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:167: test/CMakeFiles/matx_test.dir/00_tensor/EinsumTests.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:302: test/CMakeFiles/matx_test.dir/00_transform/FFT.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:152: test/CMakeFiles/matx_test.dir/00_tensor/TensorCreationTests.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:212: test/CMakeFiles/matx_test.dir/00_operators/ReductionTests.cu.o] Error 255 Killed make[1]: *** [CMakeFiles/Makefile2:827: _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/all] Error 2 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:467: test/CMakeFiles/matx_test.dir/00_operators/PythonEmbed.cu.o] Error 255 Killed Killed Killed Killed Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:437: test/CMakeFiles/matx_test.dir/00_solver/Det.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:392: test/CMakeFiles/matx_test.dir/00_solver/QR2.cu.o] Error 255 Killed Killed Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:362: test/CMakeFiles/matx_test.dir/00_solver/LU.cu.o] Error 255 Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:452: test/CMakeFiles/matx_test.dir/00_solver/Inverse.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:242: test/CMakeFiles/matx_test.dir/00_transform/MatMul.cu.o] Error 255 Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:317: 
test/CMakeFiles/matx_test.dir/00_transform/ResamplePoly.cu.o] Error 255 Killed Killed make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:92: test/CMakeFiles/matx_test.dir/00_tensor/BasicTensorTests.cu.o] Error 255 make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:512: test/CMakeFiles/matx_test.dir/01_radar/MVDRBeamformer.cu.o] Error 255 make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:287: test/CMakeFiles/matx_test.dir/00_transform/Cov.cu.o] Error 255 make[2]: *** [test/CMakeFiles/matx_test.dir/build.make:107: test/CMakeFiles/matx_test.dir/00_tensor/CUBTests.cu.o] Error 255 make[1]: *** [CMakeFiles/Makefile2:1063: test/CMakeFiles/matx_test.dir/all] Error 2 make: *** [Makefile:136: all] Error 2
Hi @yuanskinner, that doesn't look like an error, but rather your make parallelism is too high. Try make -j4 so the OOM killer doesn't stop it.
The build succeeded! Thanks so much! But I get an error when I run examples/resample.cu:
GPU Name: NVIDIA GeForce RTX 3080
GPU Global Memory: 9.999512 GB
CUDA Error: invalid device ordinal
matxException (matxCudaError: invalid device ordinal) - /home/MatX-master/MatX/examples/resample.cu:68
(sigViewComplex = fft(sigView)).run(stream);
Stack Trace:
./resample : ()+0xdd93
./resample : ()+0x81d8
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xf3
./resample : ()+0x5f2e
Hi @yuanskinner, do you have multiple GPUs in the system?
No, only one RTX 3080. I tried cudaGetDeviceCount and it returned 1, then I tried cudaSetDevice(0) and it returned cudaSuccess.
I also tried the other examples; please check the attached detailed log.
Only the examples below passed:
black_scholes
cgsolve
conv2d
qr
resample_poly_bench
spherical_harmonics
svd_power
detail log.md
@yuanskinner how much system memory do you have?
system memory 32.0 GB
I think the problem is by default we use managed memory. Managed memory does not work well under WSL2 yet:
Unified Memory - Full Managed Memory Support is not available on Windows native and therefore WSL 2 will not support it for the foreseeable future
So you're likely hitting a WSL2+CUDA bug. This is not necessarily a problem; if you would like this to work in WSL2 with your own application, you can just avoid using managed memory and declare all your tensors with host or device memory. The examples will not work as-is, however. Would that work for you?
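For illustration, here is a minimal sketch of that workaround, assuming make_tensor accepts an explicit memory-space (and optional stream) argument as in the snippet later in this thread; the exact enum names and signatures may differ by MatX version:

#include <matx.h>

int main() {
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Instead of the default managed allocation, e.g.
  //   auto a = matx::make_tensor<float, 1>({1024});
  // allocate explicitly in device memory:
  auto a = matx::make_tensor<float, 1>({1024}, matx::MATX_DEVICE_MEMORY, stream);
  auto b = matx::make_tensor<float, 1>({1024}, matx::MATX_DEVICE_MEMORY, stream);

  // Expressions run on the device as usual; nothing here touches the
  // tensors from host code, so no managed-memory page migration is needed.
  // (In real code, `a` would be filled by an earlier operator or copy.)
  (b = a * 2.0f).run(stream);

  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  return 0;
}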
I just tried the examples on a Geforce 3070, and they all work on Linux, so it appears this is related to the WSL2 issue above.
Thank you for your patient answer. I just built my application in WSL on my home computer; the actual production environment is on a physical machine. Actually, I plan to use MatX for our DDC program. Why doesn't the channelize_poly algorithm downsample the input signal into the channels? Will the decimation_factor parameter be supported in the near future?
I changed the demo as below, and get the same error at the CUDA_CHECK_LAST_ERROR() line:
auto input = matx::make_tensor<InType, 2>({num_batches, input_len}, MATX_DEVICE_MEMORY, stream);
auto filter = matx::make_tensor<InType, 1>({filter_len}, MATX_DEVICE_MEMORY, stream);
auto output = matx::make_tensor<OutType, 3>({num_batches, output_len_per_channel, num_channels}, MATX_DEVICE_MEMORY, stream);
const matx::index_t decimation_factor = num_channels;
for (int k = 0; k < NUM_WARMUP_ITERATIONS; k++) {
  (output = channelize_poly(input, filter, num_channels, decimation_factor)).run(stream);
}
cudaStreamSynchronize(stream);
float elapsed_ms = 0.0f;
cudaEventRecord(start, stream);
for (int k = 0; k < NUM_ITERATIONS; k++) {
  (output = channelize_poly(input, filter, num_channels, decimation_factor)).run(stream);
}
cudaEventRecord(stop, stream);
cudaStreamSynchronize(stream);
CUDA_CHECK_LAST_ERROR();
CUDA Error: invalid device ordinal
matxException (matxCudaError: invalid device ordinal) - /home/ysj/MatX-master/MatX/examples/channelize_poly_bench.cu:103
Stack Trace:
./channelize_poly_bench : ()+0x8e2f
./channelize_poly_bench : ()+0xd636
./channelize_poly_bench : ()+0x581f
/lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0xf3
./channelize_poly_bench : ()+0x4e6e
Hi @yuanskinner, I don't have a WSL2 system I can test on easily at the moment. Is it possible for you to try on a Linux machine? In the meantime I can try to get WSL2 working again.
I don't have a Linux machine at the moment. If possible, you could write test code for me to build and run.
Why doesn't the channelize_poly algorithm downsample the input signal into the channels? Will the decimation_factor parameter be supported in the near future?
Hi @yuanskinner - the current channelize_poly implementation only supports maximally decimated channelization (i.e., decimation_factor == num_channels). This downsamples the input signal into the channels by a factor of M for M channels such that each channel has a sample rate of fs/M (where fs is the sampling rate of the input signal). It does not yet support oversampling, which would result in sample rates higher than fs/M for each channel. Is oversampling a feature that would be needed for your use case? If so, can you share rough dimensions of interest (input signal length, decimation factor, channel count)?
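To make the sizing concrete, here is a rough worked example of the maximally decimated case; the numbers are illustrative and not taken from this thread:

#include <cstdio>

int main() {
  const double fs = 72.0e6;        // input sample rate in Hz (example value)
  const int    num_channels = 8;   // M channels, with decimation_factor == M
  const long   input_len = 256000; // input samples (example value)

  // Maximally decimated: each channel runs at fs / M ...
  const double fs_per_channel = fs / num_channels;
  // ... and produces roughly input_len / M samples (ignoring filter edge effects).
  const long   output_len_per_channel = input_len / num_channels;

  std::printf("per-channel rate: %.2f MHz, per-channel length: %ld\n",
              fs_per_channel / 1.0e6, output_len_per_channel);
  return 0;
}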
Hi @yuanskinner, I don't have a WSL2 system I can test on easily at the moment. Is it possible for you to try on a Linux machine? In the meantime I can try to get WSL2 working again.

I changed the order of the cases. The test case { 42, 17, 256000 } with the first 4 channels runs successfully.
@tbensonatl Our typical scenario is: input signal bandwidth of 72 MHz, channel count of 1000, with different channel bandwidths (10 kHz~20 MHz). Channels may not necessarily be equally spaced. This is a DDC service that can meet the diverse DDC needs of users.
@yuanskinner @tbensonatl put in a fix for WSL2. Can you please try the latest commit?
It worked!!! Thank you so so so so much!
In reduce.h at line 1647,
if(++axis_ptr == dims.size()) {
should be
if(++axis_ptr == (int)dims.size()) {
otherwise you get:
error: comparison of integer expressions of different signedness: ‘int’ and ‘std::array<int, 1>::size_type’ {aka ‘long unsigned int’} [-Werror=sign-compare]
Which host compiler are you using?
I just followed the README.md doc to compile the code step by step. The C and C++ compiler version is gcc 9.4.0.
@yuanskinner can you share your cmake command? I can't reproduce those warnings on g++ 9.4.
@yuanskinner another option is you can paste your entire error output and we can fix it in a branch and have you test.
Jumping on since I am also interested in running MatX on WSL2. I built successfully by staying on the v0.5.0 tag and cherry-picking the commit @cliffburdick mentioned here.
That being said, the fft_conv example still gives the invalid device ordinal exception. Before the cherry-pick, all the examples using fft were failing for me, so that is a good improvement in my view. I can poke at it, as I imagine this is somehow still related to the WSL+CUDA bug mentioned earlier.
Ah, it gets back to managed memory, since fft_conv relies on managed memory for the () operator. Outlined here is why I get a seg fault.
Hi @mfzmullen. To be clear, some of the examples use operator() to set up data before the example runs. This is purely to show how to do it, but as I mentioned above you can avoid managed memory altogether and everything should work under WSL2. There shouldn't be anywhere in the internals of MatX where we require managed memory.
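A minimal sketch of that kind of setup, assuming a host memory space and a matx::copy helper (both assumptions on my part; exact enum names and signatures may vary by MatX version): fill a host tensor with operator(), then copy it into a device tensor before running operators.

#include <matx.h>

int main() {
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Host tensor used only for setup; calling operator() from host code is fine here.
  auto h = matx::make_tensor<float, 1>({1024}, matx::MATX_HOST_MEMORY);
  // Device tensor that the operators actually run against.
  auto d = matx::make_tensor<float, 1>({1024}, matx::MATX_DEVICE_MEMORY, stream);

  for (matx::index_t i = 0; i < h.Size(0); i++) {
    h(i) = static_cast<float>(i);
  }

  // Explicit host-to-device copy replaces the implicit page migration
  // that managed memory would otherwise provide.
  matx::copy(d, h, stream);

  (d = d * 2.0f).run(stream);
  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  return 0;
}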