
Can't use CUDA build with CMake on Windows

Open matthewphilipdixon opened this issue 4 years ago • 6 comments

I've built LLVM with CUDA support on Windows and used the resulting clang++ to build the sample program as described in the Getting Started guide (https://intel.github.io/llvm-docs/GetStartedGuide.html), and everything works as expected. But when I try to build that same sample program using CMake, it fails at the CXX compiler check step, during linking, with a "program is not executable" error. I believe it's looking for lld-link.exe, which doesn't exist in this LLVM build. Looking at the verbose output of the clang++ command from the example, I noticed that it uses the MS link.exe.

My CMakeLists.txt:

cmake_minimum_required(VERSION 3.0.0)
set(CMAKE_CXX_COMPILER "clang++")
set(CMAKE_CXX_STANDARD 17)
project(SYCL_TEST LANGUAGES CXX)
add_compile_options(-fsycl -fsycl-targets=nvptx64-nvidia-cuda -v)
add_executable(SYCL_TEST simple-sycl-app.cpp)
target_link_libraries(SYCL_TEST -fsycl -fsycl-targets=nvptx64-nvidia-cuda -lsycl -v)
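The same project can also be written with the SYCL flags attached to the target instead of being set globally. This is a minimal sketch, not a verified fix for the linker error, since the compiler check still runs at `project()`. It assumes CMake 3.13 or newer (for `target_link_options`) and that the compiler is selected on the command line with `-DCMAKE_CXX_COMPILER=clang++` rather than with `set()` inside the file. The variable name `SYCL_FLAGS` is made up for illustration, and the `-v` flag is left out.

```cmake
cmake_minimum_required(VERSION 3.13)
project(SYCL_TEST LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Hypothetical helper variable: the SYCL flags from the original file,
# which are needed at both compile time and link time.
set(SYCL_FLAGS -fsycl -fsycl-targets=nvptx64-nvidia-cuda)

add_executable(SYCL_TEST simple-sycl-app.cpp)

# Scope the flags to this target only.
target_compile_options(SYCL_TEST PRIVATE ${SYCL_FLAGS})
target_link_options(SYCL_TEST PRIVATE ${SYCL_FLAGS})

# Kept from the original file: link against the SYCL runtime library.
target_link_libraries(SYCL_TEST PRIVATE -lsycl)
```

Keeping the flags on the target means other targets added to the same project later are not compiled as SYCL code by accident.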

I've tried adding the lld project to the LLVM build by passing "--llvm-external-projects lld" to configure.py. The output indicates that the project was added, but I don't have any lld.exe or lld-link.exe anywhere after running compile.py.

I've tried to merge the CUDA LLVM build into the oneAPI folder at "C:\Program Files (x86)\Intel\oneAPI\compiler\2022.0.3", but then I get an error saying that clang doesn't understand the parameter "--dpcpp".

I can get the sample to build using CMake if I manually change lines 76 and 79 of "C:\Program Files\CMake\share\cmake-3.23\Modules\Platform\Windows-Clang.cmake" from "-fuse-ld=lld-link" to "-fuse-ld=link", but the resulting executable just crashes.
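Editing the installed `Windows-Clang.cmake` affects every project built with that CMake installation. Newer CMake releases (3.29 and later, so not the 3.23 install mentioned above) document a `CMAKE_LINKER_TYPE` variable for choosing the linker from the project side. The sketch below assumes an upgraded CMake and has not been tried with this DPC++ build. Whether it also affects the compiler check at configure time would need to be verified.

```cmake
cmake_minimum_required(VERSION 3.29)

# Ask CMake to drive the Microsoft linker instead of lld-link.
# Set before project() so it is defined as early as possible; it can
# also be passed as -DCMAKE_LINKER_TYPE=MSVC on the command line.
set(CMAKE_LINKER_TYPE MSVC)

project(SYCL_TEST LANGUAGES CXX)
add_executable(SYCL_TEST simple-sycl-app.cpp)
```

The variable initializes the `LINKER_TYPE` property of each target created afterwards, so it can also be set per target if only some executables need the Microsoft linker.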

I can provide any environment vars needed, or output from various build attempts.

Any help would be appreciated!

matthewphilipdixon avatar Apr 19 '22 15:04 matthewphilipdixon

@matthewphilipdixon , Run sycl-ls to make sure the CUDA device is present and configured properly.

If so, then are you able to build your app directly? I'm not sure to which sample you refer, but something like: clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -o philsapp.bin the-best-sample.cpp
The output there should help.

cperkinsintel avatar Apr 19 '22 18:04 cperkinsintel

@cperkinsintel ,

output of sycl-ls:

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2021.13.11.0.23_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) i7-6950X CPU @ 3.00GHz 3.0 [2021.13.11.0.23_160000]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce GTX 1080 Ti 0.0 [CUDA 11.6]
[ext_oneapi_cuda:gpu:1] NVIDIA CUDA BACKEND, NVIDIA GeForce GTX 1080 Ti 0.0 [CUDA 11.6]
[host:host:0] SYCL host platform, SYCL host device 1.2 [1.2]

The sample I'm referring to is from https://intel.github.io/llvm-docs/GetStartedGuide.html, in the section "Run simple DPC++ application". I'm using the CUDA build command: clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda simple-sycl-app.cpp -o simple-sycl-app-cuda.exe

That build actually works, and generates a working .exe that I can confirm is using the Nvidia hardware.

The problems only seem to occur with CMake.

matthewphilipdixon avatar Apr 20 '22 14:04 matthewphilipdixon

I should also add that with either the clang++ command you supplied or the one I mentioned above, I get the same warnings:

C:\Users\matth\Desktop\SYCL_TEST>clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -o simple-sycl-app.bin simple-sycl-app.cpp
clang++: warning: CUDA version is newer than the latest supported version 11.5 [-Wunknown-cuda-version]
warning: linking module
      'C:\Users\matth\sycl_workspace\llvm\build\lib\clang\15.0.0\../../clc\remangled-l32-signed_char.libspirv-nvptx64--nvidiacl.bc':
      Linking two modules of different target triples:
      'C:\Users\matth\sycl_workspace\llvm\build\lib\clang\15.0.0\../../clc\remangled-l32-signed_char.libspirv-nvptx64--nvidiacl.bc'
      is 'nvptx64-unknown-nvidiacl' whereas 'simple-sycl-app.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
1 warning generated.

matthewphilipdixon avatar Apr 20 '22 17:04 matthewphilipdixon

I've got it working, but I don't think it's quite right. It takes a very long time to build. And building in debug creates a program that either crashes or scrambles the data. Below are all the related files.

My CMakeLists.txt:

cmake_minimum_required(VERSION 3.21.0)
set(CMAKE_CXX_COMPILER "clang++")
set(CMAKE_RC_COMPILER "rc.exe")
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_EXE_LINKER_FLAGS_INIT "-fuse-ld=link.exe")
project(simple-sycl-app-cuda LANGUAGES CXX)
add_compile_options(-fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64_x86_64 -v)
add_executable(simple-sycl-app-cuda simple-sycl-app.cpp)
target_link_libraries(simple-sycl-app-cuda -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64_x86_64 -lsycl -v)
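CMake's documentation describes `CMAKE_EXE_LINKER_FLAGS_INIT` as a variable meant to be set by a toolchain file, and the compiler is more commonly selected outside `CMakeLists.txt`. Below is a minimal sketch of moving those settings into a toolchain file. The file name `sycl-cuda-toolchain.cmake` is hypothetical. The result should match the listing above only if the clang driver honours the last `-fuse-ld=` flag on the link line, which is an assumption here.

```cmake
# sycl-cuda-toolchain.cmake (hypothetical file name)
# Settings copied from the CMakeLists.txt above.
set(CMAKE_CXX_COMPILER "clang++")
set(CMAKE_RC_COMPILER "rc.exe")

# Initial linker flags for executables; CMake adds these to the
# link command line it generates.
set(CMAKE_EXE_LINKER_FLAGS_INIT "-fuse-ld=link.exe")
```

The file would be passed at configure time with `-DCMAKE_TOOLCHAIN_FILE=<path to the file>`, and the three corresponding `set()` lines would be removed from `CMakeLists.txt`.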

My batch script: (I'm setting all the PATH and LIB because I'm running it from a regular cmd, not VS cmd)

del .ninja_deps
del .ninja_log
del build.ninja
del cmake_install.cmake
del CMakeCache.txt
rmdir /s /q CMakeFiles
del simple-sycl-app-cuda.exe
del simple-sycl-app-cuda.ilk
del simple-sycl-app-cuda.pdb
set DPCPP_HOME=D:\SYCLCompiler
set PATH=%DPCPP_HOME%\llvm\build\bin;
set PATH=C:\Program Files (x86)\Common Files\Intel\Shared Libraries\intel64;%PATH%
set PATH=C:\Program Files\CMake\bin;%PATH%
set PATH=C:\Program Files\Ninja;%PATH%
set PATH=C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64;%PATH%
set PATH=C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64;%PATH%
set LIB=%DPCPP_HOME%\llvm\build\lib;
set LIB=C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64;%LIB%
set LIB=C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64;%LIB%
set LIB=C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x64;%LIB%
cmake E:\SYCL_TEST\ -GNinja -DCMAKE_BUILD_TYPE=Release
cmake --build .
simple-sycl-app-cuda.exe
pause
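The cleanup lines at the top of the batch script can also be expressed as a CMake script and run with `cmake -P`. This is only a sketch: the file name `clean.cmake` is hypothetical, and it assumes the script sits in the same directory as the generated files, since the build above is done in-source.

```cmake
# clean.cmake (hypothetical file name); run with: cmake -P clean.cmake
# Removes the generated files listed in the batch script. file(REMOVE)
# does not report an error when a file is already missing.
set(build_dir "${CMAKE_CURRENT_LIST_DIR}")

file(REMOVE
  "${build_dir}/.ninja_deps"
  "${build_dir}/.ninja_log"
  "${build_dir}/build.ninja"
  "${build_dir}/cmake_install.cmake"
  "${build_dir}/CMakeCache.txt"
  "${build_dir}/simple-sycl-app-cuda.exe"
  "${build_dir}/simple-sycl-app-cuda.ilk"
  "${build_dir}/simple-sycl-app-cuda.pdb"
)

# Remove the CMakeFiles directory and everything inside it.
file(REMOVE_RECURSE "${build_dir}/CMakeFiles")
```

Configuring into a separate build directory would avoid most of this cleanup, because the whole directory can then be deleted in one step.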

The .cpp file:

#include <iostream>
#include <array>
#include <algorithm>
#include <vector> // std::vector is used in main()
#include <CL/sycl.hpp>

namespace sycl = cl::sycl;

const size_t SIZE = 2000000000; 

//to handle errors in kernels
auto handle_async_error = [](sycl::exception_list elist)
{
	for (auto &e : elist)
	{
		try
		{
			std::cout << "async exception" << std::endl;
			std::rethrow_exception(e);
		}
		catch ( sycl::exception& e )
		{
			std::cout << "ASYNC EXCEPTION!!\n";
			std::cout << e.what() << "\n";
		}
	}
};

template<typename Acc>
class IncrementerKernel
{
	public:
	IncrementerKernel(Acc accessor) : accessor(accessor) {}

	void operator() (sycl::item<1> item) const
	{
		size_t index = item.get_linear_id();
		accessor[index] += 1;
	}

	private:
	Acc accessor;
};

int main(int, char**)
{
	std::cout << "started" << std::endl;
	
	// create source data
	std::vector<int> vals(SIZE);
	std::cout << "data created" << std::endl;
	
	// init data
	for(int x = 0; x < SIZE; x++)
	{
		vals[x] = x;
	}
	std::cout << "vals initialized" << std::endl;

	// get devices
	sycl::device device1 = sycl::device(sycl::gpu_selector());
	sycl::device device2 = sycl::device(sycl::cpu_selector());
	std::cout << "devices selected" << std::endl;
	std::cout << "GPU Device: " << device1.get_info<sycl::info::device::name>() << std::endl;
	std::cout << "CPU Device: " << device2.get_info<sycl::info::device::name>() << std::endl;
	
	// set up queues
	sycl::queue queue1(device1, handle_async_error);
	sycl::queue queue2(device2, handle_async_error);
	std::cout << "queues created" << std::endl;
	
	
	// run kernel on gpu
	{
		sycl::buffer<int, 1> buf(vals.data(), sycl::range<1>(SIZE));
	  
		queue1.submit([&] (sycl::handler& cgh)
		{
			auto acc = buf.get_access<sycl::access::mode::read_write>(cgh);
			
			IncrementerKernel<decltype(acc)> incrementerKernel(acc);

			cgh.parallel_for(sycl::range<1>(SIZE), incrementerKernel);
		});
	}
	std::cout << "gpu kernel execution complete" << std::endl;

	queue1.wait();

	//std::for_each(vals.begin(), vals.end(), [] (int i) { std::cout << i << " "; } );
    //std::cout << std::endl;
	
	// validate last value was incremented
	std::cout << ((vals[SIZE-1] == SIZE) ? "gpu success" : "gpu failure") << std::endl;
	
	
	// run kernel on cpu
	{
		sycl::buffer<int, 1> buf(vals.data(), sycl::range<1>(SIZE));
	  
		queue2.submit([&] (sycl::handler& cgh)
		{
			auto acc = buf.get_access<sycl::access::mode::read_write>(cgh);
			
			IncrementerKernel<decltype(acc)> incrementerKernel(acc);

			cgh.parallel_for(sycl::range<1>(SIZE), incrementerKernel);
		});
	}
	std::cout << "cpu kernel execution complete" << std::endl;

	queue2.wait();

	//std::for_each(vals.begin(), vals.end(), [] (int i) { std::cout << i << " "; } );
    //std::cout << std::endl;
	
	// validate last value was incremented
	std::cout << ((vals[SIZE-1] == SIZE+1) ? "cpu success" : "cpu failure") << std::endl;

    return 0;
}

The verbose output:

E:\SYCL_TEST>del .ninja_deps

E:\SYCL_TEST>del .ninja_log

E:\SYCL_TEST>del build.ninja

E:\SYCL_TEST>del cmake_install.cmake

E:\SYCL_TEST>del CMakeCache.txt

E:\SYCL_TEST>rmdir /s /q CMakeFiles

E:\SYCL_TEST>del simple-sycl-app-cuda.exe

E:\SYCL_TEST>del simple-sycl-app-cuda.ilk
Could Not Find E:\SYCL_TEST\simple-sycl-app-cuda.ilk

E:\SYCL_TEST>del simple-sycl-app-cuda.pdb
Could Not Find E:\SYCL_TEST\simple-sycl-app-cuda.pdb

E:\SYCL_TEST>set DPCPP_HOME=D:\SYCLCompiler

E:\SYCL_TEST>set PATH=D:\SYCLCompiler\llvm\build\bin;

E:\SYCL_TEST>set PATH=C:\Program Files (x86)\Common Files\Intel\Shared Libraries\intel64;D:\SYCLCompiler\llvm\build\bin;

E:\SYCL_TEST>set PATH=C:\Program Files\CMake\bin;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\intel64;D:\SYCLCompiler\llvm\build\bin;

E:\SYCL_TEST>set PATH=C:\Program Files\Ninja;C:\Program Files\CMake\bin;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\intel64;D:\SYCLCompiler\llvm\build\bin;

E:\SYCL_TEST>set PATH=C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64;C:\Program Files\Ninja;C:\Program Files\CMake\bin;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\intel64;D:\SYCLCompiler\llvm\build\bin;

E:\SYCL_TEST>set PATH=C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64;C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64;C:\Program Files\Ninja;C:\Program Files\CMake\bin;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\intel64;D:\SYCLCompiler\llvm\build\bin;

E:\SYCL_TEST>set LIB=D:\SYCLCompiler\llvm\build\lib;

E:\SYCL_TEST>set LIB=C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64;D:\SYCLCompiler\llvm\build\lib;

E:\SYCL_TEST>set LIB=C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64;D:\SYCLCompiler\llvm\build\lib;

E:\SYCL_TEST>set LIB=C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64;D:\SYCLCompiler\llvm\build\lib;

E:\SYCL_TEST>cmake E:\SYCL_TEST\ -GNinja -DCMAKE_BUILD_TYPE=Release
-- The CXX compiler identification is Clang 15.0.0 with GNU-like command-line
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: D:/SYCLCompiler/llvm/build/bin/clang++.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: E:/SYCL_TEST

E:\SYCL_TEST>cmake --build .
[1/2] Building CXX object CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj
clang version 15.0.0 (https://github.com/intel/llvm 1afa98f26cc8b5d0562f2d4f515764a25dc5574a)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: D:\SYCLCompiler\llvm\build\bin
Found CUDA installation: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6, version
clang++: warning: CUDA version is newer than the latest supported version 11.5 [-Wunknown-cuda-version]
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang++.exe" -cc1 -triple x86_64-pc-windows-msvc19.29.30145 -sycl-std=2020 -fsycl-unique-prefix=3f3429d3a9d1c782 -include "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-header-1b7170.h" -dependency-filter "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-header-1b7170.h" -fsycl-enable-int-header-diags -fsycl-is-host -D_MT -D_DLL -Eonly -disable-free -clear-ast-before-backend -main-file-name simple-sycl-app.cpp -mrelocation-model pic -pic-level 2 -mframe-pointer=none -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -v "-fcoverage-compilation-dir=E:\\SYCL_TEST" -resource-dir "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0" -dependency-file "CMakeFiles\\simple-sycl-app-cuda.dir\\simple-sycl-app.cpp.obj.d" -MT CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj -sys-header-deps -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -D NDEBUG -D _DLL -D _MT -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem 
"D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -internal-isystem "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/include" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -O3 -std=gnu++17 -fdeprecated-macro "-fdebug-compilation-dir=E:\\SYCL_TEST" -ferror-limit 19 -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.29.30145 -fdelayed-template-parsing -fcxx-exceptions -fexceptions -vectorize-loops 
-vectorize-slp --dependent-lib=msvcrt -faddrsig -x c++ E:/SYCL_TEST/simple-sycl-app.cpp
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-pc-windows-msvc
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\bin\..\include\sycl"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\bin\..\include"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt"
#include "..." search starts here:
#include <...> search starts here:
 D:\SYCLCompiler\llvm\build\bin\..\include\sycl
 D:\SYCLCompiler\llvm\build\bin\..\include
 D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt
 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6/include
End of search list.
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang++.exe" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-pc-windows-msvc -fsycl-is-device -fdeclare-spirv-builtins -fenable-sycl-dae -fms-extensions -fms-compatibility -fdelayed-template-parsing -fms-compatibility-version=19.29.30145 -Wno-sycl-strict "-fsycl-int-header=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-header-1b7170.h" "-fsycl-int-footer=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-footer-49dbc5.h" -sycl-std=2020 -fsycl-unique-prefix=3f3429d3a9d1c782 -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name simple-sycl-app.cpp -mrelocation-model static -mframe-pointer=all -ffp-contract=on -fno-rounding-math -fno-verbose-asm -no-integrated-as -aux-target-cpu x86-64 -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -mlink-builtin-bitcode "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\../../clc\\remangled-l32-signed_char.libspirv-nvptx64--nvidiacl.bc" -mlink-builtin-bitcode "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/nvvm/libdevice/libdevice.10.bc" -target-feature +ptx75 -target-sdk-version=11.5 -target-cpu sm_50 -target-feature +ptx42 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fno-dwarf-directory-asm -v -v -resource-dir "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0" -dependency-file "CMakeFiles\\simple-sycl-app-cuda.dir\\simple-sycl-app.cpp.obj.d" -MT CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj -MT CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj -sys-header-deps -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -D NDEBUG -D _DLL -D _MT -D NDEBUG -D _DLL -D _MT -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual 
Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -internal-isystem "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/include" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -O3 -std=gnu++17 -fdeprecated-macro "-fdebug-compilation-dir=E:\\SYCL_TEST" -ferror-limit 19 -fms-extensions -fms-compatibility -fms-compatibility-version=19.29.30145 -fdelayed-template-parsing -fcxx-exceptions -fexceptions -vectorize-loops -vectorize-slp --dependent-lib=msvcrt --dependent-lib=msvcrt -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-450c78\\simple-sycl-app-sm_50.bc" -x c++ 
E:/SYCL_TEST/simple-sycl-app.cpp
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-pc-windows-msvc
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\bin\..\include\sycl"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\bin\..\include"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt"
#include "..." search starts here:
#include <...> search starts here:
 D:\SYCLCompiler\llvm\build\bin\..\include\sycl
 D:\SYCLCompiler\llvm\build\bin\..\include
 D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt
 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6/include
End of search list.
warning: linking module 'D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\../../clc\remangled-l32-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: 'D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\../../clc\remangled-l32-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'E:/SYCL_TEST/simple-sycl-app.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
1 warning generated.
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang++.exe" -cc1 -triple spir64_x86_64-unknown-unknown -aux-triple x86_64-pc-windows-msvc -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fms-extensions -fms-compatibility -fdelayed-template-parsing -fms-compatibility-version=19.29.30145 -fsycl-instrument-device-code -Wno-sycl-strict "-fsycl-int-header=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-header-1b7170.h" "-fsycl-int-footer=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-footer-49dbc5.h" -sycl-std=2020 -fsycl-unique-prefix=3f3429d3a9d1c782 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name simple-sycl-app.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -resource-dir "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0" -dependency-file "CMakeFiles\\simple-sycl-app-cuda.dir\\simple-sycl-app.cpp.obj.d" -MT CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj -sys-header-deps -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -D NDEBUG -D _DLL -D _MT -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files 
(x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -O3 -std=gnu++17 -fdeprecated-macro "-fdebug-compilation-dir=E:\\SYCL_TEST" -ferror-limit 19 -fms-extensions -fms-compatibility -fno-threadsafe-statics -fdelayed-template-parsing -fcxx-exceptions -fexceptions -vectorize-loops -vectorize-slp --dependent-lib=msvcrt -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-f9ca46.bc" -x c++ E:/SYCL_TEST/simple-sycl-app.cpp
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-pc-windows-msvc
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/usr/include"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
#include "..." search starts here:
#include <...> search starts here:
 D:\SYCLCompiler\llvm\build\bin\..\include\sycl
 D:\SYCLCompiler\llvm\build\bin\..\include
 D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt
End of search list.
 "D:\\SYCLCompiler\\llvm\\build\\bin\\append-file" E:/SYCL_TEST/simple-sycl-app.cpp "--append=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-footer-49dbc5.h" --orig-filename=E:/SYCL_TEST/simple-sycl-app.cpp "--output=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-bbd1bd.cpp" --use-include
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang++.exe" -cc1 -triple x86_64-pc-windows-msvc19.29.30145 -sycl-std=2020 -fsycl-unique-prefix=3f3429d3a9d1c782 -include "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-header-1b7170.h" -dependency-filter "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-header-1b7170.h" -fsycl-enable-int-header-diags -fsycl-is-host -D_MT -D_DLL -emit-obj -mincremental-linker-compatible --mrelax-relocations -disable-free -clear-ast-before-backend -main-file-name simple-sycl-app.cpp -mrelocation-model pic -pic-level 2 -mframe-pointer=none -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -v "-fcoverage-compilation-dir=E:\\SYCL_TEST" -resource-dir "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -iquote E:/SYCL_TEST -D NDEBUG -D _DLL -D _MT -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" 
-internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -internal-isystem "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/include" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\include" -internal-isystem "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\atlmfc\\include" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\ucrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\shared" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\um" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\winrt" -internal-isystem "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.19041.0\\cppwinrt" -O3 -std=gnu++17 -fdeprecated-macro "-fdebug-compilation-dir=E:\\SYCL_TEST" -ferror-limit 19 -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.29.30145 -fdelayed-template-parsing -fcxx-exceptions -fexceptions -vectorize-loops -vectorize-slp --dependent-lib=msvcrt -faddrsig -o 
"C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-78ff7b.o" -x c++ "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-bbd1bd.cpp"
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-pc-windows-msvc
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\bin\..\include\sycl"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\bin\..\include"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt"
ignoring duplicate directory "D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include"
ignoring duplicate directory "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt"
ignoring duplicate directory "C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt"
#include "..." search starts here:
 E:/SYCL_TEST
#include <...> search starts here:
 D:\SYCLCompiler\llvm\build\bin\..\include\sycl
 D:\SYCLCompiler\llvm\build\bin\..\include
 D:\SYCLCompiler\llvm\build\lib\clang\15.0.0\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include
 C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\include
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\shared
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\winrt
 C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\cppwinrt
 C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6/include
End of search list.
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown,host-x86_64-pc-windows-msvc -output=CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj "-input=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-450c78\\simple-sycl-app-sm_50.bc" "-input=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-f9ca46.bc" "-input=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-78ff7b.o"
[2/2] Linking CXX executable simple-sycl-app-cuda.exe
clang version 15.0.0 (https://github.com/intel/llvm 1afa98f26cc8b5d0562f2d4f515764a25dc5574a)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: D:\SYCLCompiler\llvm\build\bin
Found CUDA installation: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6, version
clang-offload-bundler -type=o -targets=sycl-spir64-unknown-unknown -input=CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj -check-section
clang++: warning: CUDA version is newer than the latest supported version 11.5 [-Wunknown-cuda-version]
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=host-x86_64-pc-windows-msvc,sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown -input=CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cpp.obj "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-efa92a.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-221259\\simple-sycl-app-sm_50.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-73f701.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-link" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-221259\\simple-sycl-app-sm_50.o" -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-1bdd26\\simple-sycl-app-sm_50.bc" --suppress-warnings
 "D:\\SYCLCompiler\\llvm\\build\\bin\\sycl-post-link" -split=auto -emit-param-info -emit-program-metadata -symbols -emit-exported-symbols -lower-esimd -O3 -spec-const=default -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-ef230c\\simple-sycl-app-sm_50.bc" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-1bdd26\\simple-sycl-app-sm_50.bc"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\file-table-tform" -extract=Code -drop_titles -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-d6c512\\simple-sycl-app-sm_50.bc" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-ef230c\\simple-sycl-app-sm_50.bc"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-foreach" --out-ext=s "--in-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-d6c512\\simple-sycl-app-sm_50.bc" "--in-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-d6c512\\simple-sycl-app-sm_50.bc" "--out-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" "--out-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" -- "D:\\SYCLCompiler\\llvm\\build\\bin\\clang++.exe" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-pc-windows-msvc -fsycl-is-device -fdeclare-spirv-builtins -fenable-sycl-dae -fms-extensions -fms-compatibility -fdelayed-template-parsing -fms-compatibility-version=19.29.30145 -Wno-sycl-strict -sycl-std=2020 -S -disable-free -clear-ast-before-backend -main-file-name simple-sycl-app.cpp.obj -mrelocation-model static -mframe-pointer=all -ffp-contract=on -fno-rounding-math -fno-verbose-asm -no-integrated-as -aux-target-cpu x86-64 -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include\\sycl" -internal-isystem "D:\\SYCLCompiler\\llvm\\build\\bin\\..\\include" -mlink-builtin-bitcode "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0\\../../clc\\remangled-l32-signed_char.libspirv-nvptx64--nvidiacl.bc" -mlink-builtin-bitcode "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/nvvm/libdevice/libdevice.10.bc" -target-feature +ptx75 -target-sdk-version=11.5 -target-cpu sm_50 -target-feature +ptx42 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fno-dwarf-directory-asm -v -v -resource-dir "D:\\SYCLCompiler\\llvm\\build\\lib\\clang\\15.0.0" -O3 "-fdebug-compilation-dir=E:\\SYCL_TEST" -ferror-limit 19 -fms-extensions -fms-compatibility -fms-compatibility-version=19.29.30145 -fdelayed-template-parsing -vectorize-loops -vectorize-slp --dependent-lib=msvcrt --dependent-lib=msvcrt -o 
"C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" -x ir "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-d6c512\\simple-sycl-app-sm_50.bc"
clang -cc1 version 15.0.0 based upon LLVM 15.0.0git default target x86_64-pc-windows-msvc
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-foreach" --out-ext=o "--in-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" "--in-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" "--out-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-b6c868\\simple-sycl-app-sm_50.o" "--out-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-b6c868\\simple-sycl-app-sm_50.o" -- "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/bin\\ptxas" -m64 -O3 -v --gpu-name sm_50 --output-file "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-b6c868\\simple-sycl-app-sm_50.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s"
ptxas info    : 0 bytes gmem
ptxas info    : Compiling entry function '_ZTS17IncrementerKernelIN2cl4sycl8accessorIiLi1ELNS1_6access4modeE1026ELNS3_6targetE2014ELNS3_11placeholderE0ENS1_3ext6oneapi22accessor_property_listIJEEEEEE' for 'sm_50'
ptxas info    : Function properties for _ZTS17IncrementerKernelIN2cl4sycl8accessorIiLi1ELNS1_6access4modeE1026ELNS3_6targetE2014ELNS3_11placeholderE0ENS1_3ext6oneapi22accessor_property_listIJEEEEEE
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 12 registers, 336 bytes cmem[0]
ptxas info    : Compiling entry function '_ZTSN2cl4sycl6detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1E17IncrementerKernelINS0_8accessorIiLi1ELNS0_6access4modeE1026ELNS7_6targetE2014ELNS7_11placeholderE0ENS0_3ext6oneapi22accessor_property_listIJEEEEEEEE_with_offset' for 'sm_50'
ptxas info    : Function properties for _ZTSN2cl4sycl6detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1E17IncrementerKernelINS0_8accessorIiLi1ELNS0_6access4modeE1026ELNS7_6targetE2014ELNS7_11placeholderE0ENS0_3ext6oneapi22accessor_property_listIJEEEEEEEE_with_offset
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 8 registers, 356 bytes cmem[0]
ptxas info    : Compiling entry function '_ZTSN2cl4sycl6detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1E17IncrementerKernelINS0_8accessorIiLi1ELNS0_6access4modeE1026ELNS7_6targetE2014ELNS7_11placeholderE0ENS0_3ext6oneapi22accessor_property_listIJEEEEEEEE' for 'sm_50'
ptxas info    : Function properties for _ZTSN2cl4sycl6detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1E17IncrementerKernelINS0_8accessorIiLi1ELNS0_6access4modeE1026ELNS7_6targetE2014ELNS7_11placeholderE0ENS0_3ext6oneapi22accessor_property_listIJEEEEEEEE
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 8 registers, 344 bytes cmem[0]
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-foreach" --out-ext=fatbin "--in-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" "--in-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" "--in-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-b6c868\\simple-sycl-app-sm_50.o" "--in-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-b6c868\\simple-sycl-app-sm_50.o" "--out-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-358c4d\\simple-sycl-app-sm_50.fatbin" "--out-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-358c4d\\simple-sycl-app-sm_50.fatbin" -- "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6/bin\\fatbinary" -64 --create "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-358c4d\\simple-sycl-app-sm_50.fatbin" "--image=profile=compute_50,file=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-4c54ec\\simple-sycl-app-sm_50.s" "--image=profile=sm_50,file=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-b6c868\\simple-sycl-app-sm_50.o"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\file-table-tform" -replace=Code,Code -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-8ca2bb\\simple-sycl-app-sm_50.table" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-ef230c\\simple-sycl-app-sm_50.bc" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-358c4d\\simple-sycl-app-sm_50.fatbin"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-wrapper" "-o=C:\\Users\\matth\\AppData\\Local\\Temp\\wrapper-f3e518.bc" -host=x86_64-pc-windows-msvc -target=nvptx64 -kind=sycl -batch "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-8ca2bb\\simple-sycl-app-sm_50.table"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llc" -filetype=obj -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-cuda-11d9f4\\simple-sycl-app-cuda-sm_50.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\wrapper-f3e518.bc"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\spirv-to-ir-wrapper" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-73f701.o" -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-020d39.bc"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-link" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-020d39.bc" -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-814bef.bc" --suppress-warnings
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-crt.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-crt-a3b59b.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-crt-931917.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-complex.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-complex-80c39c.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-complex-36a4a6.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-complex-fp64.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-complex-fp64-f84bbc.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-complex-fp64-56e555.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-cmath.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-cmath-fd5e8c.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-cmath-9e39e9.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-cmath-fp64.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-cmath-fp64-c86a8f.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-cmath-fp64-e43ea3.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-msvc-math.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-msvc-math-fd0092.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-msvc-math-a786a0.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-imf.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-imf-1c27c3.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-imf-2c9580.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-imf-fp64.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-imf-fp64-afd23c.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-imf-fp64-be1a72.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-cassert.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cassert-09d60a.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cassert-01782e.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-cstring.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cstring-1db3b6.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cstring-18bfc5.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-complex.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-complex-5ff869.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-complex-0cae83.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-complex-fp64.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-complex-fp64-9a24f4.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-complex-fp64-798ae1.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-cmath.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cmath-8fa3ce.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cmath-70625c.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-cmath-fp64.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cmath-fp64-59796c.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cmath-fp64-276b20.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-imf.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-imf-187ff5.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-imf-d1742e.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-fallback-imf-fp64.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-imf-fp64-f9c774.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-imf-fp64-5a171f.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-itt-user-wrappers.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-user-wrappers-668235.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-user-wrappers-bc6c07.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-itt-compiler-wrappers.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-compiler-wrappers-087d51.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-compiler-wrappers-f9e95d.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-bundler" -type=o -targets=sycl-nvptx64-nvidia-cuda-sm_50,sycl-spir64_x86_64-unknown-unknown "-input=D:\\SYCLCompiler\\llvm\\build\\bin/../lib\\libsycl-itt-stubs.obj" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-stubs-b29687.o" "-output=C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-stubs-8e267e.o" -unbundle -allow-missing-bundles
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-link" -only-needed "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-814bef.bc" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-crt-931917.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-complex-36a4a6.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-complex-fp64-56e555.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-cmath-9e39e9.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-cmath-fp64-e43ea3.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-msvc-math-a786a0.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-imf-2c9580.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-imf-fp64-be1a72.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cassert-01782e.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cstring-18bfc5.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-complex-0cae83.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-complex-fp64-798ae1.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cmath-70625c.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-cmath-fp64-276b20.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-imf-d1742e.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-fallback-imf-fp64-5a171f.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-user-wrappers-bc6c07.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-compiler-wrappers-f9e95d.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\libsycl-itt-stubs-8e267e.o" -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-924618.bc" --suppress-warnings
 "D:\\SYCLCompiler\\llvm\\build\\bin\\sycl-post-link" -split=auto -emit-param-info -symbols -emit-exported-symbols -split-esimd -lower-esimd -O3 -spec-const=default -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-057274.table" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-924618.bc"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\file-table-tform" -extract=Code -drop_titles -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-24da45.txt" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-057274.table"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-foreach" "--in-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-24da45.txt" "--in-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-24da45.txt" --out-ext=spv "--out-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2546fc.txt" "--out-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2546fc.txt" -- "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-spirv" -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2546fc.txt" -spirv-max-version=1.4 -spirv-debug-info-version=ocl-100 -spirv-allow-extra-diexpressions -spirv-allow-unknown-intrinsics=llvm.genx. -spirv-ext=-all,+SPV_EXT_shader_atomic_float_add,+SPV_EXT_shader_atomic_float_min_max,+SPV_KHR_no_integer_wrap_decoration,+SPV_KHR_float_controls,+SPV_KHR_expect_assume,+SPV_KHR_linkonce_odr,+SPV_INTEL_subgroups,+SPV_INTEL_media_block_io,+SPV_INTEL_device_side_avc_motion_estimation,+SPV_INTEL_fpga_loop_controls,+SPV_INTEL_unstructured_loop_controls,+SPV_INTEL_fpga_reg,+SPV_INTEL_blocking_pipes,+SPV_INTEL_function_pointers,+SPV_INTEL_kernel_attributes,+SPV_INTEL_io_pipes,+SPV_INTEL_inline_assembly,+SPV_INTEL_arbitrary_precision_integers,+SPV_INTEL_float_controls2,+SPV_INTEL_vector_compute,+SPV_INTEL_fast_composite,+SPV_INTEL_arbitrary_precision_fixed_point,+SPV_INTEL_arbitrary_precision_floating_point,+SPV_INTEL_variable_length_array,+SPV_INTEL_fp_fast_math_mode,+SPV_INTEL_long_constant_composite,+SPV_INTEL_arithmetic_fence,+SPV_INTEL_token_type,+SPV_INTEL_bfloat16_conversion,+SPV_INTEL_joint_matrix,+SPV_INTEL_hw_thread_queries,+SPV_KHR_uniform_group_instructions "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-24da45.txt"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llvm-foreach" --out-ext=out "--in-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2546fc.txt" "--in-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2546fc.txt" "--out-file-list=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-95e25e.out" "--out-replace=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-95e25e.out" -- "D:\\SYCLCompiler\\llvm\\build\\bin\\opencl-aot.exe" "-o=C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-95e25e.out" --device=cpu "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2546fc.txt"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\file-table-tform" -replace=Code,Code -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2665fb.table" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-057274.table" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-95e25e.out"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\clang-offload-wrapper" "-o=C:\\Users\\matth\\AppData\\Local\\Temp\\wrapper-ea5649.bc" -host=x86_64-pc-windows-msvc -target=spir64_x86_64 -kind=sycl -batch "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-2665fb.table"
 "D:\\SYCLCompiler\\llvm\\build\\bin\\llc" -filetype=obj -o "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-cuda-33f7c5.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\wrapper-ea5649.bc"
 "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\bin\\Hostx64\\x64\\link.exe" -out:simple-sycl-app-cuda.exe /IGNORE:4078 -nologo /subsystem:console "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-efa92a.o" /MANIFEST:EMBED /implib:simple-sycl-app-cuda.lib /pdb:simple-sycl-app-cuda.pdb /version:0.0 sycl.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib oldnames.lib "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-cuda-11d9f4\\simple-sycl-app-cuda-sm_50.o" "C:\\Users\\matth\\AppData\\Local\\Temp\\simple-sycl-app-cuda-33f7c5.o"

E:\SYCL_TEST>simple-sycl-app-cuda.exe
started
data created
vals initialized
devices selected
GPU Device: NVIDIA GeForce GTX 1080 Ti
CPU Device: Intel(R) Core(TM) i7-6950X CPU @ 3.00GHz
queues created
gpu kernel execution complete
gpu success
cpu kernel execution complete
cpu success

matthewphilipdixon avatar Jun 26 '22 20:06 matthewphilipdixon

Hello, thanks for the report!

I've been looking into this. There were some issues in the clang driver that prevented the Windows-Clang.cmake module from working properly:

  • https://github.com/intel/llvm/pull/6699

And a similar driver issue was breaking the Debug builds:

  • https://github.com/intel/llvm/pull/6721

I've also tweaked configure.py so that lld-link is built on Windows from now on:

  • https://github.com/intel/llvm/pull/6701

I'll update this ticket again once all of these are merged, but I believe they should address all the pain points you've run into.

Also note the recently added Getting Started section with advice on using DPC++ with CMake:

  • https://intel.github.io/llvm-docs/GetStartedGuide.html#build-dpc-application-with-cmake

npmiller avatar Sep 08 '22 15:09 npmiller

Thanks @npmiller ! I'll test it out once I see all the commits.

matthewphilipdixon avatar Sep 08 '22 17:09 matthewphilipdixon

Quick update: all the commits are now merged, so it should work with the latest sources.

npmiller avatar Sep 26 '22 08:09 npmiller

I built from the latest, but I still have the same issues with CMake. For some reason, even building with clang directly now gives me an executable that can't find my NVIDIA GPU. I also don't see sycl-ls in the output from building the compiler. Am I missing something?

matthewphilipdixon avatar Nov 11 '22 16:11 matthewphilipdixon

Since then we've added CUPTI support to the CUDA plugin. It's possible that the CUPTI libraries are not on your PATH, which would prevent the CUDA plugin from loading correctly.

Something like this should fix it:

set PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\extras\CUPTI\lib64;%PATH%

Adjust the path to match where CUPTI is installed on your system.
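For runs launched through CTest, the same PATH adjustment can be sketched in CMake. This is a minimal sketch under assumptions, not something confirmed in this thread: it assumes the CUDA installer has set the CUDA_PATH environment variable, `CUPTI_DIR` and the test name `run-simple-sycl-app` are hypothetical names, and `ENVIRONMENT_MODIFICATION` requires CMake 3.22 or newer.

```cmake
cmake_minimum_required(VERSION 3.22)  # ENVIRONMENT_MODIFICATION needs CMake 3.22+
set(CMAKE_CXX_COMPILER "clang++")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64_x86_64")
project(simple-sycl-app-cuda LANGUAGES CXX)

add_executable(simple-sycl-app-cuda simple-sycl-app.cc)

# Assumption: CUDA_PATH is set in the environment (e.g. by the CUDA installer).
# Convert backslashes to forward slashes so the path is safe inside CMake strings.
file(TO_CMAKE_PATH "$ENV{CUDA_PATH}" cuda_root)

# CUPTI_DIR is a hypothetical cache variable; override it with -DCUPTI_DIR=...
set(CUPTI_DIR "${cuda_root}/extras/CUPTI/lib64" CACHE PATH "Directory containing the CUPTI libraries")

# Warn at configure time if the directory is missing, e.g. a wrong CUDA version.
if(NOT IS_DIRECTORY "${CUPTI_DIR}")
  message(WARNING "CUPTI directory not found: ${CUPTI_DIR}")
endif()

# Prepend CUPTI_DIR to PATH only for this test's process.
enable_testing()
add_test(NAME run-simple-sycl-app COMMAND simple-sycl-app-cuda)
set_tests_properties(run-simple-sycl-app PROPERTIES
  ENVIRONMENT_MODIFICATION "PATH=path_list_prepend:${CUPTI_DIR}")
```

This only affects `ctest` runs. Launching the executable directly still needs the `set PATH=...` command shown above.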

npmiller avatar Nov 11 '22 16:11 npmiller

You are correct, that is not in my PATH. At which of the following stages does it need to be in the PATH?

  1. Building the compiler
  2. Using the compiler to target cuda in a build
  3. During runtime of an app that is built targeting cuda

matthewphilipdixon avatar Nov 11 '22 16:11 matthewphilipdixon

For 3, at runtime (and also when running sycl-ls).

npmiller avatar Nov 11 '22 16:11 npmiller

That did the trick for the clang build. Thanks!

When I use cmake, if I follow your example and use:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl")

I get a good build, but it predictably fails to find a gpu device.

When I modify that to:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64_x86_64 -v")

It is still not able to find a GPU at runtime, and I'm seeing this warning during linking:

D:\SYCLCompiler\llvm\build\bin\clang-offload-bundler.exe -type=o -input=CMakeFiles/simple-sycl-app-cuda.dir/simple-sycl-app.cc.obj -list
clang++: warning: linked binaries do not contain expected 'nvptx64-nvidia-cuda' target; found targets: 'nvptx64-nvidia-cuda-sm_50, spir64_x86_64-unknown-unknown' [-Wsycl-target]

I do have the CUPTI path in my PATH for this as well.

Below is my full cmakelists.txt:

cmake_minimum_required(VERSION 3.21.0)
set(CMAKE_CXX_COMPILER "clang++")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64_x86_64 -v")
project(simple-sycl-app-cuda LANGUAGES CXX)
add_executable(simple-sycl-app-cuda simple-sycl-app.cc)

matthewphilipdixon avatar Nov 11 '22 17:11 matthewphilipdixon

That seems okay. The warning should be safe to ignore as well; that diagnostic doesn't work properly with CUDA.

Does sycl-ls find the GPU with the %PATH% fixed?

You can also set SYCL_PI_TRACE to 1 to print a little more information about plugin loading. It would be helpful to see whether the CUDA plugin gets loaded properly:

set SYCL_PI_TRACE=1

Set this before running the SYCL application.
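If you'd rather not change the environment of the whole command prompt, the same settings can be applied to a single process. The following is a rough Python sketch under stated assumptions: `make_sycl_env` and `run_with_trace` are hypothetical helper names, and the only SYCL-specific detail used is the SYCL_PI_TRACE variable mentioned above.

```python
import os
import subprocess


def make_sycl_env(base_env, cupti_dir=None, trace_level=1):
    """Return a copy of base_env with SYCL_PI_TRACE set.

    If cupti_dir is given, it is prepended to PATH in the copy only.
    """
    env = dict(base_env)
    env["SYCL_PI_TRACE"] = str(trace_level)
    if cupti_dir:
        current = env.get("PATH", "")
        env["PATH"] = cupti_dir + os.pathsep + current if current else cupti_dir
    return env


def run_with_trace(command, cupti_dir=None):
    """Run command (a list such as ["sycl-ls"]) with tracing enabled.

    Returns the exit code of the child process.
    """
    env = make_sycl_env(os.environ, cupti_dir=cupti_dir)
    return subprocess.run(command, env=env).returncode
```

For example, `run_with_trace(["sycl-ls"], cupti_dir=...)` with your own CUPTI directory would print the trace output for that one invocation and leave the parent environment untouched.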

It might also be worth trying with just nvptx64-nvidia-cuda in -fsycl-targets. Multiple targets should work fine, but maybe there's an issue with that.

npmiller avatar Nov 11 '22 17:11 npmiller

Never mind, there was a typo in my script: I was pointing to the wrong version of CUPTI in my CMake build script.

Debug and Release versions are working perfectly in cmake using the cmakelists.txt above.

One thing I noticed that is different from your example on the GetStartedGuide is that I need to include "LANGUAGES CXX" in my project() line in cmakelists.txt or else I get this error during the build:

-- The C compiler identification is MSVC 19.29.30146.0
-- The CXX compiler identification is Clang 16.0.0 with GNU-like command-line
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
CMake Error at C:/Program Files/CMake/share/cmake-3.25/Modules/Platform/Windows-Clang.cmake:170 (message):
  The current configuration mixes Clang and MSVC or some other CL compatible compiler tool. This is not supported. Use either clang or MSVC as both C, C++ and/or HIP compilers.
Call Stack (most recent call first):
  C:/Program Files/CMake/share/cmake-3.25/Modules/Platform/Windows-Clang.cmake:180 (__verify_same_language_values)
  C:/Program Files/CMake/share/cmake-3.25/Modules/Platform/Windows-Clang-CXX.cmake:1 (include)
  C:/Program Files/CMake/share/cmake-3.25/Modules/CMakeCXXInformation.cmake:48 (include)
  CMakeLists.txt:8 (project)

matthewphilipdixon avatar Nov 11 '22 17:11 matthewphilipdixon

One thing I noticed that is different from your example on the GetStartedGuide is that I need to include "LANGUAGES CXX" in my project() line in cmakelists.txt or else I get this error during the build:

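Good catch! Omitting the language doesn't cause any issues on Linux, but it does seem troublesome on Windows: without LANGUAGES, project() enables both C and CXX, and the log above shows MSVC being picked up as the C compiler alongside Clang for CXX. I've put up a PR to add the languages to the sample CMakeLists.txt in the docs: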

  • https://github.com/intel/llvm/pull/7383

npmiller avatar Nov 14 '22 10:11 npmiller

Since it's working for you now, @matthewphilipdixon, I'm going to close this ticket. Please feel free to re-open it, or open a new one if you have further issues!

npmiller avatar Nov 22 '22 10:11 npmiller