Victor Lomuller
> Maybe we can add a compiler option to avoid the wrapping, i.e. if the user sets the flag, it promises that all pointers captured inside device code are USM pointers....
> I see at least two directions to explore:
>
> 1. pass "packed" lambda object instead of original lambda object. "Packed" object should have only "live" data (i.e. referenced...
I guess the SYCL to OpenCL kernel lowering could be improved to handle `std::nullptr_t`, which is the type captured when your bool condition is false. Otherwise your bug seems to be...
(I'm assuming you are comparing DPC++ against NVCC) I think you are just noticing a general LLVM/NVPTX vs NVCC optimization difference. In your sample, looking at the output of the...
A few points:
- I'm not too sure how to create a test for that; I'm happy to try suggestions if you have any
- another way to do this...
@steffenlarsen After discussion with Beni, I cut the link to the UR patch (will be caught with another bump). So if you are happy you can approve it, no risk...
@intel/llvm-gatekeepers Ready to merge (failing CI job is unrelated and common to other PRs)
Part of the idea is to allow users to call CUDA device functions from a SYCL kernel. The underlying motivation is actually to have a mode that would support the...
CUDA relies on directives such as `__device__`, `__host__` and `__global__` to drive what goes on the host or the device. OCL doesn't have such things (the spec even has some...
> Despite the fact that in SPIR-V, it does not and cannot work.

It can: https://registry.khronos.org/SPIR-V/specs/unified1/OpenCL.ExtendedInstructionSet.100.html#printf It is just improperly lowered by the translator. Note: DPCPP is also using an...