Victor Lomuller issues

Results: 8 issues of Victor Lomuller

Global device symbol management proposal.

Some devices expose global symbols that the host can push values to and pull values from. In the CUDA runtime, this is covered by the cudaMemcpyToSymbol/cudaMemcpyFromSymbol functions, but other APIs may offer similar functionality. This...

Update specialization constant to align with the module proposal

Update the specialization constant extension proposal to align with the module proposal. This PR introduces a `kernel_handler`, which is an optional argument to parallel_for functors and allows the user to...

Description of the current ambiguities of the SYCL address space deduction rules

The document describes the address space representation in SYCL 1.2.1 and then lists the current ambiguities in the address space deduction rules. A concrete proposal on...

[SYCL] Don't set PI_USM_INDIRECT_ACCESS if platform doesn't support it

If the OpenCL platform doesn't support USM, don't set the PI_USM_INDIRECT_ACCESS exec info. This prevents SYCL programs from failing when they don't use USM. If the program does need USM...
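The guard described in this excerpt can be sketched in plain C++. This is only an illustration under stated assumptions, not the code from the patch: `KernelExecInfo`, `has_extension`, and `make_exec_info` are hypothetical names, and the sketch assumes that USM support is detected by looking for the `cl_intel_unified_shared_memory` extension in the platform's space-separated extension string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the exec-info flags a runtime attaches to a kernel.
struct KernelExecInfo {
    bool usm_indirect_access = false;
};

// Returns true if `name` appears as a whole token in the space-separated
// extension list `extensions` (substring matches of longer names are rejected).
bool has_extension(const std::string& extensions, const std::string& name) {
    assert(!name.empty());
    std::size_t pos = 0;
    while ((pos = extensions.find(name, pos)) != std::string::npos) {
        const std::size_t end = pos + name.size();
        const bool starts_token = (pos == 0) || extensions[pos - 1] == ' ';
        const bool ends_token = (end == extensions.size()) || extensions[end] == ' ';
        if (starts_token && ends_token) {
            return true;
        }
        pos = end;  // keep searching after this partial match
    }
    return false;
}

// Assumption: the indirect-access flag is requested only when the platform
// advertises USM, so programs that never use USM do not fail on platforms
// without it.
KernelExecInfo make_exec_info(const std::string& platform_extensions) {
    KernelExecInfo info;
    info.usm_indirect_access =
        has_extension(platform_extensions, "cl_intel_unified_shared_memory");
    return info;
}
```

In this sketch, a platform that lacks the extension simply leaves the flag unset instead of reporting an error.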

[SYCL] Add a CUDA compatibility mode

This patch enables CUDA mode at the same time as the SYCL mode. This allows the compiler to define CUDA macros and add implicit defines. To enable the mode the...

Hierarchical: LowerWG pass ignores convergent functions

When I compile this code: ``` #include using namespace cl::sycl; int main() { queue q(default_selector().select_device()); auto buf = cl::sycl::buffer(cl::sycl::range(1)); q.submit([&](handler &cgh) { auto globalAcc = buf.get_access(cgh); auto sizeAcc = buf.get_access(cgh);...

bug

Add new launch property to support work_group_static_memory

https://github.com/intel/llvm/pull/15061 introduces a new property, `work_group_static_memory`, which allows the user to set a given amount of local memory to be used. In order to pass this information to the adaptor,...

Implement work_group_static

The patch partially implements `work_group_static` and updates the proposal. Implemented: - `work_group_static` to handle static allocation in kernel. - `get_dynamic_work_group_memory` to handle runtime allocation, but only on CUDA `work_group_static` is implemented...