llvm icon indicating copy to clipboard operation
llvm copied to clipboard

Create one bitcode library for NVPTX

Open MartinWehking opened this issue 1 year ago • 5 comments

Create one single bitcode library for NVPTX by compiling each libdev file into bitcode first, linking these together and running opt on them.

Strip away metadata by reusing prepare_builtins from libclc.

Remove NVPTX bundles from the libdev object files and remove any unbundling action spawned by the Clang driver for the SYCL toolchain when compiling for the NVPTX backend.

Make the driver link against the single bitcode libraries for NVPTX for the SYCL toolchain when device library linkage is not excluded.

Ensure that the clang tests check for the correctness of the new clang driver actions and check if the driver still links the device code against the itt device libraries when device library linkage has been excluded.

Refactor SYCLLibdevice.cmake by creating functions and grouping e.g. the same compilation flags for a filetype together in one variable. Reuse these variables and call functions to remove redundancies.

MartinWehking avatar Aug 13 '24 13:08 MartinWehking

@MartinWehking, why this change is done only for NVPTX target? Can't we make the same change for other targets as well?

bader avatar Aug 13 '24 15:08 bader

@MartinWehking, why this change is done only for NVPTX target? Can't we make the same change for other targets as well?

Hi @bader We also have a patch for the AMD target coming up (#15055). Originally, there was just one patch for both targets (#13930).

It is probably possible to copy the work for the SPIRV target. However, there might be unforeseen issues.

We encountered several issues with enabling it for AMD as well. Therefore, this might be better to be done as future work.

MartinWehking avatar Aug 13 '24 16:08 MartinWehking

I would like @jinge90 to review this and maybe share some thoughts on the discussions above, especially pertaining to doing this for SPIR-V.

steffenlarsen avatar Aug 16 '24 10:08 steffenlarsen

I would like @jinge90 to review this and maybe share some thoughts on the discussions above, especially pertaining to doing this for SPIR-V.

The libdevice cmake change looks good to me. For SPV target, these is no technical block to combine several fallback SPV file into one but I doubt whether it is correct direction. The combined SPV file will lead to more jit overhead for underlying runtime(IGC, OpenCL CPU RT), this may be a performance issue for small kernels. For .obj/.o, these is no such problem since we use "llvm-link -only-needed" to clear away all unnecessary code in user's device image and then bundle the device code into final executable. Thanks very much.

jinge90 avatar Aug 20 '24 02:08 jinge90

@jinge90

That sounds like a reasonable concern. If there are no objections I would leave it as it is for this patch.

MartinWehking avatar Aug 20 '24 10:08 MartinWehking

ping @intel/dpcpp-cfe-reviewers

@steffenlarsen given @jinge90 's input are you happy with this PR? It seems like it still makes sense for Nvidia and AMD but that we maybe shouldn't extend it to SPIR-V. But either way this also gives the CMake a much needed cleanup.

npmiller avatar Aug 29 '24 14:08 npmiller

ping @intel/llvm-gatekeepers to get this merged

MartinWehking avatar Sep 02 '24 10:09 MartinWehking