llvm icon indicating copy to clipboard operation
llvm copied to clipboard

[SYCL] Implement forward_progress extension API

Open lbushi25 opened this issue 1 year ago • 10 comments

This PR implements the API for the forward progress guarantees extension defined here: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc ~~It is complete except that it does not verify that the forward progress guarantees requested by a kernel submission are actually supported by the device to which the kernel is submitted. That will be done in a later PR.~~

lbushi25 avatar Apr 12 '24 22:04 lbushi25

It is complete except that it does not verify that the forward progress guarantees requested by a kernel submission are actually supported by the device to which the kernel is submitted. That will be done in a later PR.

Can you elaborate on what part of the spec is not yet implemented? Is it just this statement about throwing an exception that is not yet implemented?

New kernel properties are introduced to allow developers to declare that a given kernel requires specific forward progress guarantees for correctness. If a kernel is submitted to a device that cannot satisfy the request for specific progress guarantees, the implementation must throw an exception with the errc::feature_not_supported error code.

Is there an ETA for when the remaining parts will be implemented? I'm wondering when we want to declare the extension "implemented" and move the extension document to the "experimental" directory.

Also, does this PR also implement sycl_ext_oneapi_launch_queries? Those queries are required in order to correctly use the forward progress guarantee extension.

gmlueck avatar Apr 15 '24 16:04 gmlueck

It is complete except that it does not verify that the forward progress guarantees requested by a kernel submission are actually supported by the device to which the kernel is submitted. That will be done in a later PR.

Can you elaborate on what part of the spec is not yet implemented? Is it just this statement about throwing an exception that is not yet implemented?

New kernel properties are introduced to allow developers to declare that a given kernel requires specific forward progress guarantees for correctness. If a kernel is submitted to a device that cannot satisfy the request for specific progress guarantees, the implementation must throw an exception with the errc::feature_not_supported error code.

Is there an ETA for when the remaining parts will be implemented? I'm wondering when we want to declare the extension "implemented" and move the extension document to the "experimental" directory.

Also, does this PR also implement sycl_ext_oneapi_launch_queries? Those queries are required in order to correctly use the forward progress guarantee extension.

Hi Greg,

Yes, it is the statement about throwing an exception when the progress guarantee cannot be met that is not yet implemented. As for the ETA, I'm working on it in parallel with this PR but decided to not include it in this PR because it looks to me like it will require some refactoring. I will ping you soon about guidance on this.

Finally, no, this PR does not at the moment implement the sycl_ext_oneapi_launch_queries extension but now that you brought it to my attention, I can put it in the next PR along with the verification of kernel properties. I will probably only implement the necessary parts of that extension needed to query progress guarantees so as to not stray too much from the original objective that this PR was created for.

lbushi25 avatar Apr 15 '24 17:04 lbushi25

Yes, it is the statement about throwing an exception when the progress guarantee cannot be met that is not yet implemented. As for the ETA, I'm working on it in parallel with this PR but decided to not include it in this PR because it looks to me like it will require some refactoring. I will ping you soon about guidance on this.

Finally, no, this PR does not at the moment implement the sycl_ext_oneapi_launch_queries extension but now that you brought it to my attention, I can put it in the next PR along with the verification of kernel properties. I will probably only implement the necessary parts of that extension needed to query progress guarantees so as to not stray too much from the original objective that this PR was created for.

Thanks for the update. Let's wait for the next PR before moving the extension document to "experimental".

gmlueck avatar Apr 15 '24 17:04 gmlueck

I've refactored the device_impl.{hpp, cpp} files and implemented verification of the forward progress requirements in the handler.{hpp, cpp}.

lbushi25 avatar Apr 24 '24 20:04 lbushi25

@intel/llvm-reviewers-runtime ping for review

lbushi25 avatar Apr 30 '24 19:04 lbushi25

@intel/llvm-reviewers-runtime second ping as all pre-commit tests are now passing. The PR is somewhat big but 400 lines are tests.

lbushi25 avatar May 02 '24 21:05 lbushi25

@intel/llvm-gatekeepers This is ready for merge.

lbushi25 avatar May 07 '24 17:05 lbushi25

@intel/llvm-gatekeepers This is ready for merge.

Merge button is blocked because of requested changes, explicit approval from @AlexeySachkov is needed.

againull avatar May 07 '24 19:05 againull

@intel/llvm-gatekeepers The PR is now approved. The AWS failures are unrelated to my PR.

lbushi25 avatar May 08 '24 16:05 lbushi25

@intel/llvm-gatekeepers ping

lbushi25 avatar May 08 '24 23:05 lbushi25

This has introduced post-commit errors which are being addressed by https://github.com/intel/llvm/pull/13793/

lbushi25 avatar May 15 '24 21:05 lbushi25