oneDPL icon indicating copy to clipboard operation
oneDPL copied to clipboard

Implement `direct_iterator` and `make_direct_iterator`

Open BenBrock opened this issue 2 years ago • 6 comments

Implement direct_iterator and make_direct_iterator, which allow users to wrap device iterators that should be used directly inside SYCL kernels by oneDPL. This PR addresses #855 and #854.

This will likely require some work before being accepted, but I just wanted articulate my proposed fix for these issues.

BenBrock avatar Mar 25 '23 00:03 BenBrock

I wonder if providing a wrapper iterator for the purpose of only passing something as-is to oneDPL, is the right approach. Instead, should we maybe follow the approach we use for SYCL buffers, i.e. provide "wrapper" functions that return some object suitable to pass the original class to oneDPL algorithms, without attempting to make it a correct functioning iterator? Or in other words, should we extend the applicability of dpl::begin()/end() beyond SYCL buffers to other containers of interest?

akukanov avatar Mar 28 '23 14:03 akukanov

Here's a better example illustrating why I think this is needed. There are potentially many complicated iterator types users will want to pass directly into oneDPL algorithms, and it's not always possible to identify which ones can and can't be passed in directly.

Suppose you wanted to implement a ranges-style dot product using oneDPL, like below.

template <std::ranges::forward_range X, std::ranges::forward_range Y>
auto dot_product_onedpl(sycl::queue q, X &&x, Y &&y) {
  auto z = std::ranges::views::zip(x, y)
         | std::ranges::views::transform(
           [](auto &&elem) {
             auto &&[a, b] = elem;
             return a * b;
           });

  oneapi::dpl::execution::device_policy policy(q);

  shp::__detail::direct_iterator d_first(z.begin());
  shp::__detail::direct_iterator d_last(z.end());
  return oneapi::dpl::experimental::reduce_async(
             policy, d_first, d_last, std::ranges::range_value_t<X>(0), std::plus())
      .get();
}

The iterator type passed into oneDPL is rather complicated. It's not in general possible to know whether a transform view or a zip view is directly accessible on the device. As a user, I happen to know that X and Y are device ranges, and so I know that the resulting view z can safely be used on the device.

This gets more complicated when you also have user-defined data structures and views.

We will definitely need to keep something like this in our own codebase for distributed ranges. I'll leave it up to you guys to decide whether something like this is more broadly applicable to users. My intention is basically to give users the option of forcing oneDPL algorithms to use device iterators directly on the device. This can be used in cases where either it's not possible to determine whether an iterator can be passed directly (span and a few other views) as well as when complex iterator types make using is_passed_directly a bit unwieldy (most views and some user-defined types).

BenBrock avatar Mar 28 '23 15:03 BenBrock

What's left for driving this pull request to completion? Not being able to use custom device iterators is one of the limitations for us in https://github.com/kokkos/kokkos compared with thrust.

masterleinad avatar Oct 22 '24 20:10 masterleinad

I think the primary blocker is just resources on the oneDPL team. This was going to be merged as part of #1479, but that's been delayed.

Maybe @akukanov, @rarutyun, or @MikeDvorskiy can comment on the possibility of accepting this PR individually to enable libraries like Kokkos?

BenBrock avatar Oct 28 '24 21:10 BenBrock

Maybe @akukanov, @rarutyun, or @MikeDvorskiy can comment on the possibility of accepting this PR individually to enable libraries like Kokkos?

It would probably be already sufficient if is_passed_directly is officially supported although a customization point in the oneapi::dpl namespace would be preferable.

masterleinad avatar Oct 29 '24 13:10 masterleinad

Making is_passed_directly a public customization point in the oneDPL namespace sounds good to me; better than the proposed iterator wrapper.

I suggest to open a RFC discussion at https://github.com/oneapi-src/oneDPL/discussions and/or a design proposal following the process here https://github.com/oneapi-src/oneDPL/tree/main/rfcs. The goal is to have a dedicated design discussion of this idea. The only thing really needed to start is the motivating use cases, but if you have ideas/preferences for how you would customize this trait, that would be useful for the design. I hope it is not too much to ask you for :)

The eventual outcome should be a patch to the oneDPL specification that describes the new functionality, and a patch to this repo that implements it. But after the design is accepted in principle, we will take care of these unless you will want to stay involved.

akukanov avatar Nov 13 '24 20:11 akukanov