oneDPL Implement tag dispatching prototype

Tag dispatching mechanism:

Selects which parallel pattern to run (via the select_backend function)
- The decision is made once, based on the execution policy and the iterator category
- The selected tag provides nested tags for the next level of dispatching (parallel backend, vectorization, etc.), for example tbb_backend and is_vector
Patterns are selected based on the tag
- an overload taking a generic tag (the customization point)
- overloads for concrete tag types with optimized implementations
The parallel backend, as well as vectorized vs. non-vectorized bricks, is selected based on the nested tags.
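
As a rough illustration of the selection step described above (not the code in this PR), the sketch below picks a tag once from the policy and the iterator category and exposes nested tags for later dispatch levels. The policy structs, serial_backend, serial_tag, parallel_tag and the helper traits are hypothetical names introduced for the example; only select_backend, tbb_backend and is_vector come from the description. The fallback to the serial path for non-random-access iterators is also an assumption.

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for execution policies (not the oneDPL types).
struct seq_policy {};
struct unseq_policy {};
struct par_policy {};
struct par_unseq_policy {};

// Backend tags; serial_backend is an assumed name.
struct serial_backend {};
struct tbb_backend {};

// Top-level tags (assumed names). Each carries the nested tags that the
// next dispatch levels need: which backend, and vectorized or not.
template <class IsVector>
struct serial_tag {
    using backend_tag = serial_backend;
    using is_vector = IsVector;
};

template <class IsVector>
struct parallel_tag {
    using backend_tag = tbb_backend;
    using is_vector = IsVector;
};

// What each hypothetical policy permits.
template <class Policy> struct allows_parallel : std::false_type {};
template <> struct allows_parallel<par_policy> : std::true_type {};
template <> struct allows_parallel<par_unseq_policy> : std::true_type {};

template <class Policy> struct allows_vector : std::false_type {};
template <> struct allows_vector<unseq_policy> : std::true_type {};
template <> struct allows_vector<par_unseq_policy> : std::true_type {};

template <class Iterator>
using is_random_access = std::is_base_of<
    std::random_access_iterator_tag,
    typename std::iterator_traits<Iterator>::iterator_category>;

// The decision is made once, here, from the policy and iterator category.
// Assumption: anything that is not random access falls back to serial.
template <class Policy, class Iterator>
auto select_backend(const Policy&, Iterator)
{
    using random_access = is_random_access<Iterator>;
    using is_vector = std::integral_constant<
        bool, allows_vector<Policy>::value && random_access::value>;
    using go_parallel = std::integral_constant<
        bool, allows_parallel<Policy>::value && random_access::value>;
    return typename std::conditional<go_parallel::value,
                                     parallel_tag<is_vector>,
                                     serial_tag<is_vector>>::type{};
}

// Example: a parallel policy over vector iterators selects the TBB backend,
// while the same policy over list iterators stays serial.
using vector_par_tag =
    decltype(select_backend(par_policy{}, std::vector<int>::iterator{}));
static_assert(std::is_same<vector_par_tag::backend_tag, tbb_backend>::value,
              "random-access iterators take the parallel path");

using list_par_tag =
    decltype(select_backend(par_policy{}, std::list<int>::iterator{}));
static_assert(std::is_same<list_par_tag::backend_tag, serial_backend>::value,
              "bidirectional iterators fall back to serial");
```

Because the result is a type, every later dispatch level can be resolved at compile time from the tag alone, without re-inspecting the policy or the iterators.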

Mar 18 '21 22:03 rarutyun

@brycelelbach, @griwes, @dkolsen-pgi, @rodgert, @ldionne

Hi everyone,

Bryce, David, Michal,

This PR implements a prototype (as we see it) that shows the feasibility of "tag dispatching and customization points", to fulfill your initial request. We decided to build the prototype on the oneDPL code base because it has a wider set of features and we don't want to miss anything.

The approach is shown for the replace_if algorithm and the whole find algorithm family, and it can be extended to other algorithms.

Please ignore the CI status. It is red because I removed some things that will no longer be needed once the implementation is complete, which is not the case yet.

Louis, Thomas,

I would like you to be kept informed as well, because the eventual goal is to implement this tag dispatching prototype in Parallel STL LLVM upstream, as initially requested by Bryce, David and Michal.

All,

Please review and provide feedback, if any. Please also note that we are planning an architectural meeting with you to discuss the future of Parallel STL LLVM upstream and how this prototype in particular affects it. Ultimately, our goal is to make the Parallel Algorithms available as a public API of libc++, as they are today for libstdc++.

I'll send a separate email about that architectural meeting.

Thanks.

Apr 29 '21 14:04 rarutyun

@rarutyun Would it be possible to get a simple high level overview of what the agreed-upon tag dispatching mechanism is going to look like? This proposed PR appears to contain quite a few details that get in the way of understanding (of my understanding, at least).

May 04 '21 13:05 ldionne

> @rarutyun Would it be possible to get a simple high level overview of what the agreed-upon tag dispatching mechanism is going to look like? This proposed PR appears to contain quite a few details that get in the way of understanding (of my understanding, at least).

@ldionne, Sorry for the late response.

Initial request from Nvidia:

- Support multiple backends (already supported)
- Be easily extensible to non-standard parallel policies (already supported)
- Provide a graph of algorithm dependencies, meaning that algorithms are implemented by reusing others (already supported)
- Support customization points at the parallel-pattern level, for each individual pattern

The last item is not supported yet. The tag dispatching mechanism is implemented to fulfill that request.

Note that customization points should be on the parallel patterns level.

Tag dispatching mechanism:

Selects which parallel pattern to run (via the select_backend function)
- The decision is made once, based on the execution policy and the iterator category
- The selected tag provides nested tags for the next level of dispatching (parallel backend, vectorization, etc.), for example tbb_backend and is_vector
Patterns are selected based on the tag
- an overload taking a generic tag (the customization point)
- overloads for concrete tag types with optimized implementations
The parallel backend, as well as vectorized vs. non-vectorized bricks, is selected based on the nested tags.
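
To make the pattern-level part of this overview more concrete, here is a minimal sketch of one possible reading of it, using replace_if over raw pointers. It is not the code in this PR: parallel_tag, vendor_tag, pattern_replace_if, brick_replace_if and the path enum are hypothetical names introduced for illustration, and the "parallel" overload runs the whole range in one chunk instead of calling a real backend.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tags; only tbb_backend and is_vector are named in this thread.
struct tbb_backend {};

template <class IsVector>
struct parallel_tag {
    using backend_tag = tbb_backend;
    using is_vector = IsVector;
};

// A tag some third party might define for its own execution policy.
struct vendor_tag {};

// Reports which code path ran, so the dispatch is observable.
enum class path { generic_pattern, scalar_brick, vector_brick };

// Bricks: the innermost loops, selected by the nested is_vector tag.
template <class T, class Pred>
path brick_replace_if(T* first, T* last, Pred pred, const T& value,
                      std::false_type /*is_vector*/)
{
    for (; first != last; ++first)
        if (pred(*first)) *first = value;
    return path::scalar_brick;
}

template <class T, class Pred>
path brick_replace_if(T* first, T* last, Pred pred, const T& value,
                      std::true_type /*is_vector*/)
{
    // A real vectorized brick would use SIMD here; this sketch only shows
    // that a different overload is chosen.
    for (; first != last; ++first)
        if (pred(*first)) *first = value;
    return path::vector_brick;
}

// Generic-tag overload: accepts any tag. In this sketch it is the fallback
// that a new tag gets until a more specific overload is written for it.
template <class Tag, class T, class Pred>
path pattern_replace_if(Tag, T* first, T* last, Pred pred, const T& value)
{
    brick_replace_if(first, last, pred, value, std::false_type{});
    return path::generic_pattern;
}

// Concrete-tag overload: preferred by overload resolution whenever the tag
// is a parallel_tag. A real implementation would hand chunks of the range
// to the backend named by backend_tag; here the range is one chunk.
template <class IsVector, class T, class Pred>
path pattern_replace_if(parallel_tag<IsVector>, T* first, T* last, Pred pred,
                        const T& value)
{
    return brick_replace_if(first, last, pred, value, IsVector{});
}

// Usage: the tag alone decides both the pattern and the brick.
inline void demo()
{
    int data[] = {1, 2, 3, 4};
    auto is_even = [](int x) { return x % 2 == 0; };
    path p = pattern_replace_if(parallel_tag<std::true_type>{}, data, data + 4,
                                is_even, 0);
    assert(p == path::vector_brick);
    assert(data[1] == 0 && data[3] == 0);
    (void)p;  // silence unused warnings when assertions are disabled
}
```

In this reading, supporting a new policy means defining a tag for it: the generic overload keeps every pattern working, and individual patterns can then be specialized one at a time by adding overloads for that tag.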

May 23 '21 13:05 rarutyun