Nested version of ReduceMinLoc/ReduceMaxLoc would be nice

Open tepperly opened this issue 8 years ago • 8 comments

It would be nice if there were a nested/forallN version of RAJA::ReduceMinLoc/RAJA::ReduceMaxLoc. I have a calculation where I need to find the maximum element and its location in a 3-D array, E(t,y,x). That is, I want to find E_max = E(t_max,y_max,x_max) such that E_max >= E(t,y,x) for all t, y, and x. In the case of ties (multiple equivalent maxima), any one of the locations will do.

tepperly avatar May 09 '17 21:05 tepperly

I think we could implement a version where the reducer was templated on both the type of the value and the type of the index - then you could use an "int3" or similar for the index type.
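
As a rough illustration of that idea, here is a minimal sequential sketch in plain C++. It is not RAJA's ReduceMaxLoc: the names `Int3`, `MaxLocSketch`, and `find_max_3d` are hypothetical, and a row-major layout for E(t,y,x) is assumed.

```cpp
#include <limits>

// Hypothetical 3-component index type; a plain struct keeps it trivially copyable.
struct Int3 {
  int t, y, x;
};

// Hypothetical reducer templated on both the value type and the index type.
// Sequential illustration only; this is not RAJA's ReduceMaxLoc.
template <typename T, typename IndexT>
class MaxLocSketch {
public:
  MaxLocSketch(T init_val, IndexT init_loc) : val_(init_val), loc_(init_loc) {}

  // Keep the candidate only if it is strictly larger, so the first maximum wins ties.
  void maxloc(T candidate, IndexT where) {
    if (candidate > val_) {
      val_ = candidate;
      loc_ = where;
    }
  }

  T get() const { return val_; }
  IndexT getLoc() const { return loc_; }

private:
  T val_;
  IndexT loc_;
};

// Scan a row-major nt x ny x nx array (assumed layout) and return the
// maximum value together with its (t, y, x) location.
inline MaxLocSketch<double, Int3> find_max_3d(const double* E, int nt, int ny, int nx) {
  MaxLocSketch<double, Int3> r(std::numeric_limits<double>::lowest(), Int3{-1, -1, -1});
  for (int t = 0; t < nt; ++t)
    for (int y = 0; y < ny; ++y)
      for (int x = 0; x < nx; ++x)
        r.maxloc(E[(t * ny + y) * nx + x], Int3{t, y, x});
  return r;
}
```

With a reducer shaped like this, the caller would read the result as `r.get()` and `r.getLoc().t`, `r.getLoc().y`, `r.getLoc().x`.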

Would that work for your use case?

davidbeckingsale avatar May 09 '17 21:05 davidbeckingsale

Our index types are all defined using RAJA_INDEX_VALUE(TimeInd, "Time Index"), RAJA_INDEX_VALUE(YInd, "Y Spatial Index"), etc. Underneath, they are all RAJA::Index_type. We could use something like an int3 or std::tuple.

tepperly avatar May 09 '17 23:05 tepperly

It would be nice to know on what timescale a feature like this might appear (days, weeks, months, or years). If it's days or weeks, I would probably wait. Otherwise, I'll use a workaround.

tepperly avatar May 10 '17 18:05 tepperly

Making this work robustly across all backends would take some work (std::tuple doesn't work on the device), so the timescale would be weeks/months.

You can work around this using the Layout classes. By calling the toIndices method, you can convert the linear index back to its component pieces.
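
The shape of that workaround, sketched in plain C++ with no RAJA calls: run a 1-D max-with-location reduction over the flattened array, then split the winning linear index into (t, y, x). In RAJA the loop would be a 1-D reduction with ReduceMaxLoc and the split would be done by the Layout's toIndices. The helpers `max_loc_linear` and `to_indices_sketch` below are hypothetical stand-ins, and a row-major layout is assumed.

```cpp
#include <limits>

struct Idx3 {
  long t, y, x;
};

struct MaxLoc1D {
  double val;
  long loc;  // linear index of the maximum, or -1 if the array is empty
};

// Step 1: 1-D max-with-location over the flattened array of length n.
inline MaxLoc1D max_loc_linear(const double* E, long n) {
  MaxLoc1D best{std::numeric_limits<double>::lowest(), -1};
  for (long i = 0; i < n; ++i) {
    if (E[i] > best.val) {
      best.val = E[i];
      best.loc = i;
    }
  }
  return best;
}

// Step 2: hypothetical stand-in for a layout's index conversion.
// Splits a row-major linear index over extents (nt, ny, nx) into components;
// nt is not needed because t is the slowest-varying index.
inline Idx3 to_indices_sketch(long lin, long ny, long nx) {
  Idx3 out;
  out.x = lin % nx;
  out.y = (lin / nx) % ny;
  out.t = lin / (nx * ny);
  return out;
}
```

The only requirement is that the linearization used in the reduction and the one used for the conversion agree.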

davidbeckingsale avatar May 10 '17 18:05 davidbeckingsale

At our meeting yesterday, @trws mentioned the complexity of implementing a custom tuple class. However, I think @willkill07 mentioned that he was looking into an existing implementation that would work on a device. For this particular case, and potentially others, would it make sense to try something simpler, like an Index_type template (templated on the actual index type and int ndims)?
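
One possible reading of that suggestion, as a minimal sketch: a fixed-size aggregate parameterized on the component type and the number of dimensions. The name `MultiIndexSketch` is hypothetical, and a device build would presumably need the member functions annotated for host and device use, which is omitted here.

```cpp
// Hypothetical fixed-size multi-index: IdxT is the component type,
// NDims is the number of dimensions. It has no constructors, so it
// remains an aggregate that can be brace-initialized and copied trivially.
template <typename IdxT, int NDims>
struct MultiIndexSketch {
  IdxT v[NDims];

  IdxT& operator[](int d) { return v[d]; }
  const IdxT& operator[](int d) const { return v[d]; }

  static constexpr int rank() { return NDims; }
};

// Component-wise equality, useful when checking reduction results.
template <typename IdxT, int NDims>
bool operator==(const MultiIndexSketch<IdxT, NDims>& a,
                const MultiIndexSketch<IdxT, NDims>& b) {
  for (int d = 0; d < NDims; ++d) {
    if (!(a[d] == b[d])) return false;
  }
  return true;
}
```

A 3-D location would then be written as `MultiIndexSketch<long, 3> loc{{t, y, x}};`, avoiding a general tuple.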

rhornung67 avatar May 10 '17 18:05 rhornung67

FYI, here is the CUDA-supported tuple from agency-library/agency:

https://github.com/agency-library/agency/blob/master/agency/detail/tuple.hpp

It's not currently stand-alone, but it is 3-clause BSD (same license as RAJA).

willkill07 avatar May 10 '17 21:05 willkill07

There is a tuple in the feature/trws/foralln-reimagined branch as camp::tuple. With the relaxed constexpr flag on nvcc it works well on the device, so it may be worth giving that a try, @tepperly. That's also the branch where I'm developing what will eventually replace forallN, so you may want to take a look anyway.

trws avatar Aug 19 '17 17:08 trws

I believe we can close this issue, since we have tests for this sort of thing in our test suite.

rhornung67 avatar Jan 27 '22 19:01 rhornung67