Feature/raja vec

Open vsrana01 opened this issue 5 years ago • 8 comments

Added vectorization in stream/add

vsrana01 avatar May 21 '20 19:05 vsrana01

Perhaps we should take a step back and do a PR for the Lambda changes first? What do you think @vsrana01?

ajkunen avatar May 21 '20 20:05 ajkunen

I agree with

Perhaps we should take a step back and do a PR for the Lambda changes first? What do you think @vsrana01?

vsrana01 avatar May 21 '20 20:05 vsrana01

@rhornung67 those changes were needed. @ajkunen I agree with you, I can go do another PR for just the Lambda changes and then a second one for the vec stuff. I can test across the different compilers with the new lambda changes and see if there is a performance difference.

vsrana01 avatar May 21 '20 21:05 vsrana01

@vsrana01 and @ajkunen if the lambda args changes are needed for the vectorization stuff, then it may be a good idea to figure out a good way to have both variants (with and without the 'Segs' business) for the non-vector variants. My main concern is that we want to make sure both versions of each kernel perform the same for each compiler. If not, then this is a good place for vendors to mine for why they are not. But, let's not do that now.

I suggest only making additions you need to support the vector variants and leave all non-vector variants as is for now. Does that make sense?

rhornung67 avatar May 21 '20 21:05 rhornung67

@rhornung67 I think the current RAJA develop branch imposes the Lambda requirements, which means the use of the new Lambda notation is necessary. I think if @vsrana01 does a PR for RAJAPerf with just the Lambda changes (and updated RAJA) we can see if there is a performance difference there. After that's complete, the vector_exec work will be more narrowly scoped, and we can test that performance separately.

But we can't do the vector_exec stuff now without the Lambda stuff.

ajkunen avatar May 21 '20 21:05 ajkunen

@ajkunen and @rhornung67 when I build with Adam's vectorization branch of RAJA I get compiler errors due to the new Lambda requirements. I will start looking at how the new Lambda requirements change the kernels' performance and create a new PR.

vsrana01 avatar May 21 '20 22:05 vsrana01

@vsrana01 and @ajkunen OK. I misunderstood the constraints.

I think it would be best to do a PR with a new variant added (RAJA_Seq_Args) so we can assess performance. Then, move on from there. Agree?

rhornung67 avatar May 21 '20 22:05 rhornung67

@rhornung67 @ajkunen agreed.

vsrana01 avatar May 22 '20 00:05 vsrana01
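
The comparison agreed on in this thread (a baseline kernel next to a variant whose lambda is handed its index through an explicit argument list) could be sketched roughly as below for the stream ADD kernel, `c[i] = a[i] + b[i]`. This is plain C++ and does not use RAJA. `Range`, `forall_plain`, `forall_args`, `stream_add_plain`, `stream_add_args`, and `time_seconds` are hypothetical names made up for illustration and are not RAJA or RAJAPerf APIs. The point is only that both variants should produce the same result, so that any timing difference can be attributed to the dispatch style.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an index segment [begin, end); not a RAJA type.
struct Range {
  std::size_t begin;
  std::size_t end;
};

// Baseline variant: the loop body receives the loop index directly.
template <typename Body>
void forall_plain(Range r, Body&& body) {
  for (std::size_t i = r.begin; i < r.end; ++i) {
    body(i);
  }
}

// "Args" variant (illustrative only): the caller names, via SegId, which
// segment in a tuple of segments supplies the index passed to the body.
template <std::size_t SegId, typename Segments, typename Body>
void forall_args(const Segments& segs, Body&& body) {
  const Range r = std::get<SegId>(segs);
  for (std::size_t i = r.begin; i < r.end; ++i) {
    body(i);
  }
}

// Stream ADD written against the baseline driver.
std::vector<double> stream_add_plain(const std::vector<double>& a,
                                     const std::vector<double>& b) {
  assert(a.size() == b.size());
  std::vector<double> c(a.size());
  forall_plain(Range{0, a.size()},
               [&](std::size_t i) { c[i] = a[i] + b[i]; });
  return c;
}

// Stream ADD written against the explicit-argument driver.
std::vector<double> stream_add_args(const std::vector<double>& a,
                                    const std::vector<double>& b) {
  assert(a.size() == b.size());
  std::vector<double> c(a.size());
  const auto segs = std::make_tuple(Range{0, a.size()});
  forall_args<0>(segs, [&](std::size_t i) { c[i] = a[i] + b[i]; });
  return c;
}

// Wall-clock time of one call to fn, in seconds.
template <typename Fn>
double time_seconds(Fn&& fn) {
  const auto t0 = std::chrono::steady_clock::now();
  fn();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(t1 - t0).count();
}
```

A comparison would call both functions on the same inputs, check that the outputs match, and then wrap each call in `time_seconds` over many repetitions per compiler. A single timed call on a small array is too noisy to draw conclusions from.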