DirectXShaderCompiler icon indicating copy to clipboard operation
DirectXShaderCompiler copied to clipboard

Reassociate: add global reassociation algorithm

Open lizhengxing opened this issue 1 year ago • 7 comments

This PR pulls the upstream change, Reassociate: add global reassociation algorithm (https://github.com/llvm/llvm-project/commit/b8a330c42ab43879119cd3a305756d28aefe9fe6), into DXC with miminal changes.

For the code below: foo = (a * b) * c bar = (a * d) * c

As the upstream change states, it can identify the a*c is a common factor and redundant.

This is part 1 of the fix for #6593.

lizhengxing avatar May 08 '24 01:05 lizhengxing

What is the performance impact of this change?

dmpots avatar May 08 '24 15:05 dmpots

What is the performance impact of this change?

I haven't run the perf tests. But I believe the impact's smaller than the one I got before. That's why I want to merge the 2 PRs together or merge them consecutively.

lizhengxing avatar May 08 '24 16:05 lizhengxing

What is the performance impact of this change?

I haven't run the perf tests. But I believe the impact's smaller than the one I got before. That's why I want to merge the 2 PRs together or merge them consecutively.

It would be good to get the perf numbers for this change to understand the impact of each part. Also, since it will be possible to run this change by itself with a flag it would be good to understand what the perf looks like.

dmpots avatar May 08 '24 17:05 dmpots

What is the performance impact of this change?

I haven't run the perf tests. But I believe the impact's smaller than the one I got before. That's why I want to merge the 2 PRs together or merge them consecutively.

It would be good to get the perf numbers for this change to understand the impact of each part. Also, since it will be possible to run this change by itself with a flag it would be good to understand what the perf looks like.

There's no flag for this upstream change. I thought to use a flag, but it needs to change the interface of Reassociate Pass to pass in the flag. I'm not sure if it's acceptable. FunctionPass *llvm::createReassociatePass() { return new Reassociate(); }

Anyway, I started to collect the perf number.

lizhengxing avatar May 08 '24 17:05 lizhengxing

What is the performance impact of this change?

I haven't run the perf tests. But I believe the impact's smaller than the one I got before. That's why I want to merge the 2 PRs together or merge them consecutively.

It would be good to get the perf numbers for this change to understand the impact of each part. Also, since it will be possible to run this change by itself with a flag it would be good to understand what the perf looks like.

There's no flag for this upstream change. I thought to use a flag, but it needs to change the interface of Reassociate Pass to pass in the flag. I'm not sure if it's acceptable. FunctionPass *llvm::createReassociatePass() { return new Reassociate(); }

Anyway, I started to collect the perf number.

I don't think we need a flag for this change, but I do think we need a sanity check on the perf numbers that it is an overall positive change. For the flag, I was referring to the flag you added in the other PR (was something like -disable-agressive-reassoc). That flag was not controlling this modification, so I was just asking that we understand the perf of this change in isolation.

dmpots avatar May 08 '24 18:05 dmpots

I'd like to see the perf data for this change.

Synced offline. Looks like a nice win overall in our test suite. Reduced ALU in ~40% of shaders (increased it in ~3%). A small number of shaders had occupancy impacts (~1%). There were an equal number of regressions and improvements.

Overall, this change looks to be positive.

dmpots avatar May 08 '24 23:05 dmpots

I approved the PR because I think this is good, but it would be nice to rebase this on main and put the tests with the other reassociate tests in LLVM.

Done. I rebased this PR on main branch and updated the tests.

lizhengxing avatar May 15 '24 17:05 lizhengxing