engine RenderPassSsao: improve SSAO blur performance

This PR suggests splitting one large blur pass from "RenderPass based SSAO" into four smaller blur passes without compromising quality.

Main	PR

Tested on MacBook 14" Pro (2021), used device.maxPixelRatio = window.devicePixelRatio; to increase resolution of framebuffers.

I confirm I have read the contributing guidelines and signed the Contributor License Agreement.

Jun 10 '24 21:06 querielo

Hi @querielo - that's definitely a good way to speed it up. But I wonder why doing 4 passes instead of just typical two separable passes?

Jun 11 '24 15:06 mvaligursky

Hi. @mvaligursky

The main idea is that the first two passes weaken the high-frequency signal. The next two passes are used to eliminate low-frequency signal (there are strided/interleaved blurs with a large step, applied diagonally).

Experimentally, it seems that increasing the filter kernel size is necessary to remove low-frequency SSAO patterns. For example, using a 17x17 kernel on my computer results in a noticeable slowdown, but it appears to achieve the effect of four passes.

The suggested four-pass approach is just one way to implement blurring. An advanced developer could add RenderPassDepthAwareBlur to afterPasses themselves, depending on the specific SSAO pattern generated.

4 passes	2 passes, 11x11	2 passes, 17x17

By the low-frequency signal, I mean the pattern you can see in the next image (kernelSize=11)

Jun 11 '24 19:06 querielo

I had a bit of a play with your branch. My findings:

two pass BOX blur with 9 samples seems to be equivalent to the current 9x9 filter, as expected. I assume this is faster.
two pass GAUSSIAN blur with 9 samples is a lot lower quality compared to that - see especially around the yellow torches
four pass you had set up seems to match in quality with the two pass BOX with 9 samples - so I'm not sure we need to use 4 passes.

I think the main reason you need 4 passes is the use of the GAUSSIAN weights instead of BOX.

Jun 12 '24 10:06 mvaligursky

And so my recommendation is: switch it to 2 pass BOX filtering. We can expose the number of taps to the user as that controls the quality vs the cost.

Jun 12 '24 10:06 mvaligursky

@mvaligursky I switched the blur to 2 pass BOX filtering.

Jun 12 '24 11:06 querielo

Offtop: It looks like 6f29e1b005f2ab2a38f75fb45739387561d1d50a breaks the background. Screenshot 2024-06-12 at 13 25 18

Jun 12 '24 11:06 querielo

Offtop: It looks like 6f29e1b breaks the background.

It just got brighter as a result of this https://github.com/playcanvas/engine/pull/6687 I suspect?

Jun 12 '24 11:06 mvaligursky

Looking much better, thanks!

I'd suggest to remove the option / API to change the blur type. Box looks better than Gaussian, and so the SSAO should just use Box. Only expose values that the user would benefit from adjusting.

Jun 17 '24 09:06 mvaligursky

I'd suggest to remove the option / API to change the blur type. Box looks better than Gaussian, and so the SSAO should just use Box. Only expose values that the user would benefit from adjusting.

@mvaligursky Are you certain? Stage: https://engine-m742xowmh-playcanvas.vercel.app/#/graphics/ambient-occlusion To me, it seems that enlarging the kernel size of Box displaces dark areas from their correct position more quickly than hiding SSAO artifacts. However, Gaussian necessitates a larger kernel to conceal the SSAO pattern.

https://github.com/playcanvas/engine/assets/104348270/fdf04009-01de-4ffb-8765-94e13c8a9a20

Jun 17 '24 11:06 querielo

@mvaligursky Are you certain? Stage: https://engine-m742xowmh-playcanvas.vercel.app/#/graphics/ambient-occlusion To me, it seems that enlarging the kernel size of Box displaces dark areas from their correct position more quickly than hiding SSAO artifacts. However, Gaussian necessitates a larger kernel to conceal the SSAO pattern.

yeah interesting. It almost feel like the the bilateral weight function should be improved

float bilateralWeight(in float depth, in float sampleDepth) {
    float diff = (sampleDepth - depth);
    return max(0.0, 1.0 - diff * diff);
}

to detect the depth discontinuity better, as currently it blurs over those small edges. Maybe we can try something like this to have a control over it by the sigma value (untested, would need more research / testing)

float bilateralWeight(in float depth, in float sampleDepth, in float sigma) {
    float diff = (sampleDepth - depth);
    return exp(-diff * diff / (2.0 * sigma * sigma));
}

but agreed, the Gaussian blur in general should be slightly better, at a higher cost, so maybe leave it in.

Jun 17 '24 11:06 mvaligursky

Hi @querielo - have you had some time to look at these?

Jun 24 '24 15:06 mvaligursky

@querielo - please let me know if you have time to finished this PR. I'm keen to continue on some improvements too (https://github.com/playcanvas/engine/issues/6658) but would prefer to avoid larger conflicts with the changes.

If you don't have time, I can take over this PR.

Jun 28 '24 12:06 mvaligursky

@querielo - please let me know if you have time to finished this PR. I'm keen to continue on some improvements too (#6658) but would prefer to avoid larger conflicts with the changes.

If you don't have time, I can take over this PR.

guess thats a no...

Jul 19 '24 13:07 MAG-AdrianMeredith

yep, it's on my list to get to in a week or so. Definitely not forgotten.

Jul 19 '24 13:07 mvaligursky

@mvaligursky Yes, you can take it. Sorry about that. I'm too busy right now and don't have time to fix it.

Jul 21 '24 20:07 querielo

closing this due to https://github.com/playcanvas/engine/pull/6870 Thanks @querielo

Aug 02 '24 12:08 mvaligursky