WIP: renderer: micro-optimize CPU culling and other things

Open illwieckz opened this issue 1 year ago • 3 comments

This code is a hot spot, so we should avoid computing useless things when we can return early, and use bitwise operations instead of relying on branch prediction being right.
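To illustrate the idea, here is a minimal sketch of a box-versus-plane side test that builds its result from comparison bits, plus a culling loop that returns early. It is not the code from this branch: `Plane`, `BoxSideSketch` and `CullBoxSketch` are hypothetical names made up for the example, and whether the compiler really emits branch-free code depends on the compiler and its flags.

```cpp
#include <cmath>

// Hypothetical minimal plane type for illustration, not the engine's struct.
struct Plane {
    float normal[3];
    float dist;
};

// Returns a bitmask: bit 0 (1) is set when the box reaches the front side,
// bit 1 (2) is set when it reaches the back side, so 3 means it straddles.
inline int BoxSideSketch(const float mins[3], const float maxs[3], const Plane& p)
{
    // Project the box center and half-extents onto the plane normal.
    float center = 0.0f;
    float radius = 0.0f;
    for (int i = 0; i < 3; i++) {
        center += p.normal[i] * (mins[i] + maxs[i]) * 0.5f;
        radius += std::fabs(p.normal[i]) * (maxs[i] - mins[i]) * 0.5f;
    }
    const float d = center - p.dist;

    // Each comparison yields 0 or 1; combine them with a shift and an OR
    // instead of an if/else chain.
    return int(d + radius >= 0.0f) | (int(d - radius < 0.0f) << 1);
}

// Returns true when the box is entirely behind at least one plane.
inline bool CullBoxSketch(const float mins[3], const float maxs[3],
                          const Plane* planes, int count)
{
    for (int i = 0; i < count; i++) {
        if (BoxSideSketch(mins, maxs, planes[i]) == 2) {
            return true; // early return: the remaining planes are not evaluated
        }
    }
    return false;
}
```

With a plane `z = 0` facing `+z`, a box spanning `z` in `[1, 2]` gives 1, a box in `[-2, -1]` gives 2, and a box in `[-1, 1]` gives 3. The early return in `CullBoxSketch` is the only data-dependent branch left in the loop.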

The commits are meant to be squashed; they are just small steps I made one by one to check that I was not breaking anything.

While I'm at it, I'm also making minor improvements here and there in tr_world.cpp.

illwieckz avatar May 10 '24 10:05 illwieckz

Before:

20240510-123313-000 unvanquished-orbit

After:

20240510-124016-000 unvanquished-orbit

The engine now spends 7.5% of the time in RE_RenderScene instead of 9%.

BoxOnPlaneSide is also used in other functions (outside of the screenshot).

Edit: This is a release-like RelWithDebInfo build with LTO enabled in both cases.

illwieckz avatar May 10 '24 10:05 illwieckz

The screenshots were taken on top of #1043 because this branch was first written over it, and the code already looks a bit faster when #1043 is merged:

  • https://github.com/DaemonEngine/Daemon/pull/1043

The current PR brings an extra performance boost on top of it. The small performance boost from #1043 was not the purpose of that PR, while it is the purpose of this one, and it looks like both are useful for gaining CPU performance one small step at a time.

illwieckz avatar May 10 '24 12:05 illwieckz

I tested this branch rebased over the for-0.55.0/sync branch and I also see a speed-up.

Before:

20240512-191032-000 unvanquished-orbit

After:

20240512-191513-000 unvanquished-orbit

So I'll start cleaning things up and submit the patches for merging in the near future.

Edit: the percentage is higher than the ones recorded in the previous comments because this time I used a lower graphics preset, so less time is spent in other parts of the code.

illwieckz avatar May 12 '24 17:05 illwieckz