store icon indicating copy to clipboard operation
store copied to clipboard

LazyReferenceManager CPU Usage high

Open Bios-Marcel opened this issue 1 year ago • 19 comments

Environment Details

  • EclipseStore Version: 2.0.0
  • JDK version: 21
  • OS: WIndows 11

Describe the bug

When using the default settings for the LazyReferenceManager, it exhausts one CPU core completely.

According to the defaults in the code, the maximum activity (made sure by time budgeting), should be 0,1%.

Reason

It seems that the actual cycles aren't taking that long. Instead the issue seems to stem from calling updateStatistics(). This seems to totally blow the budget. I have confirmed this by copying the LazyReferenceManager implementation and only removing the updateStatistics() method call. CPU usage instantly went to what the docs for the default settings describe.

How to fix it

Given how simple the code looks (its just a loop and some simple ifs basically), I don't think there is much room for optimisation here. Might it be possible to have the cleanup routine track this progress live and make the monitor stateful? This way monitoring would run within the budgeting and won't have to iterate over everything.

Alternatively, I was thinking that it should be possible to disable the statistics altogether if they aren't being accessed anyway.

Bios-Marcel avatar Nov 25 '24 10:11 Bios-Marcel

Could you provide more information? I have launched the big load test but did not observe this behavior.

Could you also share a simple, reproducible example? Additionally, could you include screenshots showing how you detect this behavior on Windows 11?

Here is a screenshot from my performance tests: image

zdenek-jonas avatar Nov 25 '24 13:11 zdenek-jonas

Since I can't share my code, I'd need some time to provide an example here.

The database is roughly 16MB in size and contains only a couple of lazy objects. There are in total 12 known lazy reference instances.

Either way, the effect is visible in both Task Manager and VisualVM.

Task Manager:

image

VisualVM:

CPU Time equals time, meaning the thread is constantly doing compute: image image

The percentage differences between task manager and VisualVM are probably due to how Performance and Efficiency cores are treated.

But yeah, it doesn't really make a lot of sense that it takes this long for mere 12 instances. Does the call to monitor.update trigger any sort of eventing or so? Maybe there's bytecode magic that I don't see by looking at the monitor code. (or I am blind :D)

Bios-Marcel avatar Nov 25 '24 14:11 Bios-Marcel

could you the screenshot of this page from visualvm: image

zdenek-jonas avatar Nov 25 '24 14:11 zdenek-jonas

image image

Would've sent the latter earlier, but VisualVM didn't pick up on the details :D

Bios-Marcel avatar Nov 25 '24 14:11 Bios-Marcel

Normally looked like this:

image

Maybe the NPE stuff was due to using debug mode earlier.

Bios-Marcel avatar Nov 25 '24 14:11 Bios-Marcel

Are you store the data in local disc or some other target?

zdenek-jonas avatar Nov 25 '24 14:11 zdenek-jonas

Please, calll jstack on this process and send me the thread with LazyReferenceManager...

zdenek-jonas avatar Nov 25 '24 14:11 zdenek-jonas

We store inside an H2 database (local disk, no network IO), but I've confirmed this happens when storing to disk directly (without an SQL provider inbetween).

Created with -e -l: jstack.txt

Bios-Marcel avatar Nov 25 '24 14:11 Bios-Marcel

Do you have a change timing on LazyReferenceManager or is all values defaults? Something like this: https://docs.eclipsestore.io/manual/storage/loading-data/lazy-loading/clearing-lazy-references.html

zdenek-jonas avatar Nov 25 '24 15:11 zdenek-jonas

It's all default, I haven't touched anything.

Bios-Marcel avatar Nov 25 '24 15:11 Bios-Marcel

Could you please provide the exact JDK version using java -v?

I currently have no further ideas, but I will try testing on a Windows 11 PC.

zdenek-jonas avatar Nov 25 '24 15:11 zdenek-jonas

PS C:\Users\Schramm> java --version
openjdk 21.0.2 2024-01-16
OpenJDK Runtime Environment (build 21.0.2+13-58)
OpenJDK 64-Bit Server VM (build 21.0.2+13-58, mixed mode, sharing)

Bios-Marcel avatar Nov 25 '24 15:11 Bios-Marcel

Could you please try updating the JDK to version 21.0.5, if possible?

zdenek-jonas avatar Nov 25 '24 16:11 zdenek-jonas

Makes no difference either.

Bios-Marcel avatar Nov 26 '24 15:11 Bios-Marcel

Ok, I cannot simulate it on any of my projects. Please provide some code, and I will try to identify the problems.

zdenek-jonas avatar Nov 26 '24 15:11 zdenek-jonas

I'll provide a comprehensive example once I find the time.

Bios-Marcel avatar Nov 29 '24 11:11 Bios-Marcel

Is it possible that this is exactly the same as what I experienced on https://github.com/eclipse-serializer/serializer/issues/122? I've never been able to reproduce the scenario and haven't had time to test further in the last few months.

But the last thing I did was to find the cause, even if I didn't find how to get into this scenario. But maybe it will help. I guess no one noticed my comments because the issue was already closed.

Could be a simple fix and I would be willing to contribute that simple null check, but till now I am unsure if this really is the appropriate solution. Or if this shouldn't occure but it does because there is an error elsewhere...

ZockiRR avatar Feb 04 '25 17:02 ZockiRR

I've already spent many hours testing in an attempt to simulate this behavior, but so far, without success.

What you can try when it happens:

  • Take multiple Thread Dumps (jstack) in succession and send them to us.
  • Use JFR to create a one-minute snapshot and send it to us, even privately if needed.

zdenek-jonas avatar Feb 04 '25 18:02 zdenek-jonas

I took multiple thread dumps and attached them. I currently got my project running with the reported behaviour. As offered, I have opened a PR, which I currently think will solve the issue. The possibily that the ref is null is checked at other places too, so it doesn't seem to be entirely wrong.

threaddumps.zip

Also it stopped after some time. That is something I didn't observerd till now:

Image

ZockiRR avatar Feb 04 '25 19:02 ZockiRR

I guess this can be closed, given the PR was already merged :)

Bios-Marcel avatar Sep 11 '25 09:09 Bios-Marcel