jackrabbit-oak

OAK-9880 : simplify rgc query

Open stefan-egli opened this issue 3 years ago • 3 comments

stefan-egli avatar Aug 04 '22 14:08 stefan-egli

one concern that just came to mind with this: obviously this would increase the number of queries executed - and that scales with the number of clusterNodeIds in use...

stefan-egli avatar Aug 04 '22 14:08 stefan-egli

Maybe to compensate for that, we could run RGC less frequently. It currently runs every 5 sec by default; changing that to every 1 min might not have too much of an impact.

stefan-egli avatar Aug 04 '22 14:08 stefan-egli

> one concern that just came to mind with this: obviously this would increase the number of queries executed - and that scales with the number of clusterNodeIds in use...

We could address this to some degree by remembering successful operations on minMaxRevTimeInSecs per clusterId. An inactive cluster node will not have its sweep revision updated and minMaxRevTimeInSecs will not change. This would limit operations to the number of active clusterIds.
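A minimal sketch of that idea, assuming a small in-memory helper. The class `IdleClusterFilter` and its method names are hypothetical and are not taken from the Oak code base; only `minMaxRevTimeInSecs` and the per-clusterId bookkeeping come from the discussion above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not Oak's actual implementation. */
class IdleClusterFilter {

    // clusterId -> minMaxRevTimeInSecs used by the last successful delete
    private final Map<Integer, Long> lastSuccessful = new HashMap<>();

    /**
     * Returns true if a delete should be issued for this clusterId, i.e. if
     * no successful run is remembered yet or the value has changed since.
     */
    boolean needsDelete(int clusterId, long minMaxRevTimeInSecs) {
        Long previous = lastSuccessful.get(clusterId);
        return previous == null || previous.longValue() != minMaxRevTimeInSecs;
    }

    /** Call only after the delete for this clusterId completed successfully. */
    void recordSuccess(int clusterId, long minMaxRevTimeInSecs) {
        lastSuccessful.put(clusterId, minMaxRevTimeInSecs);
    }
}
```

With this, an inactive clusterId whose sweep revision does not advance yields the same `minMaxRevTimeInSecs` on every cycle and is skipped after its first successful delete.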

> do less frequent RGC, currently it runs every 5sec by default - changing that to every 1min might not have too much of an impact

I would prefer to keep the interval small. This ensures RGC activity is stretched out more evenly and avoids spikes that impact an application. Every 5 seconds was an initial default set for this kind of continuous RGC. We can certainly discuss a change. Maybe in a separate issue?

mreutegg avatar Aug 05 '22 12:08 mreutegg

  • added logic to avoid calling deleteMany for idle clusterNodeIds - plus tests for this
  • added a test for the number of deleteMany calls, both for the above and in some of the existing tests

stefan-egli avatar Aug 24 '22 16:08 stefan-egli

@mreutegg, as I have modified this PR since your earlier review, I would appreciate a second (final?) review, thx!

stefan-egli avatar Sep 26 '22 14:09 stefan-egli