
chain RPC restarts cause CPU utilisation problem

Open significance opened this issue 2 years ago • 3 comments

⚠️ Support requests in an issue-format will be closed immediately. For support, go to Swarm's Discord.

Context

@mfw78 reported

Summary

Found one bee node was at 50% CPU utilisation for some bizarre reason. All the nodes had restarted like 2 days ago when I upgraded Nethermind. Some were stuck at super low depths, just smashing themselves.

significance avatar Apr 08 '23 07:04 significance

It's worth noting in particular that this is what happens in a resource-constrained environment, i.e. the "starting inertia" in bee is extraordinarily high. But as the system stabilises, its inertia drops to near 0.

significance avatar Apr 08 '23 07:04 significance

It would be great to have some logs and metrics from when this sort of thing happens.

istae avatar May 08 '23 11:05 istae

I see this a LOT on my massively pinning (1+ TB) upload node. Each instance of the pullsync cursorHandler consumes a copious amount of CPU, and all new peers hit the newly started node at once. I first put in a hack to limit the cursors handler to a single concurrent instance, but later raised that limit to 1024 because I now also wrap the pullsync/cursors database routine in a singleflight, so that multiple overlapping requests are served by a single loop over the bins.

I see similar cursorHandler loads on my smaller nodes, but they clear quickly enough to not cause an impact. But if a node has a large local pinned database, it's pretty ugly.

ldeffenb avatar May 08 '23 14:05 ldeffenb
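
For readers unfamiliar with the pattern, the workaround ldeffenb describes (a concurrency cap on the cursors handler plus a singleflight around the expensive scan) can be sketched roughly as below. This is not the actual patch and not bee's code: `cursorService`, `handleCursors`, `flightGroup` and the 32-bin scan are hypothetical names chosen for illustration, and `flightGroup` is a hand-rolled, stdlib-only stand-in for what `golang.org/x/sync/singleflight` provides.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is one in-flight scan that later callers can wait on.
type call struct {
	done chan struct{}
	val  []uint64
	err  error
}

// flightGroup is a simplified stand-in for a singleflight group.
// Simplification: if fn panics, waiters block forever.
type flightGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn once per key at a time; overlapping callers share its result.
func (g *flightGroup) Do(key string, fn func() ([]uint64, error)) ([]uint64, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		g.mu.Unlock()
		<-c.done // wait for the scan that is already running
		return c.val, c.err
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()

	g.mu.Lock()
	delete(g.calls, key) // later calls start a fresh scan; nothing is cached
	g.mu.Unlock()
	close(c.done)
	return c.val, c.err
}

// cursorService is a hypothetical wrapper around the expensive cursor scan.
type cursorService struct {
	sem    chan struct{} // buffered channel used as a concurrency cap
	flight flightGroup
	scans  int64 // number of times the scan actually ran
	scan   func() ([]uint64, error)
}

func newCursorService(limit int, scan func() ([]uint64, error)) *cursorService {
	return &cursorService{sem: make(chan struct{}, limit), scan: scan}
}

// handleCursors serves one peer request. The returned slice may be shared
// between callers, so it must be treated as read-only.
func (s *cursorService) handleCursors() ([]uint64, error) {
	s.sem <- struct{}{}        // block if too many handlers are active
	defer func() { <-s.sem }() // release the slot on return
	return s.flight.Do("cursors", func() ([]uint64, error) {
		atomic.AddInt64(&s.scans, 1)
		return s.scan()
	})
}

func main() {
	// Simulated slow loop over 32 bins (the bin count is an assumption here).
	slowScan := func() ([]uint64, error) {
		time.Sleep(50 * time.Millisecond)
		return make([]uint64, 32), nil
	}
	svc := newCursorService(1024, slowScan)

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ { // 100 peers hitting a freshly started node
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = svc.handleCursors()
		}()
	}
	wg.Wait()
	fmt.Printf("requests: 100, scans executed: %d\n", atomic.LoadInt64(&svc.scans))
}
```

With the singleflight in place, overlapping requests collapse into far fewer scans than requests, which is presumably why the concurrency cap could be raised from 1 to 1024: most handlers just wait on a shared result rather than each looping over the bins. Requests that do not overlap still trigger their own scan, since nothing is cached.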