node-statsd-client

Not safe for running in a node cluster (like PM2)

Open blak3r2 opened this issue 9 years ago • 5 comments

Learned the hard way today that this will crash PM2.

There is a bug in Node related to UDP, and I believe this is the issue: https://github.com/nodejs/node-v0.x-archive/issues/9261

I do not know why this statsd client causes it but the node-statsd one does not. I wanted to post here before I forget in case it helps others.

blak3r2 avatar Aug 18 '16 20:08 blak3r2

Interesting. We've not seen this edge case with node cluster & UDP itself.

Sorry about that. You should look into using a different statsd client or not using cluster.

At Uber, we do not use cluster.

Raynos avatar Aug 18 '16 23:08 Raynos

I updated the README with a caveat, thanks for pointing this out.

Raynos avatar Aug 18 '16 23:08 Raynos

Just to clarify: there is a way to use UDP, as the node-statsd package on npm doesn't seem to cause this issue (at least not anymore). I will dig in deeper when I get time and report back.

Thanks for putting this buffered statsd library together. I definitely had fewer packets dropped when I was running it (unclustered).

blak3r2 avatar Aug 19 '16 00:08 blak3r2

@blak3r2 It looks like the issue is that node-statsd creates a single UDP socket, whereas uber-statsd-client can create multiple UDP sockets: it de-allocates the UDP socket on inactivity and re-allocates it lazily once needed (where inactivity is, by default, 1000 milliseconds of no UDP writes).
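To make the difference concrete, here is a minimal sketch of the lazy, ephemeral socket pattern described above, using only Node's built-in `dgram` module. It is not uber-statsd-client's actual code: the class `EphemeralUdpSender`, its methods, and the `socketsCreated` counter are hypothetical names chosen for illustration. Only the 1000 ms default comes from the description above.

```javascript
'use strict';
const dgram = require('dgram');

// Hypothetical illustration of a lazily allocated UDP socket that is
// closed after a quiet period. Not the library's real implementation.
class EphemeralUdpSender {
  constructor(host, port, idleTimeoutMs = 1000) {
    this.host = host;
    this.port = port;
    this.idleTimeoutMs = idleTimeoutMs;
    this.socket = null;      // allocated on the first send()
    this.idleTimer = null;
    this.socketsCreated = 0; // how many sockets this sender has allocated
  }

  _getSocket() {
    if (this.socket === null) {
      const sock = dgram.createSocket('udp4');
      // Drop a failed socket so the next send() allocates a fresh one.
      sock.on('error', () => {
        if (this.socket === sock) this.close();
      });
      sock.unref(); // the socket alone should not keep the process alive
      this.socket = sock;
      this.socketsCreated += 1;
    }
    return this.socket;
  }

  send(message) {
    const buf = Buffer.from(message);
    this._getSocket().send(buf, this.port, this.host);
    // Restart the idle countdown on every write.
    clearTimeout(this.idleTimer);
    this.idleTimer = setTimeout(() => this.close(), this.idleTimeoutMs);
    this.idleTimer.unref();
  }

  close() {
    clearTimeout(this.idleTimer);
    this.idleTimer = null;
    if (this.socket !== null) {
      const sock = this.socket;
      this.socket = null;
      try {
        sock.close();
      } catch (err) {
        // The socket was already closed; nothing left to release.
      }
    }
  }
}
```

With this shape, a process that writes in bursts separated by more than `idleTimeoutMs` allocates a new socket per burst, so `socketsCreated` keeps growing. A client that holds one socket for its whole lifetime never does.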

@blak3r2 I suspect that if you use a module like process-reporter ( https://github.com/rf/process-reporter ), which continuously sends stats on an interval, you can "avoid" the issue.
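The idea behind that workaround can be sketched without any particular module: emit some stat on a timer that fires more often than the idle timeout, so the socket never goes quiet. The helper `startHeartbeat`, the `send` callback, and the metric name `heartbeat` below are hypothetical; this is not process-reporter's API.

```javascript
'use strict';

// Hypothetical helper: call `send` on a fixed interval so that a client
// with an idle timeout never sees a period without writes.
function startHeartbeat(send, intervalMs) {
  const timer = setInterval(() => send('heartbeat:1|c'), intervalMs);
  timer.unref(); // the heartbeat alone should not keep the process alive
  // Return a function that stops the heartbeat.
  return () => clearInterval(timer);
}
```

For this to help, `intervalMs` would need to be comfortably below the client's idle timeout, for example 500 ms against the 1000 ms default.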

You could also tweak socket_timeout ( https://github.com/uber/node-statsd-client#optionssocket_timeout ); setting it to 10 minutes or 10 hours should avoid the multiple UDP sockets issue.
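As a rough sketch, the options object for a 10 minute timeout might look like the following. It assumes `socket_timeout` is given in milliseconds, like the 1000 ms default mentioned above. The `host` and `port` values are placeholders, and those option names should be checked against the README.

```javascript
'use strict';

// Assumption: socket_timeout is expressed in milliseconds, matching the
// 1000 ms default described above. Verify against the README.
const TEN_MINUTES_MS = 10 * 60 * 1000;

const statsdOptions = {
  host: 'statsd.example.com', // placeholder host
  port: 8125,                 // conventional StatsD UDP port
  socket_timeout: TEN_MINUTES_MS
};
```

This object would then be passed to the client constructor as documented in the README.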

Raynos avatar Aug 19 '16 05:08 Raynos

Hi @raynos, I just wanted to thank you for your reply!

I am not sure that is quite it, because we would have been sending a flood of packets. I am going to set up some further tests this week and will report back.

blak3r2 avatar Aug 22 '16 21:08 blak3r2