Add shutdown options for the client classes

Open feywind opened this issue 1 year ago • 14 comments

We are planning to add updates to the close() methods to allow for a more deterministic shutdown process. This will take care of issues like these:

  • https://github.com/googleapis/nodejs-pubsub/issues/1860
  • https://github.com/googleapis/nodejs-pubsub/issues/1856
  • https://github.com/googleapis/nodejs-pubsub/issues/1665
  • https://github.com/googleapis/nodejs-pubsub/issues/1648

The three options boil down to this:

  • Wait for everything to finish dequeuing
  • Bail immediately (and maybe return IDs of things that were pending)
  • Nack everything in the queue and wait for those to send

feywind avatar May 02 '24 19:05 feywind

Hey, any update on this? When is a proper close method planned?

modestaspruckus avatar Sep 27 '24 09:09 modestaspruckus

Hi @feywind, is there a time estimate for this? We need it in order to do a graceful shutdown. Thanks in advance.

mcasarrubios avatar Nov 14 '24 17:11 mcasarrubios

For us, the ideal solution would be the first option: "Wait for everything to finish dequeuing".

mcasarrubios avatar Nov 15 '24 12:11 mcasarrubios

@mcasarrubios Thanks for the input. Right now you can already do the basic "wait for dequeue" using these methods:

  • subscriber.close()
  • topic.flush()

The extra thing we want to add is allowing options for how to handle outstanding data (e.g. drop everything on the floor and quit immediately).
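
A minimal sketch of how those two calls might be combined at shutdown, for illustration only. The `Flushable` and `Closable` interfaces are stand-ins assumed here so the example does not depend on the library; they mirror the promise-returning `subscriber.close()` and `topic.flush()` calls listed above. `shutdownClients` is a hypothetical helper name, and the order (close first, then flush) is an assumption that can be swapped.

```typescript
// Stand-in shapes (assumptions): anything with a promise-returning
// close() or flush(), such as the subscriber and topic objects above.
interface Closable {
  close(): Promise<void>;
}
interface Flushable {
  flush(): Promise<void>;
}

// Hypothetical helper: close subscribers, then flush publishers.
// The order is an assumption made for this sketch.
async function shutdownClients(
  subscriptions: Closable[],
  topics: Flushable[],
): Promise<void> {
  // Wait until every close() promise has resolved.
  await Promise.all(subscriptions.map((s) => s.close()));
  // Then wait until every flush() promise has resolved.
  await Promise.all(topics.map((t) => t.flush()));
}
```

Awaiting the returned promises avoids guessing at a fixed delay between steps.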

feywind avatar Nov 15 '24 21:11 feywind

Hey @feywind, thank you for your response. Could you please share an example of how to do this? I can't get it to work.

This is the flow I am trying:

1.- Receive a SIGTERM
2.- Call topic.flush() for each topic
3.- Delay 1 second to wait until everything is flushed
4.- Call subscription.close() for each subscription
5.- Wait several seconds so the messages still being processed can finish and send their ACKs
6.- Continue with the shutdown

Everything works as expected until step 5. Once we close the subscriptions, no new messages come in for processing. The problem is that the messages still being processed fail when they try to ACK, with the following error message: "Error: INVALID : Subscriber closed\n at AckQueue.add..."
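
Since the error above shows ACKs being rejected after the subscription is closed, one possible approach is to wait for running handlers to finish before calling close(), instead of sleeping for a fixed time. The sketch below only shows that bookkeeping. `InFlightTracker`, `run` and `drain` are hypothetical names, not part of the library, and this does not stop new messages from arriving while waiting, which is the gap this issue is about.

```typescript
// Hypothetical helper: counts handlers that are still running so that
// shutdown code can wait for them before closing the subscription.
class InFlightTracker {
  private count = 0;
  private waiters: Array<() => void> = [];

  // Wrap a message handler so its start and finish are counted.
  async run<T>(work: () => Promise<T>): Promise<T> {
    this.count++;
    try {
      return await work();
    } finally {
      this.count--;
      if (this.count === 0) {
        // Wake everything currently waiting in drain().
        this.waiters.splice(0).forEach((wake) => wake());
      }
    }
  }

  // Resolves true once nothing is in flight, or false after timeoutMs.
  drain(timeoutMs: number): Promise<boolean> {
    if (this.count === 0) return Promise.resolve(true);
    return new Promise<boolean>((resolve) => {
      const timer = setTimeout(() => resolve(false), timeoutMs);
      this.waiters.push(() => {
        clearTimeout(timer);
        resolve(true);
      });
    });
  }
}
```

In this sketch, each message callback would run its processing and ACK inside `tracker.run(...)`, and the SIGTERM handler would await `tracker.drain(timeoutMs)` before closing anything.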

mcasarrubios avatar Nov 28 '24 17:11 mcasarrubios

What we'd really like to see is a way to gracefully shut down a subscriber client, without causing a large number of messages to expire and be redelivered. E.g., a way to stop incoming messages and wait for the complete inventory of messages to be drained. This was originally requested in this issue: https://github.com/googleapis/nodejs-pubsub/issues/725.

At our company, we can no longer rely on expiration rates as an indicator of our subscribers' health, since normal activities like scale-downs and deployments cause subscribers (running, in our case, in Kubernetes pods) to shut down, causing a spike in expired messages each time. This obscures our ability to see when messages are expiring due to an unwanted cause.

WesCossick avatar Jan 15 '25 14:01 WesCossick

There's an end in sight for this problem. 😹 The tentative intent is something like the following:

Option 1: Close the subscriber stream, nack everything not sent to a callback, wait for a specified timeout for callbacks and ack/nack queues to empty

Option 2: Just quit immediately, as quickly as possible
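
To make the two options concrete, here is a toy model of their behavior. It is a sketch under stated assumptions, not the planned API: `ShutdownMode`, `QueuedMessage`, `ShutdownResult` and `shutdown` are all hypothetical names, and the queue and callbacks are plain in-memory stand-ins.

```typescript
// Toy model only: these types are assumptions, not the library's API.
type ShutdownMode = "nackAndWait" | "immediate";

interface QueuedMessage {
  id: string;
  nack(): void;
}

interface ShutdownResult {
  nacked: string[]; // IDs of queued messages that were nacked
  drained: boolean; // true if every running callback finished in time
}

async function shutdown(
  mode: ShutdownMode,
  queued: QueuedMessage[], // received but not yet sent to a callback
  inFlight: Promise<unknown>[], // callbacks that are already running
  timeoutMs: number,
): Promise<ShutdownResult> {
  if (mode === "immediate") {
    // Option 2: return right away without nacking or waiting.
    return { nacked: [], drained: inFlight.length === 0 };
  }
  // Option 1: nack everything that never reached a callback...
  const nacked = queued.splice(0).map((m) => {
    m.nack();
    return m.id;
  });
  // ...then give running callbacks up to timeoutMs to settle.
  const drained = await new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    // A failed callback still counts as finished for draining purposes.
    Promise.all(inFlight.map((p) => p.catch(() => undefined))).then(() => {
      clearTimeout(timer);
      resolve(true);
    });
  });
  return { nacked, drained };
}
```

A caller could check `drained` to decide whether to log the callbacks that were cut off by the timeout.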

This does preclude the "deliver everything in queues to callbacks" option, but I'm guessing nacking would at least make sure that those messages get delivered to another subscriber in a reasonable timeframe.

Is that something that would solve the problems for you?

feywind avatar Apr 30 '25 16:04 feywind

From the information you've shared, option 1 seems like it would be more in line with a graceful shutdown than option 2 (although there's not much detail for that one). The key is that the inventory of messages should be drained gracefully instead of simply letting them expire.

WesCossick avatar Apr 30 '25 19:04 WesCossick

Yeah, I think we're on the same page there. Option 2 is basically "I need my service to exit ASAP", the other is the graceful shutdown path.

feywind avatar May 02 '25 19:05 feywind