"second 'channel.open' seen" error when recovering from consumer crash

Open corben2 opened this issue 1 year ago • 2 comments

When a consumer is actively consuming messages, crashes, and then restarts, the following errors seem to occur:

** (stop) :unexpected_delivery_and_no_default_consumer
Last message: {:consumer_call, {:"basic.deliver", "amq.ctag-GdH1Icq6n6jamBqFqBKZdg", 31, false, "reconnect", "TraceId.#"}, {:amqp_msg, {:P_basic, :undefined, :undefined, :undefined, 2, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined}, "spam"}}

14:13:05.066 [info] AMQP channel is gone (sub_chan). Reopening...
2024-07-30T14:13:05.066309+00:00 'Elixir.AMQP.Application.Channel':handle_info/2:153 <0.969.0> [info] AMQP channel is gone (sub_chan). Reopening...

14:13:05.066 [info] starting SelectiveConsumer
2024-07-30 14:13:05.067559+00:00 [error] <0.874.0> Error on AMQP connection <0.874.0> (10.89.1.70:50544 -> 10.89.1.69:5672, vhost: '/', user: 'guest', state: running), channel 1:
2024-07-30 14:13:05.067559+00:00 [error] <0.874.0>  operation channel.open caused a connection exception channel_error: "second 'channel.open' seen"
2024-07-30T14:13:05.068800+00:00 : <0.981.0> [warning] Connection (<0.981.0>) closing: received hard error {'connection.close',504,<<"CHANNEL_ERROR - second 'channel.open' seen">>,20,10} from server

The unexpected_delivery_and_no_default_consumer is expected, I think, but the "second 'channel.open' seen" is not. This causes the connection to close, which breaks all other channels using that connection.

corben2, Jul 30 '24 21:07

Interesting. Is the queue defined as exclusive?

I guess this is what is happening...

  • the channel process is gone on the client side, but the channel is still considered open on the server side
  • when amqp tries to open a new channel for the queue, the server returns the "second 'channel.open' seen" error because it thinks that channel is already open
  • the server then raises a connection error, and amqp_client (the Erlang library) fails

Can you open separate connections for the other queues, so that a failure on this connection doesn't affect the other open channels?
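For illustration, a minimal sketch of what separate connections could look like with the library's application config. The names `:sub_conn`, `:other_conn` and `:other_chan`, and the URL, are assumptions made up for this example; only `sub_chan` appears in the log above.

```elixir
# config/config.exs
import Config

# Assumption: one connection per group of channels, so that a hard error
# on :sub_conn does not take down channels that live on :other_conn.
config :amqp,
  connections: [
    sub_conn: [url: "amqp://guest:guest@localhost:5672"],
    other_conn: [url: "amqp://guest:guest@localhost:5672"]
  ],
  channels: [
    sub_chan: [connection: :sub_conn],
    other_chan: [connection: :other_conn]
  ]
```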

ono, Sep 09 '24 19:09

We've worked around this by wrapping all of our message consumption in a try-catch. Otherwise, even if the consumer came back properly, I think it would just repeatedly crash on the same message.
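Roughly, the workaround looks like the sketch below. `SafeConsume` and `safe_handle/2` are hypothetical names for illustration, not part of the amqp library, and this is a simplified version rather than our exact code.

```elixir
defmodule SafeConsume do
  @moduledoc "Hypothetical helper for illustration; not part of the amqp library."

  # Runs `handler` on `payload` and turns any raise, throw, or exit into an
  # {:error, reason} tuple, so the consumer process itself does not crash.
  def safe_handle(payload, handler) when is_function(handler, 1) do
    try do
      {:ok, handler.(payload)}
    rescue
      # Exceptions raised with raise/1 (or runtime errors) land here.
      exception -> {:error, {:raised, exception}}
    catch
      # throw/1 and exit/1 land here; kind is :throw or :exit.
      kind, value -> {:error, {kind, value}}
    end
  end
end
```

In the consumer's delivery callback, an `{:ok, _}` result would then be followed by `AMQP.Basic.ack/2`, and an `{:error, _}` result by something like `AMQP.Basic.reject/3` with `requeue: false`, so the same message is not redelivered over and over.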

The queue is not exclusive.

I believe I originally tried opening other connections (to the same server, however) for the other channels, and it still had the same behavior. I can retry that and get back to you, but like I said above - even if this is fixed - I think there might still be other issues, so maybe we can just close this issue.

corben2, Sep 18 '24 22:09