Enqueueing (recovery aware) publishing

Open michaelklishin opened this issue 12 years ago • 13 comments

Currently Bunny::Exchange#publish doesn't try to ensure the message is sent when the network connection is down. We should investigate enqueueing messages locally (possibly with a write-ahead log, WAL) and sending them out only when it's safe to do so. Batching can be an option, too.

This is not a feature everybody needs, and it does not replace publisher confirms, but it can make it easier for people who cannot afford to lose a single message to build publishers with a significantly lower probability of message loss.

Some open questions:

  • If we choose to implement a WAL, what on-disk format should we use?
  • How do we not degrade throughput by a significant amount?
  • How do we synchronize recovery and this publisher? Will it be running in a separate thread?
  • Is an in-memory version practical/interesting? Apps are increasingly deployed into environments (e.g. PaaS) where local disk usage is limited, discouraged or even not an option. Plus memory is a finite resource: do we reject publishes when the buffer is full, keep the most recent N messages or do something else?
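
To make the last question concrete, here is a minimal sketch of a bounded in-memory buffer that supports both policies mentioned above: rejecting publishes when full, or keeping only the most recent N messages. The class name `BoundedPublishBuffer`, the `overflow:` option and its values are all hypothetical; nothing like this exists in Bunny today.

```ruby
# Hypothetical sketch, not part of Bunny: a thread-safe buffer that holds
# at most `capacity` messages while the connection is unavailable.
class BoundedPublishBuffer
  class Full < StandardError; end

  POLICIES = %i[reject drop_oldest].freeze

  def initialize(capacity:, overflow: :reject)
    raise ArgumentError, "capacity must be positive" unless capacity.positive?
    raise ArgumentError, "unknown overflow policy" unless POLICIES.include?(overflow)

    @capacity = capacity
    @overflow = overflow
    @messages = []
    @lock = Mutex.new
  end

  # Buffers a message. When the buffer is full, either raises Full
  # (:reject) or discards the oldest message (:drop_oldest).
  def push(payload, opts = {})
    @lock.synchronize do
      if @messages.size >= @capacity
        raise Full, "buffer already holds #{@capacity} messages" if @overflow == :reject

        @messages.shift
      end
      @messages << [payload, opts]
    end
    self
  end

  # Yields buffered messages oldest first. A message is removed only after
  # the block returns, so an exception leaves it (and the rest) buffered.
  def drain
    loop do
      entry = @lock.synchronize { @messages.first }
      break if entry.nil?

      yield(*entry)
      @lock.synchronize { @messages.shift if @messages.first.equal?(entry) }
    end
  end

  def size
    @lock.synchronize { @messages.size }
  end
end
```

A recovery hook could call `drain { |payload, opts| exchange.publish(payload, opts) }` once the connection is back. Whether that runs in a separate thread, and how it is ordered relative to topology recovery, is exactly the synchronization question listed above and is left open here.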

michaelklishin avatar Jan 23 '14 07:01 michaelklishin

I don't think we should write to a file on disk; otherwise people who deploy their apps on Heroku won't be able to use it. It should write to a database or some remote location instead.

trungpham avatar Sep 17 '14 20:09 trungpham

@trungpham WAL will be optional. Writing to a database largely has the same issues as not having a WAL in the first place.

michaelklishin avatar Sep 18 '14 03:09 michaelklishin

It could still be helpful on Heroku: if it's only an intermittent network issue and the client is able to reconnect before the dyno is restarted, the messages on disk could be republished once the connection is established again.

carlhoerberg avatar Sep 19 '14 13:09 carlhoerberg

The problem is that having database adapters is way out of scope for Bunny. A common interface for WAL stores is the only way to go; with it, you can plug in whatever storage you want.
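
As a rough illustration of what such a common interface could look like, the sketch below defines a three-method contract (`append`, `ack`, `each_pending`) with an in-memory store and a file-backed store behind it. Every name here is hypothetical, and the JSON Lines on-disk format is only an assumption made for the example; the format question from the issue description remains open.

```ruby
require "json"

# Hypothetical store contract (none of these names exist in Bunny):
#   append(payload) -> id                 record a message before sending
#   ack(id)                               mark a message as safely sent
#   each_pending { |id, payload| ... }    iterate unsent messages
class MemoryStore
  def initialize
    @next_id = 0
    @pending = {}
  end

  def append(payload)
    id = (@next_id += 1)
    @pending[id] = payload
    id
  end

  def ack(id)
    @pending.delete(id)
    nil
  end

  def each_pending(&block)
    @pending.dup.each(&block)
  end
end

# Append-only log: one JSON object per line, payload Base64-encoded so
# binary message bodies survive the round trip.
class FileStore
  def initialize(path)
    @path = path
    @next_id = 0
    each_record { |rec| @next_id = [@next_id, rec["id"]].max }
  end

  def append(payload)
    id = (@next_id += 1)
    write({ "op" => "put", "id" => id, "body" => [payload].pack("m0") })
    id
  end

  def ack(id)
    write({ "op" => "ack", "id" => id })
    nil
  end

  # Replays the log: a "put" adds a message, a later "ack" removes it.
  def each_pending(&block)
    pending = {}
    each_record do |rec|
      if rec["op"] == "put"
        pending[rec["id"]] = rec["body"].unpack1("m0")
      else
        pending.delete(rec["id"])
      end
    end
    pending.each(&block)
  end

  private

  def write(record)
    File.open(@path, "a") do |f|
      f.puts(JSON.generate(record))
      f.fsync # push the record to disk before reporting success
    end
  end

  def each_record
    return unless File.exist?(@path)

    File.foreach(@path) { |line| yield JSON.parse(line) }
  end
end
```

Anything that honors the same three methods (a database table, a remote service) could be swapped in without Bunny needing to know about it. The file store above never compacts its log and syncs on every write, so it says nothing useful about the throughput question.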

michaelklishin avatar Sep 19 '14 13:09 michaelklishin

To clarify, I did not advocate database persistence; I meant on-disk persistence, which is totally possible even on Heroku.

carlhoerberg avatar Sep 19 '14 13:09 carlhoerberg

Ah, interesting. Yeah, I had no idea. Does Heroku documentation cover this?

michaelklishin avatar Sep 19 '14 13:09 michaelklishin

@michaelklishin I vote for a common interface too. That will give us the flexibility of writing to the log file or other data store.

trungpham avatar Sep 19 '14 17:09 trungpham

@michaelklishin yes, here: https://devcenter.heroku.com/articles/dynos#ephemeral-filesystem

carlhoerberg avatar Sep 21 '14 22:09 carlhoerberg

Any update on this issue? We sometimes experience connection issues in our publishing applications (Rails + Unicorn). This means that not all messages are pushed, and there is no retry mechanism in place, so the message is lost forever. It's really a shame, since this prevents us from using Bunny all over the place.

A start would be to write messages to a file whenever a Timeout or Bunny connection error happens. Once the connection is back up, we can replay the file.
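
A minimal sketch of that idea, written as an application-level wrapper rather than a Bunny feature: `SpoolingPublisher` and its `retry_on:` option are hypothetical names, and `exchange` is assumed to be any object that responds to `publish(payload, opts)`. Which exceptions Bunny actually raises during an outage depends on the version and recovery settings, so the error list is configurable (an application would probably add classes such as `Bunny::ConnectionClosedError`).

```ruby
require "json"
require "timeout"

# Hypothetical wrapper, not part of Bunny: spools messages to a file when
# publishing fails and replays the file on request.
class SpoolingPublisher
  DEFAULT_RETRY_ON = [IOError, SystemCallError, Timeout::Error].freeze

  def initialize(exchange, spool_path, retry_on: DEFAULT_RETRY_ON)
    @exchange = exchange
    @spool_path = spool_path
    @retry_on = retry_on
  end

  # Returns true if the message was handed to the exchange,
  # false if it was written to the spool file instead.
  def publish(payload, opts = {})
    @exchange.publish(payload, opts)
    true
  rescue *@retry_on
    record = { "payload" => [payload].pack("m0"), "opts" => opts }
    File.open(@spool_path, "a") { |f| f.puts(JSON.generate(record)) }
    false
  end

  # Re-publishes spooled messages in order and returns how many were sent.
  # If a publish fails midway, the unsent tail stays in the spool file and
  # the error propagates to the caller.
  def replay
    return 0 unless File.exist?(@spool_path)

    lines = File.readlines(@spool_path, chomp: true)
    sent = 0
    begin
      lines.each do |line|
        record = JSON.parse(line)
        opts = record["opts"].transform_keys(&:to_sym)
        @exchange.publish(record["payload"].unpack1("m0"), opts)
        sent += 1
      end
    ensure
      remaining = lines.drop(sent)
      if remaining.empty?
        File.delete(@spool_path)
      else
        File.write(@spool_path, remaining.join("\n") + "\n")
      end
    end
    sent
  end
end
```

This only gives at-least-once behavior: a crash between a successful publish and the spool rewrite can produce duplicates, option values are reduced to what JSON can represent, and a publish that fails silently (without raising) is not caught at all.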

JanStevens avatar Jul 12 '16 09:07 JanStevens

@JanStevens no updates. We are working on something very similar in the Objective-C client first, will see how well it works there. That doesn't cover a WAL (writing to durable storage).

michaelklishin avatar Jul 12 '16 12:07 michaelklishin

We also have this problem. If for some reason the connection to RabbitMQ is lost, all messages sent while the connection is down are lost forever. Any update on this topic?

From what I understand using publisher confirms would solve this, right? Is it the recommended way to approach this problem?

rthouvenin avatar Dec 14 '17 16:12 rthouvenin

@rthouvenin there are no updates. This is not a support forum, please start with the docs.

michaelklishin avatar Dec 14 '17 17:12 michaelklishin

Publisher confirms per se do not do what's covered in this issue but they allow the developer to keep track of what messages need re-publishing.
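
To illustrate the bookkeeping this implies, here is a small sketch of a tracker that remembers each published message by its sequence number and sorts them into confirmed, nacked and still-unconfirmed as broker responses arrive. `ConfirmTracker` is a hypothetical name; the `handle(delivery_tag, multiple, nack)` signature is chosen on the assumption that it matches the arguments Bunny passes to a callback given to `Bunny::Channel#confirm_select`, which is worth checking against the version in use.

```ruby
# Hypothetical helper, not part of Bunny: tracks which published messages
# the broker has not yet confirmed.
class ConfirmTracker
  def initialize
    @outstanding = {} # sequence number => message, in publish order
    @nacked = []
    @lock = Mutex.new
  end

  # Call right before publishing, with the sequence number the broker will
  # use for this message (assumed to come from the channel, e.g.
  # channel.next_publish_seq_no in Bunny).
  def track(delivery_tag, message)
    @lock.synchronize { @outstanding[delivery_tag] = message }
  end

  # Handles one ack or nack. With multiple = true the response covers
  # every outstanding message up to and including delivery_tag.
  def handle(delivery_tag, multiple, nack)
    @lock.synchronize do
      tags = if multiple
               @outstanding.keys.select { |tag| tag <= delivery_tag }
             else
               [delivery_tag]
             end
      tags.each do |tag|
        next unless @outstanding.key?(tag)

        message = @outstanding.delete(tag)
        @nacked << message if nack
      end
    end
  end

  # Messages the broker explicitly refused.
  def nacked
    @lock.synchronize { @nacked.dup }
  end

  # Messages with no response yet (for example, the connection dropped).
  def unconfirmed
    @lock.synchronize { @outstanding.values }
  end

  # Everything a publisher would have to send again.
  def needs_republishing
    @lock.synchronize { @nacked + @outstanding.values }
  end
end
```

After a connection failure, `needs_republishing` is the list the application has to send again; the tracker does not re-publish anything itself, which is the gap this issue is about.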

michaelklishin avatar Dec 14 '17 17:12 michaelklishin