graylog2-server

Detect if migrations have been executed

Open thll opened this issue 2 years ago • 2 comments

A Graylog cluster only works correctly if all MongoDB migrations have been executed. There are currently two ways in which migrations are run:

  1. The leader node of a cluster executes migrations in a dedicated startup phase.
  2. Migrations are explicitly triggered by running the graylog migrate command prior to starting any node in the cluster.

Starting a node before migrations have been run is potentially dangerous. This applies both to a pristine Graylog setup and, more importantly, to a Graylog cluster after a version upgrade.

There are a couple of scenarios in which this could happen:

  1. A non-leader node is started as the first node.
  2. Automatic execution of migrations has been disabled (run_migrations=false), but the graylog migrate command has not been executed prior to starting a node.

We should detect this situation and handle it properly, e.g. by delaying node startup until migrations have been executed by a different node or by the migration command. Alternatively, we might abort node startup altogether.
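To make the two handling options concrete, here is a minimal sketch of a startup gate that polls until migrations are reported complete and gives up after a timeout. Everything in it is an assumption for illustration: `StartupGate`, `awaitMigrations`, and the `migrationsComplete` supplier are hypothetical names, not existing Graylog classes, and the supplier stands in for whatever MongoDB-backed check is eventually chosen.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names exist in the Graylog code base.
final class StartupGate {
    enum Outcome { PROCEED, ABORT }

    /**
     * Polls the given check until it reports that migrations are complete.
     * Returns PROCEED as soon as the check succeeds, or ABORT once the
     * timeout has elapsed or the waiting thread is interrupted.
     */
    static Outcome awaitMigrations(BooleanSupplier migrationsComplete,
                                   Duration timeout,
                                   Duration pollInterval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (migrationsComplete.getAsBoolean()) {
                return Outcome.PROCEED;
            }
            if (System.nanoTime() - deadline >= 0) {
                return Outcome.ABORT;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return Outcome.ABORT;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in check: pretend another node finishes migrating after ~200 ms.
        final long readyAt = System.nanoTime() + Duration.ofMillis(200).toNanos();
        Outcome outcome = awaitMigrations(
                () -> System.nanoTime() - readyAt >= 0,
                Duration.ofSeconds(2),
                Duration.ofMillis(50));
        System.out.println("Startup decision: " + outcome);
    }
}
```

Setting the timeout to zero turns the same gate into the "abort altogether" variant: the check runs once and the node either proceeds or stops.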

As a detection mechanism, we could keep track of the set of executed migrations in MongoDB. Alternatively, we could track the software version of the cluster in MongoDB and update it only after migrations have run successfully. There might be other options.
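The first detection option boils down to a set difference between the migrations a node's version expects and the migrations recorded as completed. The sketch below shows only that comparison, using in-memory sets. `MigrationCheck`, `pending`, and the migration names are made up for illustration; in a real implementation the completed set would presumably be loaded from a MongoDB collection, which is not shown here.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; class, method, and migration names are illustrative only.
final class MigrationCheck {
    /**
     * Returns the migrations that this node's version expects but that are
     * not recorded as completed. An empty result means startup is safe.
     */
    static Set<String> pending(Set<String> expected, Set<String> completed) {
        Set<String> missing = new TreeSet<>(expected); // sorted for stable log output
        missing.removeAll(completed);
        return missing;
    }

    public static void main(String[] args) {
        // Assumed inputs: in practice "completed" would come from MongoDB.
        Set<String> expected = Set.of("V1_example_a", "V2_example_b", "V3_example_c");
        Set<String> completed = Set.of("V1_example_a");

        Set<String> missing = pending(expected, completed);
        if (missing.isEmpty()) {
            System.out.println("All migrations executed, node may start.");
        } else {
            System.out.println("Pending migrations: " + missing);
        }
    }
}
```

The version-tracking alternative would replace the set comparison with a single comparison between the node's version and the version stored in MongoDB. That is simpler to check, but it gives less detail about which migrations are missing.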

thll avatar Aug 22 '23 13:08 thll

Do I understand correctly that this work is a prerequisite for enabling automatic leader election by default?

tellistone avatar May 17 '24 11:05 tellistone

> Do I understand correctly that this work is a prerequisite for enabling automatic leader election by default?

@tellistone Yes, and it's the only blocker that I am aware of.

patrickmann avatar May 24 '24 11:05 patrickmann