Consider dropping data directory during recovery via pg_basebackup

Open thanodnl opened this issue 4 years ago • 1 comments

When a failover happens, pg_rewind of the old primary does not always succeed. In that case pg_auto_failover falls back to recovering via pg_basebackup. This is great!

However, once the database grows beyond roughly 50% of the available disk space (give or take; inodes could cause other issues), the old data directory and the fresh copy no longer fit side by side, so a pg_basebackup might not succeed without operator intervention.
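To make the 50% reasoning concrete, here is a minimal sketch of the space check. It is not pg_auto_failover code: the helper names `directory_size` and `basebackup_fits` are hypothetical, and it assumes the fresh copy is about as large as the old data directory.

```python
import os
import shutil


def directory_size(path: str) -> int:
    """Sum the sizes in bytes of regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def basebackup_fits(pgdata_size: int, free_bytes: int, keep_old: bool) -> bool:
    """Assumption: the new copy needs about as much space as the old PGDATA.

    When the old directory is kept, the copy must fit in the free space.
    When it is deleted first, its space becomes available as well.
    """
    available = free_bytes if keep_old else free_bytes + pgdata_size
    return pgdata_size <= available


def check_pgdata(pgdata: str, keep_old: bool) -> bool:
    """Run the check against a real directory and its filesystem."""
    free_bytes = shutil.disk_usage(pgdata).free
    return basebackup_fits(directory_size(pgdata), free_bytes, keep_old)
```

With a 100 GB disk holding a 60 GB data directory, 40 GB is free: keeping the old directory fails the check, deleting it first passes.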

Instead, it would be great if pg_auto_failover had an option where, rather than retaining the old database directory, it would delete this directory to ensure enough space is available on the node before initiating a pg_basebackup.

Alternatively we could go as far as designing a tristate for this setting:

  • don't ever delete old directory, that is, until the backup is completely transferred and we swap the directory
  • delete old directory when space is being contested
  • delete old directory before initiating a restore via pg_basebackup

(Maybe even a 4th state where the old data directory is retained until it is manually deleted, or until another failover happens, so we can diagnose why pg_rewind failed.)
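The proposed states could be sketched as follows. This is only an illustration of the proposal: the enum `OldPgdataPolicy`, its values, and `should_delete_before_backup` are hypothetical names, not existing pg_auto_failover settings.

```python
from enum import Enum


class OldPgdataPolicy(Enum):
    # Hypothetical names for the states proposed in this issue.
    KEEP_UNTIL_SWAP = "keep-until-swap"            # never delete before the swap
    DELETE_IF_CONTESTED = "delete-if-contested"    # delete only when space is short
    DELETE_BEFORE_BACKUP = "delete-before-backup"  # always delete first
    KEEP_FOR_DIAGNOSTICS = "keep-for-diagnostics"  # possible 4th state


def should_delete_before_backup(
    policy: OldPgdataPolicy, pgdata_size: int, free_bytes: int
) -> bool:
    """Decide whether to remove the old directory before pg_basebackup.

    Assumption: space is "contested" when the fresh copy, taken to be as
    large as the old directory, does not fit in the free space.
    """
    if policy is OldPgdataPolicy.DELETE_BEFORE_BACKUP:
        return True
    if policy is OldPgdataPolicy.DELETE_IF_CONTESTED:
        return pgdata_size > free_bytes
    # KEEP_UNTIL_SWAP and KEEP_FOR_DIAGNOSTICS never delete up front.
    return False
```

The two "keep" states differ only in what happens after the swap, which this sketch leaves out.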

For many installations, deleting the old directory before initiating a restore via pg_basebackup is a very sensible option. If the rewind fails, pg_auto_failover copies a fresh data directory over and configures it as a secondary. Under most circumstances this keeps the system running without an operator needing to ensure enough space is available on the data drive.

thanodnl avatar Dec 15 '21 17:12 thanodnl

See also #853, which led us to use the pg_basebackup tar format (maybe even tar.gz) when fetching the data, prior to swapping it into PGDATA. In a way it makes the reasoning about necessary disk space more complex, because now we might still need to have both the “download” area and the “production” area in use at the same time for a while.

DimCitus avatar Dec 15 '21 19:12 DimCitus

Given the following in our function pg_basebackup, https://github.com/citusdata/pg_auto_failover/blob/d7997ffc3f1209483a37fe7e8ed49fe7a000f664/src/bin/pg_autoctl/pgctl.c#L1280, I would say that https://github.com/citusdata/pg_auto_failover/pull/870 indeed fixed this.

DimCitus avatar Oct 12 '22 13:10 DimCitus