Fix run_maintenace when no default table exists
We're encountering an error like this when running CALL run_maintenance_proc() on our database with a native range partitioned table without partitions:
psql:.../db/structure.sql:8965: ERROR: null values cannot be formatted as an SQL identifier
CONTEXT: PL/pgSQL function partman.run_maintenance(text,boolean,boolean) line 323 at EXECUTE
SQL statement "SELECT partman.run_maintenance('public.agent_details', p_jobmon := 't')"
PL/pgSQL function partman.run_maintenance_proc(integer,boolean,boolean) line 42 at EXECUTE
DETAIL:
HINT:
CONTEXT: PL/pgSQL function partman.run_maintenance(text,boolean,boolean) line 410 at RAISE
SQL statement "SELECT partman.run_maintenance('public.agent_details', p_jobmon := 't')"
PL/pgSQL function partman.run_maintenance_proc(integer,boolean,boolean) line 42 at EXECUTE
We're using v4.5.1. I traced it to:
https://github.com/pgpartman/pg_partman/blob/v4.5.1/sql/functions/run_maintenance.sql#L317
v_default_tablename is NULL.
That should be set to the parent tablename earlier: https://github.com/pgpartman/pg_partman/blob/v4.5.1/sql/functions/run_maintenance.sql#L168
but v_is_default will be NULL if there are no partitions at all, and (NULL != 'DEFAULT') IS NULL, and IF NULL THEN ... END will not enter the conditional block because NULL is not TRUE, so it doesn't enter the conditional and default to the parent table name.
(NULL IS DISTINCT FROM 'DEFAULT') IS TRUE, however, and works the way it should, and preserves ('any string' IS DISTINCT FROM 'DEFAULT') IS TRUE, while ('DEFAULT' IS DISTINCT FROM 'DEFAULT') IS FALSE.
Sorry I'm not familiar enough with the test suite to write a test. If someone can provide some pointers I'll have a go. I presume it would require adding a test in test/test-id-run-maint.sql?
Just to clarify, you are saying there are no child tables at all in this partition set?
Yes, that’s right.
We remove the default partition so that it doesn’t accumulate unpartitioned data which becomes a problem to partition later. In continuous integration environments with a freshly loaded database with no default partition this means there are no child tables until the maintenance process is run to premake them, for example.
Ok. I wanted to make sure. pg_partman cannot work without having at least one child table. Otherwise it has no idea what the "next" table to create is. The create_parent() function should create at least the premake child tables for that. And retention should never drop the last child table. So how are you getting in the situation where there are no child tables?
We do a schema dump and restore between continuous integration test runs, and ignore child partitions because they make the schema unstable.
How does create_parent know what the "next" table is? I presume it knows what the "first" table is, and in the absence of any "next" table it should create the "first" table?
Sorry, we might be holding it wrong! How would you recommend make a stable, repeatably loadable schema for setting up ephemeral test environments? We're using Ruby on Rails and so have decorated its schema dumper to add some partman commands at the end to try and get things consistently configured. At the moment we're dumping the contents of the partman.part_config table and INSERT-ing it during fresh db load, then running the maintenance proc. I suppose we could change it to use create_parent calls instead, but how would we reverse-engineer the calls to make from the state of the database being dumped?
The issue here is that you're not going to be able to run create_parent() if you're already restoring the contents of the part_config table. You'll get a duplicate key violation because that config already exists. There's nothing special in the part_config table that cares what child tables exist. So, as part of your restore procedure, you could just create any valid child table for that partition set. Might be able to look at calling the relevant create_partition_*() function directly and feeding it one or more values for the p_partition_*[] array parameter. That's actually what create_parent() is doing.
To clarify, create_parent() doesn't need to know the next table. It's creating the initial tables. For time-based, by default it does that based on now(). For ID based it always starts at 0 unless a different start partition is given. It's the run_maintenance() process that requires there to be children. It needs at least one child table so it knows if it needs to make more based on the premake and what those new ones should be.
Another option would be to use the dump_partitioned_table_definition() on the original system to generate the create_parent() call along with an UPDATE statement to restore the remaining configuration options. Note this currently doesn't account for sub-partitioning. You could then just run those statements in your test environment vs restoring the config table contents.
Just checking to see if any of the suggestions i made worked for you?
I won't be merging this PR at this time.
No worries, we'll find another solution.
Another option would be to use the dump_partitioned_table_definition() on the original system to generate the create_parent() call along with an UPDATE statement to restore the remaining configuration options. Note this currently doesn't account for sub-partitioning. You could then just run those statements in your test environment vs restoring the config table contents.
This sounds very promising. Perhaps we can teach the rails schema dumper to do something like this.