website icon indicating copy to clipboard operation
website copied to clipboard

Remove Page base-class from RoundPage model

Open jameysharp opened this issue 7 years ago • 3 comments

Perhaps the most important database model in this entire project is RoundPage: it's used in almost every view, and almost every other model has a reference to a RoundPage, at least indirectly.

This model is named RoundPage because it is a subclass of wagtail.wagtailcore.models.Page.

We no longer have any real reason to make rounds be Wagtail pages so we'd like to remove this inheritance relationship. But it isn't obvious to me how to do that.

One approach I considered is to create a new model, say, Round, and write a migration that copies all the existing RoundPage instances into new Round instances. However, that means updating all ForeignKey and similar fields that reference RoundPage, and also updating all uses of the RoundPage class. So that's a huge change and I'm not confident I'd get it right.

Page is not an abstract model, so the database representation here is that RoundPage implicitly has a field equivalent to this:

    page_ptr = models.OneToOneField(Page, parent_link=True)

What I'd really like is a migration that changes the name and type of that one field to the default that Django would create without any inheritance; essentially:

    id = models.AutoField(primary_key=True)

While preserving the values that are already in that field, and telling the database that the next value it should assign to that field needs to be larger than any of the existing values.

My initial attempts to do that didn't work, though, so I'd love to hear from someone if you have any suggestions about how to make that work.

jameysharp avatar Apr 10 '19 22:04 jameysharp

I'd be curious to know what went wrong with the initial attempt, since that might yield some insights, since my first thought on seeing this is to work out step-by-step operations and see which one is the first to fail. If I were doing this I'd probably set up the operations something like:

  1. Explicitly declare the page_ptr field exactly as Django would implicitly add it.
  2. Change the name from page_ptr to id.
  3. Declare id to be primary_key=True.
  4. Remove the inheritance relationship and drop the parent_link=True from id.
  5. Change id to an AutoField (creating the sequence for it if necessary, depending on DB backend).

There's a bit of room for re-ordering there -- the rename in step 2 could be moved to the last step, for example -- but spotting exactly where it fails will provide some guidance on where the problems are (and to be clear, this kind of change is pretty difficult -- it's not the hardest thing to do in migrations, but it's probably on the top-5 list).

Another, more drastic option would be:

  1. Copy/paste RoundPage into a new model with a different but meaningful name, without it inheriting Page, and explicitly declaring a page_ptr field with primary_key=True (since that field implicitly is the primary key). Use the db_table option in Meta to make it use the same table as RoundPage.
  2. Generate migrations for the new model, and fake-apply them so Django thinks everything is up to date.
  3. Make the changes you want in the new model, generating migrations from it instead of from RoundPage, and actually apply those migrations.
  4. Once the DB is in a state you're happy with, generate a DeleteModel migration for RoundPage and fake-apply it, and remove the RoundPage class from the codebase.
  5. Later on when it feels safe to do so, squash the migrations so that only the new model exists in the migration history.

ubernostrum avatar Apr 11 '19 05:04 ubernostrum

I had to do this earlier this year. We made it happen in five steps:

Create nullable fields

We added a migration that creates nullable database fields with new names for each field in the parent model.

class Migration(migrations.Migration):
    operations = [
        migrations.AddField(
            model_name='child',
            name='new_name',
            field=models.CharField(max_length=255, null=True),
        ),
    ]

Move the data across

forwards = '''
    UPDATE myapp_child
       SET new_name = parent.name
      FROM parent
     WHERE myapp_child.parent_ptr_id = parent.id;
'''


class Migration(migrations.Migration):
    operations = [migrations.RunSQL(forwards, migrations.RunSQL.noop)]

We first tried to use Django's update queryset method and F objects, but we never got that to work and fell back to RunSQL.

Detach and delete the parent

Next we broke the connection between the parent model and the child model and dropped the parent data we no longer needed:

from core.migration_helpers import AlterBase


def remove_parent(apps, schema_editor):
    Child = apps.get_model('myapp.Child')
    Parent = apps.get_model('Parent')

    Parent.objects.filter(pk__in=Child.objects.all()).delete()


restore_parent = '''
INSERT INTO parent ( id, name )
     SELECT c.parent_ptr, c.new_name
       FROM myapp_child c
'''


class Migration(migrations.Migration):
    operations = [
        migrations.AlterField(
            model_name='child',
            name='log_ptr',
            field=models.AutoField(primary_key=True),
        ),
        AlterBase(
            model_name='child',
            bases=(models.Model,),
        ),
        migrations.RunPython(remove_parent, migrations.RunPython.noop),
        migrations.RunSQL(migrations.RunSQL.noop, restore_parent),
]

We wrote AlterBase to work around a Django bug: https://code.djangoproject.com/ticket/23521

from django.db.migrations.operations.base import Operation


class AlterBase(Operation):
    """
    Work around lack of support for changing base classes in Django.
    See: https://code.djangoproject.com/ticket/23521
    """
    reduce_to_sql = False
    reversible = True

    def __init__(self, model_name, bases):
        self.model_name = model_name
        self.bases = bases

    def state_forwards(self, app_label, state):
        state.models[app_label, self.model_name].bases = self.bases
        state.reload_model(app_label, self.model_name)

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        pass

    def database_backwards(self, app_label, schema_editor, from_state, to_state):
        pass

    def describe(self):
        return "Update %s bases to %s" % (self.model_name, self.bases)

Remove nullablility

This step is pretty simple. We remove any null=True we don't want from fields added in step 1. We also standardise the primary key field's configuration to Django's defaults.

class Migration(migrations.Migration):
    operations = [
        migrations.AlterField(
            model_name='child',
            name='new_name',
            field=models.CharField(max_length=255),
        ),
        migrations.AlterField(
            model_name='child',
            name='parent_ptr',
            field=models.AutoField(verbose_name=' ID', serialize=False, auto_created=True, primary_key=True),
        ),
    ]

Rename the fields

Finally we renamed the newly added fields to match the old names:

class Migration(migrations.Migration):
    operations = [
        migrations.RenameField(
            model_name='child', old_name='parent_ptr', new_name='id'
        ),
        migrations.RenameField(
            model_name='child', old_name='new_name', new_name='name'
        ),
    ]

LilyFirefly avatar Apr 11 '19 06:04 LilyFirefly

You may also find https://github.com/django/django/pull/11222 of interest.

LilyFirefly avatar Apr 14 '19 19:04 LilyFirefly