readthedocs.org icon indicating copy to clipboard operation
readthedocs.org copied to clipboard

Migrate more model's IDs to bigint

Open stsewd opened this issue 3 years ago • 1 comments

These are the tables that can grow quite large (the percent is how many IDs we have consumed from the largest possible, 2^31 - 1):

  • ImportedFile (22%)
  • PageView (9%)
  • SearchQuery (2%)

The other important tables (projects, versions, etc) are less than 1%, so we are fine there.

We already experienced this for the SphinxDomains table (https://github.com/readthedocs/readthedocs.org/pull/9482, https://github.com/readthedocs/readthedocs.org/pull/9483). The migration took around 15 min, and we temporally disabled all access to those models so they won't hang till the migration was completed, the current models are still small, so I think we should be fine without having to temporarily disabling them.

We should also make sure to use a bigint for all new models (django's create app already does this). We can't change the global default since it will change the IDs of existing models, and may require some downtime...

And these are the numbers for .com

  • SphinxDomain (17%)
  • ImportedFile (4%)
  • PageView (1%)
  • SearchQuery (0.18%)
  • Auditlog (0.07%)

stsewd avatar Aug 10 '22 23:08 stsewd

Really good description of the issue 💯

humitos avatar Aug 11 '22 07:08 humitos

The only "big" table that's missing migration is ImportedFile, currently at 32%. Since we are no longer creating a record per each html page, the growth rate should slow down now.

Open PR to migrate that id is at https://github.com/readthedocs/readthedocs.org/pull/9669.

stsewd avatar Sep 27 '23 21:09 stsewd

How to calculate the percent:

max_int = 2**31 - 1
current_id = Model.objects.order_by('id').last().id
current_id * 100 / max_int

stsewd avatar Sep 27 '23 21:09 stsewd