
feat(persons): Make `PersonDistinctId.person` nullable

Open tkaemming opened this issue 2 years ago • 0 comments

This is the first step towards fixing the distinct ID reuse problem as described in #20187.

This contains only the schema migration that makes the field nullable, plus the `SELECT` query changes needed to accommodate `NULL` values in `posthog_persondistinctid`, where they were not previously possible. The change that replaces deletions with updates that set the `person` field to `NULL` will happen separately.
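To illustrate why read queries need attention once the column is nullable, here is a minimal sketch using Python's built-in `sqlite3`. The table layout is a simplified assumption (the real table lives in PostgreSQL and has more columns), and the helper names are hypothetical. It only shows the general effect: a `SELECT` that never references the person column starts returning rows with no person attached, while one that filters on it does not.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for posthog_persondistinctid (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE posthog_persondistinctid (
            id INTEGER PRIMARY KEY,
            team_id INTEGER NOT NULL,
            distinct_id TEXT NOT NULL,
            person_id INTEGER  -- nullable, as after the migration
        )
        """
    )
    conn.executemany(
        "INSERT INTO posthog_persondistinctid (team_id, distinct_id, person_id) "
        "VALUES (?, ?, ?)",
        [(1, "a", 10), (1, "b", None), (2, "c", 20)],
    )
    return conn


def distinct_ids_ignoring_person(conn: sqlite3.Connection, team_id: int) -> list[str]:
    """Query that never references person_id: rows with a NULL person are included."""
    rows = conn.execute(
        "SELECT distinct_id FROM posthog_persondistinctid "
        "WHERE team_id = ? ORDER BY distinct_id",
        (team_id,),
    )
    return [row[0] for row in rows]


def distinct_ids_with_person(conn: sqlite3.Connection, team_id: int) -> list[str]:
    """Same query, but rows whose person_id is NULL are filtered out."""
    rows = conn.execute(
        "SELECT distinct_id FROM posthog_persondistinctid "
        "WHERE team_id = ? AND person_id IS NOT NULL ORDER BY distinct_id",
        (team_id,),
    )
    return [row[0] for row in rows]
```

Whether a given query actually needs the extra predicate depends on what it does with the rows; the sketch only demonstrates the difference in results.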

Query Updates

The biggest risk here is that I overlooked some read queries that require updates. I am pretty confident in my ability to find plaintext SQL queries that reference `posthog_persondistinctid`, but only somewhat confident in my ability to exhaustively uncover all of the Django ORM queries.

All of the plaintext SELECT queries in plugin-server are qualified by team, but there were a couple in the Django application that needed updating.

For ORM queries, this is what I looked for:

  1. `PersonDistinctId` model references used to build querysets that did not include a reference to `person`: something like `PersonDistinctId.objects.filter(team=team, person__in=…)` is fine, but `PersonDistinctId.objects.filter(team=team)` is not.
  2. `persondistinctid_set` references on model instances or as part of a `Prefetch`: similar to the above, `person.persondistinctid_set` would be fine, but `team.persondistinctid_set` would not be.
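The two patterns above could be approximated with a crude text search. The following is a hypothetical heuristic sketch, not the tooling actually used for this change: the function names and regular expressions are assumptions, it only handles `filter(...)` calls without nested parentheses, and it judges reverse accessors purely by the variable name in front of them, so any hits would still need manual review.

```python
import re

# Matches PersonDistinctId.objects.filter(...) calls whose arguments contain no nested parentheses.
FILTER_CALL = re.compile(r"PersonDistinctId\.objects\.filter\(([^()]*)\)")

# Matches keyword arguments such as person=, person_id=, person__in=, person_id__in=.
PERSON_KWARG = re.compile(r"\bperson(?:_id)?(?:__\w+)?\s*=")

# Matches <name>.persondistinctid_set and captures the name immediately before the dot.
REVERSE_ACCESS = re.compile(r"\b(\w+)\.persondistinctid_set\b")


def find_suspect_filters(source: str) -> list[str]:
    """Return filter() calls on PersonDistinctId that never mention person."""
    suspects = []
    for match in FILTER_CALL.finditer(source):
        if not PERSON_KWARG.search(match.group(1)):
            suspects.append(match.group(0))
    return suspects


def find_suspect_reverse_access(source: str) -> list[str]:
    """Return persondistinctid_set accesses whose owner is not named `person`."""
    suspects = []
    for match in REVERSE_ACCESS.finditer(source):
        if match.group(1) != "person":
            suspects.append(match.group(0))
    return suspects
```

Run against a snippet containing the four examples from the list, this flags `PersonDistinctId.objects.filter(team=team)` and `team.persondistinctid_set` and leaves the `person`-qualified variants alone.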

tkaemming • Feb 06 '24 21:02