posthog icon indicating copy to clipboard operation
posthog copied to clipboard

feat: Always use person_distinct_id_override

Open timgl opened this issue 1 year ago • 8 comments

Problem

For one version of PoE we were still using person_override instead of the correct person_distinct_id_override

Two types of people that would run into this

  1. Anyone who opted in to poev2 in the old settings page
  2. Anyone opting in to this on the new settings page image

This will somewhat change data for these groups. I think this is acceptable as they opted in to this, and it's better to be correct. It will also reduce confusion for anyone now opting in to this, as the new settings don't explain the trade off the old settings did.

Changes

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

How did you test this code?

timgl avatar May 17 '24 17:05 timgl

I think the base commit of this PR is wrong, so the diff is much larger than it should be @timgl

Twixes avatar May 21 '24 11:05 Twixes

Size Change: +172 B (+0.02%)

Total Size: 1.05 MB

ℹ️ View Unchanged
Filename Size Change
frontend/dist/toolbar.js 1.05 MB +172 B (+0.02%)

compressed-size-action

github-actions[bot] avatar May 21 '24 11:05 github-actions[bot]

@Twixes done!

timgl avatar May 21 '24 11:05 timgl

Looks good to me, but results are different in a couple of tests, not sure why

This also looks good to me in general. I spent a bit of time trying to figure out what was going on with these tests, too.

The issue with posthog/hogql/test/test_modifiers.py::TestModifiers::test_modifiers_persons_on_events_mode_mapping has a pretty straightforward fix: 89773344ed31d3c50d33a6e0fe78b225115135f7

The other test failures are more confusing: I assumed they were because the overrides created in the tests were still being written to the old person_overrides table and fixed that with 55a3179427cde107ada56806221aa86a6b97cb0b, but that doesn't seem to fix anything and actually causes more test failures (7 failures with persons-on-events test flag off - up from 3 if you exclude the modifiers test, and tests then also start failing with the persons-on-events test flag on where they weren't before.)

It seems like these tests are still using the legacy event query. When patched with:

diff --git a/posthog/queries/event_query/event_query.py b/posthog/queries/event_query/event_query.py
index 49d4565a94..56ed872b81 100644
--- a/posthog/queries/event_query/event_query.py
+++ b/posthog/queries/event_query/event_query.py
@@ -134,6 +134,8 @@ class EventQuery(metaclass=ABCMeta):
         return f"{self.DISTINCT_ID_TABLE_ALIAS}.person_id"
 
     def _get_person_ids_query(self, *, relevant_events_conditions: str = "") -> str:
+        raise NotImplementedError("using legacy event query")
+
         if not self._should_join_distinct_ids:
             return ""
 

… running py.test -k person_on_events_v2 results in:

FAILED ee/clickhouse/queries/test/test_paths.py::TestClickhousePaths::test_person_on_events_v2 - NotImplementedError: using legacy event query
FAILED ee/clickhouse/queries/test/test_retention.py::TestClickhouseRetention::test_groups_filtering_person_on_events_v2 - NotImplementedError: using legacy event query
FAILED ee/clickhouse/views/test/test_clickhouse_stickiness.py::TestClickhouseStickiness::test_stickiness_with_person_on_events_v2 - AssertionError: b'{"type":"server_error","code":"error","detail":"A server error occurred.","attr":null}'
FAILED posthog/queries/funnels/test/test_funnel.py::TestFOSSFunnel::test_funnel_events_with_person_on_events_v2 - NotImplementedError: using legacy event query
FAILED posthog/queries/test/test_retention.py::TestFOSSRetention::test_month_interval_with_person_on_events_v2 - NotImplementedError: using legacy event query
FAILED posthog/queries/test/test_trends.py::TestTrends::test_same_day_with_person_on_events_v2 - NotImplementedError: using legacy event query
FAILED posthog/queries/test/test_trends.py::TestTrends::test_same_day_with_person_on_events_v2_latest_override - NotImplementedError: using legacy event query

Not sure what the plan is with updating or replacing these tests to use the new queries, but hopefully this saves some debugging time for you.

Also, once we're not using this table in production queries any longer, I can take care of removing the override worker deployment and other plugin-server bits.

tkaemming avatar May 21 '24 23:05 tkaemming

Ah, all tests in posthog/queries/ use the legacy EventQuery, but they aren't actually being used in production anymore. The new engine lives in posthog/hogql_queries/. But I don't think any of your failures with the raise NotImplementedError("using legacy event query") patch are of tests that fail on this PR. The failing ones are mostly posthog/hogql_queries/, which for sure uses the new table. So I think these differences are real. I don't actually have context on what use_distinct_id_overrides means though, so I think @mariusandra or @timgl probably have a better idea on what's expected and what isn't

Twixes avatar May 22 '24 19:05 Twixes

You're right, and I misspoke here:

I assumed they were because the overrides created in the tests were still being written to the old person_overrides table and fixed that with 55a3179, but that doesn't seem to fix anything

That change does fix something — the errors in posthog/hogql_queries/! But it also also breaks the old tests in posthog/queries/ because the change to use person_distinct_id_overrides instead of person_overrides was never backported to the legacy queries (intentionally.)

To me that seems to suggest a discrepancy in test coverage between the old queries and the new queries (otherwise many more would have been broken here in this PR.) Seems like the tests that break with 55a3179 (the ones listed above) should either be ported to the new system, or deleted to unblock this?

tkaemming avatar May 22 '24 19:05 tkaemming

It looks like the code of hogql-parser has changed since last push, but its version stayed the same at 1.0.8. 👀 Make sure to resolve this in hogql_parser/setup.py before merging!

posthog-bot avatar May 24 '24 13:05 posthog-bot

@tkaemming I fixed the old queries so they use the new overrides too.

timgl avatar May 24 '24 14:05 timgl