Auto instrumented Celery tasks have mismatching environment values between the first and second check-ins

Open gaprl opened this issue 1 year ago • 7 comments

How do you use Sentry?

Sentry Saas (sentry.io)

Version

1.39.2

Steps to Reproduce

We've noticed validation errors from our Crons consumer outputting monitors.consumer.monitor_environment_mismatch. This usually happens when the environment for the first check-in is different than the environment for the second check-in.
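To make the failure mode concrete, here is a minimal sketch of the kind of check that would produce such an error. It is a simplified model, not Sentry's actual consumer code: `CheckIn`, `ToyConsumer`, and the returned strings are hypothetical names chosen for illustration, and the `production` value in the second check-in is only an assumed example of a differing environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckIn:
    check_in_id: str   # shared by the first and second check-in of one run
    status: str        # "in_progress", "ok", or "error"
    environment: str   # environment attached to this check-in


class ToyConsumer:
    """Hypothetical stand-in for the Crons consumer; not Sentry's real code."""

    def __init__(self):
        # check_in_id -> environment recorded from the first check-in
        self._open = {}

    def process(self, check_in):
        first_env = self._open.get(check_in.check_in_id)
        if first_env is None:
            # No open run with this ID: remember the environment if the run starts here.
            if check_in.status == "in_progress":
                self._open[check_in.check_in_id] = check_in.environment
            return "accepted"
        if check_in.environment != first_env:
            # Second check-in disagrees with the first one: reject it.
            return "monitor_environment_mismatch"
        if check_in.status != "in_progress":
            # Run finished with a consistent environment: close it.
            del self._open[check_in.check_in_id]
        return "accepted"


consumer = ToyConsumer()
results = [
    consumer.process(CheckIn("run-1", "in_progress", "prod")),
    consumer.process(CheckIn("run-1", "ok", "production")),  # assumed mismatch
]
print(results)
```

In this model the run is never closed after the rejected second check-in, which is one way a task that did run could still end up looking incomplete.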

  1. Create a monitor environment with a different name than our default production, let's say prod
  2. Set up Celery auto instrumentation and set the SDK environment to prod
  3. Check-ins will be missed
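For reference, a setup along the lines of step 2 might look like the sketch below. This is not the exact code from the affected project: the broker URL, DSN, task name, and every-minute schedule are placeholders, and it assumes the `monitor_beat_tasks` option of `CeleryIntegration` is what "auto instrumentation" refers to here.

```python
import sentry_sdk
from celery import Celery, signals
from celery.schedules import crontab
from sentry_sdk.integrations.celery import CeleryIntegration

# Placeholder app name and broker URL; adjust for your deployment.
app = Celery("tasks", broker="redis://localhost:6379/0")

# One Beat-scheduled task; with monitor_beat_tasks=True the SDK sends
# check-ins for it automatically.
app.conf.beat_schedule = {
    "every-minute-task": {
        "task": "tasks.heartbeat",
        "schedule": crontab(),  # crontab() with no arguments runs every minute
    },
}


@signals.beat_init.connect
@signals.celeryd_init.connect
def init_sentry(**kwargs):
    # Initialize the SDK in both the beat and the worker process so that
    # both are configured with the same environment.
    sentry_sdk.init(
        dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
        environment="prod",  # non-default environment from step 1
        integrations=[CeleryIntegration(monitor_beat_tasks=True)],
    )


@app.task(name="tasks.heartbeat")
def heartbeat():
    return "ok"
```

Running a worker and a beat process against this module should produce one pair of check-ins per scheduled run.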

Here's an internal example of a monitor facing this issue.

Expected Result

The check-in should not be missed.

Actual Result

Check-ins were marked as missed even though the tasks did run.

[Screenshot: CleanShot 2024-02-16 at 15 12 49@2x]

gaprl, Feb 16 '24 23:02

Could either of these not be picking up the environment correctly?

https://github.com/getsentry/sentry-python/blob/4d1b814cfc6764d9556e659327f1bf9008100289/sentry_sdk/integrations/celery.py#L522-L526

https://github.com/getsentry/sentry-python/blob/4d1b814cfc6764d9556e659327f1bf9008100289/sentry_sdk/integrations/celery.py#L557-L563

gaprl, Feb 16 '24 23:02

Hey @gaprl, I tried to reproduce your issue using this code, following your instructions on the latest version of the Python SDK, but all the check-ins succeeded, as expected, for me.

The first step you listed, however, was a bit unclear to me:

  1. Create a monitor environment with a different name than our default production, let's say prod.

I am not sure what you mean by a "monitor environment." I took this instruction to mean that I should create a monitor in Sentry with the notifications set for the environment "prod," like so:

[Screenshot]

However, if I understand correctly, this setting would only affect when notifications are sent out, not the actual functionality of the monitor itself, which would continue monitoring for all environments.

Could you please clarify what you meant by this step, and also let me know if I am doing anything differently in my code from what you were doing when you encountered this error?

szokeasaurusrex, Mar 15 '24 09:03

@szokeasaurusrex I mean the environment the check-in is sent in, which is the environment configured for the SDK itself, for example if you set it to "staging".

gaprl, Mar 15 '24 18:03

@gaprl Yes, I also did that – if you look at the code I linked in my earlier comment, you will see that I set the environment to "prod".

szokeasaurusrex, Mar 15 '24 18:03

Hey @gaprl, just wanted to follow up on my previous comment. Is this still an issue for you? If it is, can you please provide a reproduction?

szokeasaurusrex, Apr 02 '24 12:04

Hey @szokeasaurusrex, apologies for the delay. I believe we were able to resolve this issue in the latest bug bash session we did for Crons:

  • https://github.com/getsentry/sentry-python/issues/2757
  • https://github.com/getsentry/sentry-python/issues/2837

gaprl, Apr 18 '24 16:04

Unfortunately, it seems this issue is still occurring. Here's an example monitor 🔒. I did notice that it mostly happens on once-a-minute monitors, though.

gaprl, Jun 12 '24 16:06