sentry-ruby icon indicating copy to clipboard operation
sentry-ruby copied to clipboard

Bad stacktraces with skylight

Open sl0thentr0py opened this issue 3 years ago • 5 comments

Issue Description

Something skylight does is interfering with the backtraces we receive. Investigate why. I already tried swapping middleware orders but it didn't work. Skylight does a bunch of class_eval's so figure out where the problem is.

Reproduction Steps

TODO

Expected Behavior

Stacktraces look ok even with skylight

Actual Behavior

Stacktraces have just one frame and the rest looks messed up.

Ruby Version

2.7

SDK Version

5.2.1

Integration and Its Version

rails

Sentry Config

No response

sl0thentr0py avatar Aug 05 '22 10:08 sl0thentr0py

I think we're running into this. I thought it was because this line: https://github.com/rails/rails/blob/7-0-stable/activesupport/lib/active_support/notifications/instrumenter.rb#L28 loses the stacktrace, but that line hasn't changed in 14 years.

Now that sentry also supports tracing maybe we should drop skylight, but we have some custom instrumentation that I'd have to convert before we can remove the gem.

tinco avatar Jun 06 '23 08:06 tinco

This is an example of a bad backtrace event: https://aeroscan.sentry.io/issues/3954652238/?project=1805098

image

tinco avatar Jun 06 '23 08:06 tinco

@sl0thentr0py I think we're running into this too -- our stack looks like the following:

  • rails (7.0.5)
  • puma (6.2.2)
  • skylight (5.3.4)
  • sentry-rails (5.9.0)
  • sentry-ruby (5.9.0)

As you mentioned, it's just one "good" frame, and then a lot of messed up frames:

Screenshot 2023-06-08 at 11 20 17 AM

The lines I see regarding skylight and sentry in the garbled stack are:

  • skylight (5.3.4) lib/skylight/middleware.rb in call at line 99
  • sentry-rails (5.9.0) lib/sentry/rails/tracing.rb in instrument at line 54
  • skylight (5.3.4) lib/skylight/probes/action_dispatch/request_id.rb in call at line 8

scouttyg avatar Jun 08 '23 15:06 scouttyg

Update: When we disable Skylight's process_middleware.action_dispatch Normalizer (e.g. https://github.com/skylightio/skylight-ruby/blob/master/lib/skylight/normalizers/action_dispatch/process_middleware.rb) it looks like the logs calm down, but I couldn't trace any further than that. Something with what the instrumenter is doing.

cc @zvkemp, perhaps he has some insights here

scouttyg avatar Jun 08 '23 19:06 scouttyg

@sl0thentr0py something to note, but after diving in deeper, I believe it is indeed because Skylight opts into the process_middleware.action_dispatch event, which makes the trace run longer, perhaps above the 250 frames captured from one side of the trace.

Because traces get trimmed after 250, I wonder if this is part of why traces within Sentry become garbled -- without the full stack trace it can't be filtered down to the "essentials" correctly, and so it just ends up maxing at 250, where most of the trace relates to multiple calls to the following (invaluable) lines:

actionpack (7.0.5) lib/action_dispatch/middleware/stack.rb in call at line 60 actionpack (7.0.5) lib/action_dispatch/middleware/host_authorization.rb in call at line 131 actionpack (7.0.5) lib/action_dispatch/middleware/stack.rb in block in call at line 61 activesupport (7.0.5) lib/active_support/notifications.rb in block in instrument at line 206 activesupport (7.0.5) lib/active_support/notifications/instrumenter.rb in instrument at line 24

Perhaps the filtering logic can be updated to filter out these lines / traces in a more robust way?

scouttyg avatar Jun 21 '23 20:06 scouttyg