Bad stacktraces with skylight
Issue Description
Something skylight does is interfering with the backtraces we receive. Investigate why.
I already tried swapping middleware orders but it didn't work.
Skylight does a bunch of class_eval's so figure out where the problem is.
Reproduction Steps
TODO
Expected Behavior
Stacktraces look ok even with skylight
Actual Behavior
Stacktraces have just one frame and the rest looks messed up.
Ruby Version
2.7
SDK Version
5.2.1
Integration and Its Version
rails
Sentry Config
No response
I think we're running into this. I thought it was because this line: https://github.com/rails/rails/blob/7-0-stable/activesupport/lib/active_support/notifications/instrumenter.rb#L28 loses the stacktrace, but that line hasn't changed in 14 years.
Now that sentry also supports tracing maybe we should drop skylight, but we have some custom instrumentation that I'd have to convert before we can remove the gem.
This is an example of a bad backtrace event: https://aeroscan.sentry.io/issues/3954652238/?project=1805098
@sl0thentr0py I think we're running into this too -- our stack looks like the following:
-
rails (7.0.5) -
puma (6.2.2) -
skylight (5.3.4) -
sentry-rails (5.9.0) -
sentry-ruby (5.9.0)
As you mentioned, it's just one "good" frame, and then a lot of messed up frames:
The lines I see regarding skylight and sentry in the garbled stack are:
-
skylight (5.3.4) lib/skylight/middleware.rb in call at line 99 -
sentry-rails (5.9.0) lib/sentry/rails/tracing.rb in instrument at line 54 -
skylight (5.3.4) lib/skylight/probes/action_dispatch/request_id.rb in call at line 8
Update: When we disable Skylight's process_middleware.action_dispatch Normalizer (e.g. https://github.com/skylightio/skylight-ruby/blob/master/lib/skylight/normalizers/action_dispatch/process_middleware.rb) it looks like the logs calm down, but I couldn't trace any further than that. Something with what the instrumenter is doing.
cc @zvkemp, perhaps he has some insights here
@sl0thentr0py something to note, but after diving in deeper, I believe it is indeed because Skylight opts into the process_middleware.action_dispatch event, which makes the trace run longer, perhaps above the 250 frames captured from one side of the trace.
Because traces get trimmed after 250, I wonder if this is part of why traces within Sentry become garbled -- without the full stack trace it can't be filtered down to the "essentials" correctly, and so it just ends up maxing at 250, where most of the trace relates to multiple calls to the following (invaluable) lines:
actionpack (7.0.5) lib/action_dispatch/middleware/stack.rb in call at line 60 actionpack (7.0.5) lib/action_dispatch/middleware/host_authorization.rb in call at line 131 actionpack (7.0.5) lib/action_dispatch/middleware/stack.rb in block in call at line 61 activesupport (7.0.5) lib/active_support/notifications.rb in block in instrument at line 206 activesupport (7.0.5) lib/active_support/notifications/instrumenter.rb in instrument at line 24
Perhaps the filtering logic can be updated to filter out these lines / traces in a more robust way?