opentelemetry-ruby-contrib icon indicating copy to clipboard operation
opentelemetry-ruby-contrib copied to clipboard

Nested client / server spans

Open mwear opened this issue 6 years ago • 7 comments

Given that we have Rack instrumentation as well as Sinatra instrumentation there is the possibility of having nested spans with SpanKind.SERVER. With Faraday and Net::HTTP instrumentation, it's also possible to have nested spans with SpanKind.CLIENT. Ideally, each request will have a single server and client span.

The actual SpanKind will vary depending on the installed instrumentation. We should discuss how to handle this, and update our instrumentation accordingly.

mwear avatar Mar 05 '20 01:03 mwear

@mwear I am trying to find this in the specification but am having trouble tracking this down. Do you happen to have a link handy? My colleague @hmaurer is working on automatic instrumentation for Twirp, which uses Faraday, and we are seeing that the Twirp, Faraday, as well as the driver are generating span.kind=client.

Should we be forcing outer spans to be internal?

Should these augment the current_span kind is already set to client?

What happens in retry cases where the Faraday middleware is retrying multiple times?

arielvalentin avatar Jul 27 '22 15:07 arielvalentin

👋 Hi there. To provide some context on what @arielvalentin alluded to, I have been experimenting with auto-instrumentation for twirp-ruby and ran into some difficulties with enriching the Faraday span of the Twirp client requests. As a temporary work-around I wrapped the Faraday span in a Twirp span following RPC semantic conventions, but as both of these spans have span.kind = client this creates a (self-inflicted) situation similar to the one described by OP.

On a higher-level, @arielvalentin mentioned to me that he believes the specification is blurry as to which spans should have kind server and client, and the more I think about it the more I am inclined to agree. Consider my Twirp use-case: semantic conventions dictate that RPC call spans should have kind = client. The Twirp client uses Faraday to make the network call, which may in turn delegate to another library (e.g. we use https://github.com/excon/excon as the Faraday adapter). To satisfy "every client span should have exactly one server span", we'd need the ability to collapse all the client spans under the RPC span into one (Faraday, Excon).

hmaurer avatar Jul 28 '22 09:07 hmaurer

We've taken the opposite approach in at least one instance before - for koala (which is a thin wrapper around an http client), we actually merge koala-specific attributes down into the HTTP library's span: https://github.com/open-telemetry/opentelemetry-ruby-contrib/blob/26a9b3b7cc6e37f71b29087ee06639aad6cd459a/instrumentation/koala/lib/opentelemetry/instrumentation/koala/patches/instrumentation.rb#L14-L16

I find myself wondering if that's the right approach to this problem in general, or if we should instead prevent http libraries from emitting a separate span in these cases (the context trick we use to collapse down could go the other way just as easily).

ahayworth avatar Jul 28 '22 12:07 ahayworth

I find myself wondering if that's the right approach to this problem in general, or if we should instead prevent http libraries from emitting a separate span in these cases (the context trick we use to collapse down could go the other way just as easily).

Creating a coupling between Twirp and Faraday spans may be OK using a shared context but that creates a few interesting "problems" that I had not thought about before.

  1. What should the InstrumentationScope be? Should it be Twirp or Faraday?
  2. The shared context approach does not provide a way to override the span name. How do we update the Span Name to reflect the templated RPC call?
  3. How do we signal to parent instrumentations that a child instrumentation is enabled so it does not set the span.kind to client or propagate context?
  4. In this case Twirp is conceptually "RPC" although it is really proto over HTTP. What are the appropriate semantics to use here? Should this use RPC and ignore HTTP semantics? Should this use HTTP semantics and treat Twirp as its own namespace? e.g. twirp.rpc?

arielvalentin avatar Jul 28 '22 14:07 arielvalentin

Related discussion here: https://github.com/open-telemetry/opentelemetry-specification/issues/1747

arielvalentin avatar Jul 28 '22 16:07 arielvalentin

Creating a coupling between Twirp and Faraday spans may be OK using a shared context but that creates a few interesting "problems" that I had not thought about before.

Those are all excellent questions. I have answers for none of them. 😆

ahayworth avatar Jul 28 '22 20:07 ahayworth

I think my takeaway from the meeting was that although not ideal, having multiple client spans is not a huge problem and is really on the vendors to sort out. Having twirp instrumentation add client spans should be OK for now.

cc: @mwear @robbkidd

arielvalentin avatar Sep 22 '22 18:09 arielvalentin

👋 This issue has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this issue will be closed eventually by the stale bot.

github-actions[bot] avatar Apr 27 '23 01:04 github-actions[bot]