opentracing-java Deprecate or remove `ScopeManager.active()`

ScopeManager.active is documented as a feature used to get the current span. Perhaps it is a relic from the previous design. It tempts people to call close on it eventhough ScopeManager is pretty clear that scopes should be managed directly.

I would remove this api, possibly for ScopeManager.activeSpan if that's the intended use case to serve. Otherwise, please clarify the documentation that calls to close are undefined.

If folks are doing things like subclassing, you can deprecate the api, and they can still offer "advanced" things as they exist today as their subtype would still be able to offer that functionality regardless of it it exists on a supertype.

https://github.com/openzipkin-contrib/brave-opentracing/pull/74#issuecomment-380382654

Apr 11 '18 09:04 codefromthecrypt

Hey @adriancole thanks for the feedback.

I think that's a very valid point, and the important question here is whether there is a pattern that requires the Scope to be available in ScopeManager (or a pattern that cannot keep the Scope around it close it itself).

Will think some though on it and try to cook some examples (both a) the need for the current design, if any, and b) the problem of exposing it in ScopeManager)

@yurishkuro @pavolloffay @tylerbenson may have some thoughts on it too ;)

Apr 17 '18 18:04 carlosalberto

I need ScopeManager.active in C# to access the current Scope in scenarios where the library/framework I'd like to instrument only has OnStart/OnStop hooks without any chance to store the scope somewhere.

E.g. .NET has its own OpenTracing "span"-like construct called Activity and I'm translating them to "real" spans with this code: https://github.com/opentracing-contrib/csharp-netcore/blob/master/src/OpenTracing.Contrib.NetCore/Internal/GenericEventProcessor.cs#L43-L67

If I only had access to Tracer.activeSpan I could not close the scope.

Apr 17 '18 20:04 cwe1ss

Hey @cwe1ss thanks for the feedback, definitely appreaciated ;)

Sounds like we should cook such example for the testbed to show that we would need ScopeManager.active() in such scenario, to test things out.

please clarify the documentation that calls to close are undefined.

If we decide to keep this API, then we need to stress this point, definitely, so we avoid unintended problems for the users ;)

Apr 18 '18 12:04 carlosalberto

I'm probably missing something, but I don't understand the problem with that method... Why wouldn't you want a way to get the current scope from the ScopeManager? How would you get it otherwise except from where it was originally created? In fact, we use it in a fair bit of our instrumentation.

When writing instrumentation, it is often useful to know if there is a current active span. We also have additional logic on our scope that wouldn't be as nice to have on the span. I suggest leaving it as is.

May 29 '18 06:05 tylerbenson

I understand @adriancole concern's: Scopes were designed to be explicitly passed around...

The current API is useful when there are no "attributes" where the Scope could be stored.

brave-opentracing does noop on tracer.active().close() which means some instrumentations do not work properly with brave. I would like to understand why it's noop, is there something specific?

I have submitted fix for it https://github.com/openzipkin-contrib/brave-opentracing/pull/83 (if there is an agreement to fix it)

May 29 '18 16:05 pavolloffay

I completely disagree especially in Java about having to retain a lookup table for scope references. this adds debt which turns to permanent debt once you go v1.

can you please show any prior art which does this and why it should be done here? for example knowing there is a current span is different than having a reference to close a scope (especially from another thread!)

can you please engage some language experts if you desire to do this anyway?

May 30 '18 01:05 codefromthecrypt

After discussing with Adrian in a side channel, I am ok with his proposal to have the return type change.

I still think there should be a way to tell if there is an active trace currently happening, so if that is a method that returns the span or the span context and that result being null means no trace, then I guess that will work.

May 30 '18 02:05 tylerbenson

My problem with the ScopeManager.active() active API is that it implies that the same instance is returned as when the Scope has been created and that the finishSpanOnClose flag is remembered.

This makes it impossible to integrate it into the internal "scope manager" of a tracer, which is agnostic of OpenTracing. The only way to make this work in a bridge is to have a separate scope manager for the bridge. But then you can't get the active span when the span was created using the internal API and not the bridge.

In it's current form it's impossible to write a correctly working bridge for tracer.active().close(). So for the Elastic APM OT bridge, this is a noop.

So my suggestion would be to remove the finishSpanOnClose flag or to deprecate ScopeManager.active in favor of ScopeManager.activeSpan.

Jun 01 '18 08:06 felixbarny

folks @pavolloffay seems reluctant to fix the instrumentation he made and wants to force brave to implement this dodgy api instead. Can we please get this removed?

Jun 04 '18 09:06 codefromthecrypt

@codingfabian @raphw are either of you in love with this api that requires implementations to track and try to return scope instances? If you are can you please comment why? If not can you please help get this removed to de-burden implementations?

It is worth mentioning that thinking about this problem from a library pov is important. Agents can fix all sorts of misbehavior, but libaries cannot. So, for example, I'm aware that folks using agents might "shrug off" less than ideal apis as they can wrangle them in mysterious ways. Think about this problem in a way where you can't just fix it via bytecode. I think you'll agree we are better off without it.

Jun 04 '18 09:06 codefromthecrypt

folks @pavolloffay seems reluctant to fix the instrumentation he made and wants to force brave to implement this dodgy api instead. Can we please get this removed?

I am not sure where in the world you found that I am reluctant to fix this issue in instrumentations https://github.com/opentracing-contrib/java-jaxrs/pull/88. What is sure that you are reluctant to fix this in brave https://github.com/openzipkin-contrib/brave-opentracing/pull/83#issuecomment-394284185

Jun 04 '18 09:06 pavolloffay

@pavolloffay your comment was that existing instrumentation are broken. you wrote the instrumentation leading to that issue. fixing the version <1 instrumentation or fixing the <1 api both sort the problem in ways more sustainable than cementing in the malpractice that exists in the tracer. stop pushing design problems on other repos

Jun 04 '18 09:06 codefromthecrypt

Well there is this spec proposal to give access to span id for example, so implementations need to provide a way to access the instance anyway. We currently have no problem with this API.

Jun 04 '18 09:06 CodingFabian

Hey everybody,

So, trying to move on with this topic, after reading the entire thread again, these are my impressions:

Having a Scope instance around can definitely be helpful, for operations other than the usual deactivation/finish of the contained Span (as Tyler had mentioned initially, DataDog's Scope implementation has extra members/operations). Another example is the initial Akka instrumentation, where the Scope also has extra members for keeping track of fire-and-forget operations count (https://github.com/opentracing-contrib/scala-concurrent/blob/master/src/main/scala/io/opentracing/contrib/concurrent/TracedAutoFinishExecutionContext.scala)
Having ScopeManager.active() around can be very helpful when instrumenting libraries that have no context or similar to store the Scope (as @cwe1ss pointed out already). The alternative for those cases is to keep a thread-safe, weak map to store the resulting Scopes (ironically, leading to the very case Adrian mentioned for Brave ;) ).
As Adrian pointed out, having ScopeManager.active().close() easily reachable can be dangerous, as users may indeed end up closing Scope instances they don't own. However, I think this could be handled by improving the documentation and making clear the consequences of bad usage here.
So, the main issue I really see here, is the case of writing bridges to other tracing systems, and make it work just fine with their internal logic (as both @felixbarny and @adriancole pointed out), as keeping the Scope object around on their side can be problematic. I wonder if there is a way to somehow fine a proper, correct, not so complicated way to face this.

I'd say the last point is the important one, which needs to be addressed first, in order to in go in either direction - keep ScopeManager.active() around or drop it.

cc @yurishkuro @tedsuo

Jun 08 '18 04:06 carlosalberto

Hey everybody,

We would like to have a Cross-Language group call on July 13th (next week) and discuss this item as part of it. Please let me know if you guys can make it, specially @adriancole @felixbarny @cwe1ss

Jul 06 '18 18:07 carlosalberto

What was the outcome of your call?

ps on your point about "Having ScopeManager.active() around can be very helpful when instrumenting libraries that have no context or similar to store the Scope" The main concern is creating an api less likely to lead to problems. The call-sites which have no other means to pass a scope are the "infected" call sites, and take the risk without exposing it (risky stuff) by default to all call sites and consumers. In other words, yes call sites like these using thread locals or a utility like brave has, have a chance of leaking. Just the surface area is much smaller and you can easily tell where that problematic code is, because it isn't.. well everywhere.

Jul 21 '18 02:07 codefromthecrypt

The more I think about this, the more I tend to agree with Adrian.

Scope is basically a container for span() that can be close()-ed.
Access to the current scope (over the current span) is therefore only useful if you want to close() it or have some implementation specific extension attached to it.

In my humble opinion:

The close from anywhere is problematic. It should be controlled by the one activating the scope.
Implementation specific could be a sign that something is missing from the core API?
Perhaps the extension should have been applied to Span instead of scope?
If still going implementation specific; why not have a specific way to obtain active scope as well?

What would absolutely break if active scope was replaced by active span and couldn't be solved otherwise?

Background: I made an adapter between my 'general purpose' context propagation mechanism and the OpenTracing ScopeManager and it was non-trivial when to close and not close the scope.

Jul 21 '18 07:07 sjoerdtalsma

@adriancole no real conclusions. I addressed the problems with ScopeManager.active() again and also proposed to remove the finishSpanOnClose flag. I've just created an issue about that here: https://github.com/opentracing/opentracing-java/issues/291.

Jul 23 '18 07:07 felixbarny

Are there any use cases which require to get the Scope from the ScopeManager as opposed to the active Span? If so, could we collect some code samples for that?

Jul 24 '18 08:07 felixbarny

@felixbarny I think the case @cwe1ss pointed out some time ago still stands:

I need ScopeManager.active in C# to access the current Scope in scenarios where the library/framework I'd like to instrument only has OnStart/OnStop hooks without any chance to store the scope somewhere. ... If I only had access to Tracer.activeSpan I could not close the scope.

https://github.com/opentracing-contrib/csharp-netcore/blob/master/src/OpenTracing.Contrib.NetCore/Internal/GenericEventProcessor.cs#L43-L67

Jul 25 '18 18:07 carlosalberto

Hey @sjoerdtalsma

The close from anywhere is problematic. It should be controlled by the one activating the scope.

I think this could be easily 'enforced' by indeed not exposing Scope but Span in ScopeManager.

Perhaps the extension should have been applied to Span instead of scope?

So this is something that could be tried with some experimental API. We could need to think how to handle the separation between activation and lifetime in Span - maybe completely separated methods for finishing and deactivating ... In that regard, the opentracing-testbed section could be used to try out such new API.

Jul 25 '18 18:07 carlosalberto

@carlosalberto

The close from anywhere is problematic. It should be controlled by the one activating the scope.

I think this could be easily 'enforced' by indeed not exposing Scope but Span in ScopeManager.

@adriancole Sounds like the point of your request? Remove active Scope in favour of an active Span ?

Jul 25 '18 18:07 sjoerdtalsma

I'm not sure why there is a concept of removing active scope in favor of span.. why this is somehow experimental. This was in use in brave and also census for quite a long time. Retrieving the scope is the thing that causes issues. As mentioned in the initial description ScopeManager.activeSpan or similar is fine by me.

I don't know why there is cause to stall something like this with a testbed notion, but as it is already stalled for a while if you feel you need to test the idea of only returning active span, be my guest!

Jul 27 '18 12:07 codefromthecrypt

Hey @adriancole

I'm not sure why there is a concept of removing active scope in favor of span.. why this is somehow experimental. This was in use in brave and also census for quite a long time. Retrieving the scope is the thing that causes issues

So, if I understood correctly, you would like ScopeManager.active() to go in favor of ScopeManager.activeSpan() and keep the notion of Scope (through startActive()) otherwise? That sounds fine (by me, at least ;) )

So the testbed notion is precisely for testing how API changes could impact different use cases. The change for active() -> activeSpan() is not that complex, so it wouldn't require as many iterations in that regard (and I can do a PR myself for that); but if we decide to try to change the Scope notion, well, that would require more effort on that front, for good or bad ;)

Nevertheless, I can definitely help with that as needed.

Jul 27 '18 16:07 carlosalberto

@carlosalberto so this is still an issue.. I can see you last explained yourself are you waiting for me or someone to say go?

I just recently had to do a lot of work to avoid holding up classloaders due to scenarios such as this issue. If you are planning to address this, can you go ahead and proceed? Thanks.

Sep 13 '18 06:09 codefromthecrypt

Hey @adriancole

Sorry, had been busy doing other work ;) So the plan is to work tomorrow on updating the PR for #291 and have something ready for testing soon.

Sep 13 '18 14:09 carlosalberto

so far my experience with opentracing in Java has been somewhat less than stellar particularly around this api and in particular the decision to track scopes with threadlocals.

Due to heavy use of executor pools and async things I've found out of the box scope management causes many mem leaks in my case. So in some cases where I know I'm in the same thread but just want to hand over context without polluting my code, I've been trialing creating an activeSpan and then immediately closing in the callee but leaving the span open - using the threadlocal in managed short bursts. this however requires great care and is still likely to come unstuck, but at least the option exists. without scopeManager().active() I couldnt do that.

Sep 20 '18 02:09 jeacott1

the decision to track scopes with threadlocals

Do you have a suggestion for a better alternative?

Sep 20 '18 03:09 yurishkuro

@yurishkuro I'm not sure tbh, and certainly don't have any good answer. I've tracked my own tracing contexts in a singleton trie model, using keys that make sense available in my application to find the active span or parent in any given circumstance. I think I might try to instrument the returned futures from the various executors and services and attempt to track the start and end of each future to maintain the tracing that way instead of code scattered everywhere. carrying the context across method calls etc if there is nothing useful to hook is very problematic though and so far the only options I can think of are all very invasive except for the threadlocal in short bursts to carry the context where I'm sure its not jumping threads.

Sep 21 '18 05:09 jeacott1

Not sure I follow. A central cache (your trie) is of no help unless you explicitly propagate those IDs. The point of threadlocals is that you don't pass anything between functions yet preserve the context. Take spring boot - by adding a few jars to the classpath you can get your application traced with zero code changes. I'm not aware of any technique that can do that without threadlocals or relying on some bespoke, already explicitly propagated IDs.

Sep 21 '18 05:09 yurishkuro