Add `SynchronousNonBlockingStepExecution.ExecutorServiceAugmentor` extension point
Allow the executorService to be augmented.
This avoids having to use reflection by dependent plugins.
See https://github.com/jenkinsci/opentelemetry-plugin/pull/1139 for the consumer.
Updated with comment from @dwnusbaum
opentelemetrywants to make it possible for other plugins to augment the traces produced for Jenkins builds. See jenkinsci/opentelemetry-plugin#896 and jenkinsci/opentelemetry-plugin#909 for more context. See jenkinsci/junit-sql-storage-plugin#413 for an example of a plugin using the OTel context to emit custom spans as part of thejunitstep.In order for this to work, plugins need access to an OTel
Contextin their build steps and/or related code so that they can produce aSpanwith the correct parent. By default, OTel contexts are stored as aThreadLocalvariable.opentelemetry's synchronousGraphListenerimplementation sets up this context on the CPS VM thread when a new step starts. For Pipeline steps with fully synchronousStepExecutions, that is enough for context propagation to work correctly within the extent ofStepExecution.start. However, for steps whose execution runs on another thread, even if they complete synchronously from the perspective of the Pipeline (e.g.SynchronousNonBlockingStepExecution,GeneralNonBlockingStepExecution), the context will be unavailable, and so any created OTel span will not have the correct parent.
opentelemetryplugin currently handles this case by reflectively modifyingSynchronousNonBlockingStepExecution.executorServiceto wrap it withio.opentelemetry.context.Context.taskWrapping, which automatically propagates the thread local OTel context to tasks run on the executor service. This works, but this kind of reflection is brittle, and we would like to support this use case in a more maintainable way. This PR currently just exposes a@Restricted(Beta.class)API to allowopentelemetryto augmentSynchronousNonBlockingStepExecution.executorServicewithContext.taskWrappingwithout using reflection.There may be other ways to accomplish the same use case, but we have not explored them at this time: For example:
- Perhaps we should instead encourage plugins with build steps that want to support tracing to always look up their parent span using these methods and then set up the context manually as needed? For
junit-sql-storagethis would mean that thejunitstep would need to be updated to look up the span and then set up the thread local context for methods in the custom storage provider.- Similarly, could we somehow inject the OTel
Contextinto theStepContextfor each step automatically, and then have some kind of method likeOtelPlugin.useContext(StepContext)that sets up the thread local for plugins that need it? This would only work for PipelineSteps, but maybe that's ok?- There is a
ContextStorageProviderAPI. Perhaps we could implement this in a way that works better for our use case? EDIT: Or perhaps we could invert the dependency as described in: https://github.com/jenkinsci/workflow-step-api-plugin/pull/226#issuecomment-3032799853
Testing done
Submitter checklist
- [x] Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
- [x] Ensure that the pull request title represents the desired changelog entry
- [x] Please describe what you did
- [ ] Link to relevant issues in GitHub or Jira
- [ ] Link to relevant pull requests, esp. upstream and downstream changes
- [x] Ensure you have provided tests that demonstrate the feature works or the issue is fixed
I am not sure I follow the use case, and there is no downstream PR.
My general recommendation for any code creating an ExecutorService is
try { // TODO Java 21+
return (ExecutorService) Executors.class.getMethod("newVirtualThreadPerTaskExecutor").invoke(null);
} catch (NoSuchMethodException x) {
return Executors.newCachedThreadPool(/* as before */);
} catch (Exception x) {
throw new AssertionError(x);
}
which should probably be made into an API in Jenkins core.
I am not sure I follow the use case, and there is no downstream PR.
My general recommendation for any code creating an
ExecutorServiceistry { // TODO Java 21+ return (ExecutorService) Executors.class.getMethod("newVirtualThreadPerTaskExecutor").invoke(null); } catch (NoSuchMethodException x) { return Executors.newCachedThreadPool(/* as before */); } catch (Exception x) { throw new AssertionError(x); }which should probably be made into an API in Jenkins core.
Just filed the downstream PR: https://github.com/jenkinsci/opentelemetry-plugin/pull/1139
return (ExecutorService) Executors.class.getMethod("newVirtualThreadPerTaskExecutor").invoke(null);
My (limited) understanding of the current state with virtual threads was that unless you are running homogenous tasks that you know do not use synchronized, you have to be wary pending https://openjdk.org/jeps/491 in Java 24. In Jenkins at least we often use ExecutorService for heterogenous tasks that are not known up front and synchronized is common in the ecosystem, so I wasn't sure about using virtual threads just yet. Have you found them to work fine in practice?
FTR @jgreffe if you want to update the PR description here is some extra context:
opentelemetry wants to make it possible for other plugins to augment the traces produced for Jenkins builds.
See https://github.com/jenkinsci/opentelemetry-plugin/pull/896 and https://github.com/jenkinsci/opentelemetry-plugin/pull/909 for more context. See https://github.com/jenkinsci/junit-sql-storage-plugin/pull/413 for an example of a plugin using the OTel context to emit custom spans as part of the junit step.
In order for this to work, plugins need access to an OTel Context in their build steps and/or related code so that they can produce a Span with the correct parent. By default, OTel contexts are stored as a ThreadLocal variable. opentelemetry's synchronous GraphListener implementation sets up this context on the CPS VM thread when a new step starts. For Pipeline steps with fully synchronous StepExecutions, that is enough for context propagation to work correctly within the extent of StepExecution.start. However, for steps whose execution runs on another thread, even if they complete synchronously from the perspective of the Pipeline (e.g. SynchronousNonBlockingStepExecution, GeneralNonBlockingStepExecution), the context will be unavailable, and so any created OTel span will not have the correct parent.
opentelemetry plugin currently handles this case by reflectively modifying SynchronousNonBlockingStepExecution.executorService to wrap it with io.opentelemetry.context.Context.taskWrapping, which automatically propagates the thread local OTel context to tasks run on the executor service. This works, but this kind of reflection is brittle, and we would like to support this use case in a more maintainable way. This PR currently just exposes a @Restricted(Beta.class) API to allow opentelemetry to augment SynchronousNonBlockingStepExecution.executorService with Context.taskWrapping without using reflection.
There may be other ways to accomplish the same use case, but we have not explored them at this time: For example:
- Perhaps we should instead encourage plugins with build steps that want to support tracing to always look up their parent span using these methods and then set up the context manually as needed? For
junit-sql-storagethis would mean that thejunitstep would need to be updated to look up the span and then set up the thread local context for methods in the custom storage provider. - Similarly, could we somehow inject the OTel
Contextinto theStepContextfor each step automatically, and then have some kind of method likeOtelPlugin.useContext(StepContext)that sets up the thread local for plugins that need it? This would only work for PipelineSteps, but maybe that's ok? - There is a
ContextStorageProviderAPI. Perhaps we could implement this in a way that works better for our use case?
you have to be wary
Sure, it depends on what the thread is doing. Anyway that is irrelevant to the apparent purpose of this API based on the downstream PR.
you have to be wary
Sure, it depends on what the thread is doing. Anyway that is irrelevant to the apparent purpose of this API based on the downstream PR.
@jglick : then should I apply your suggestion? https://github.com/jenkinsci/workflow-step-api-plugin/pull/226#issuecomment-3019888216
Looking into a similar case in https://github.com/jenkinsci/opentelemetry-plugin/issues/1137, I also thought about instead adding a dependency on opentelemetry-api here and then just directly adding Context.taskWrapping unconditionally, and then deleting the relevant code in opentelemetry.
I am a little more hesitant about adding dependencies to this plugin than to github-branch-source though, and we would probably still need an indirection via an optional extension for FIPS compliance reasons. I am inclined to stick with the current approach for now, wait a bit for OTel usage to increase and for stuff like https://github.com/jenkinsci/opentelemetry-api-plugin/pull/45, and then reconsider that approach.
adding a dependency on
opentelemetry-apihere
I would rather avoid that.
@jglick , @dwnusbaum , could you have a look at the last state? 🙏🏼
@jgreffe This looks more or less fine to me, but I think we should wait until we have time to discuss with the OTel maintainers in a week or so to make sure we are aligned on the various changes we have been looking into. I don't want to merge this until we are confident that the downstream PR will be accepted.
Moving it to draft then @dwnusbaum