opentelemetry-cpp icon indicating copy to clipboard operation
opentelemetry-cpp copied to clipboard

Optimize TraceZ Span Processing Speeds

Open jajanet opened this issue 5 years ago • 5 comments

Is your feature request related to a problem? For the initial iteration of zPages, more specifically TraceZ, spans are grabbed and temporarily stored using the TraceZ span processor.

The OT C++ span processors interface doesn't store any spans by default, and other current implementations pass the responsibility of ownership to an exporter at most. The TraceZ span processor functionality deviates from these by storing both running and completed spans, keeping references and ownership of them respectively.

The processor's containers are modified whenever a snapshot getter is called or a span starts/ends, which causes potential thread safety issues when the functions are called concurrently since these containers are shared across these functions. When different functions attempt to read/write to the same place in memory simultaneously, this causes a program to crash.

In order to make the span processor thread-safe, lock guards were added at these functions. At a large scale where many spans could be processed at once, this could potentially make TraceZ scale poorly speed-wise.

Describe the solution you'd like We want to consider solutions that are also fast while being thread-safe. Some proposed solutions include:

Describe alternatives you've considered Ideally, we also want to reduce contention (how long services query and use the same places in memory). We attempted to do this through copy-on-write and are considering other methods of doing so.

Additional context

jajanet avatar Jul 16 '20 16:07 jajanet

I think you have two problems here that you should reason about in separation:

  1. The thread-safety of SpanData itself. The recordable implementation SpanData doesn't give thread safety guarantees in between OnStart and OnEnd processor calls. When attributes are accessed in between those calls, another thread might write at the same time. To eliminate that possibility, I'd recommend to implement a custom Recordable that gives the needed thread safety guarantees.
  2. Making the access to the vector storing running and ended spans thread safe (as aggregator and processor access those containers from different threads).

The lock-free circular buffer would address the second issue. I'm not sure about the details of the proxy/shim approach.

pyohannes avatar Jul 17 '20 03:07 pyohannes

@pyohannes Thanks for the thorough yet concise explanation! Based on your recommendation, we're opening a couple PRs soon to address this:

  1. PR for a custom recordable, which is essentially a copy of span_data but with locks in the getter/setter methods
  2. PR to change the recordable used in the processor to this new custom recordable, as casting more than once (recordable -> span_data -> threadsafe_span_data) seems unnecessary if that's the span data we'll use in the aggregator.

I don't know the details behind the proxy/shim approach, but will update the issue accordingly when I get more information

jajanet avatar Jul 20 '20 19:07 jajanet

This issue was marked as stale due to lack of activity. It will be closed in 7 days if no furthur activity occurs.

github-actions[bot] avatar Nov 30 '21 01:11 github-actions[bot]

Closed as inactive. Feel free to reopen if this is still an issue.

github-actions[bot] avatar Dec 07 '21 01:12 github-actions[bot]

This issue was marked as stale due to lack of activity. It will be closed in 7 days if no furthur activity occurs.

github-actions[bot] avatar Feb 06 '22 01:02 github-actions[bot]

Closing, as zpages is removed, see #2292

marcalff avatar Dec 05 '23 23:12 marcalff