apm-agent-java icon indicating copy to clipboard operation
apm-agent-java copied to clipboard

Public API ignore transaction option

Open cdalexndr opened this issue 6 years ago • 26 comments

Using Spring Boot, all @Scheduled methods are recorded. I have a method that fires regularly and I want to ignore it. An option to ignore should be added, like @IgnoreTransaction.

Even if I get the transaction using ElasticApm.currentTransaction() there is no way to cancel it.

cdalexndr avatar Sep 17 '19 12:09 cdalexndr

I agree that an API to ignore a transaction could be useful..

Regardless, our scheduling tracing looks for classes only in the defined application_packages - maybe you can make use of that if the transactions you want to ignore are in a specific package.

In case you don't want such transactions at all, you can also disable the scheduled instrumentation.

eyalkoren avatar Sep 18 '19 06:09 eyalkoren

@eyalkoren Thanks for the workarounds.

Additional use case: when @Scheduled takes a long time and records many spans (exceding transaction_max_spans) and I want to split it into multiple transactions, instead of coding many try-catch blocks for custom transaction initialization I can use @IgnoreTransaction(todo) on the @Scheduled task, and then use @CaptureTransaction on child method calls, resulting in cleaner code.

cdalexndr avatar Sep 18 '19 11:09 cdalexndr

So, the API addition should be:

  • add programmatic API, eg Transaction#ignore
  • add annotation API, eg @IgnoreTransaction

eyalkoren avatar Sep 18 '19 13:09 eyalkoren

The problem is that there might already be spans created and reported for this transaction. Should we disallow ignoring a transaction if there is already a span for it? We could indicate that by returning a boolean from Transaction#ignore.

felixbarny avatar Sep 18 '19 13:09 felixbarny

For my use case, only @IgnoreTransaction si required, and is easier to implement. When the creation of new transaction is triggered, if the method has this annotation, the transaction will be ignored. No need to bother with existing spans or server sync, because the transaction is just starting.

cdalexndr avatar Sep 18 '19 13:09 cdalexndr

@felixbarny yes, I had the same concern in mind. I think we can still disable the transaction and not send it, then the already-sent spans will be orphaned and invisible (unless specifically searched for).

Should we disallow ignoring a transaction if there is already a span for it?

I don't think so. It still makes sense to stop collecting and sending spans as soon as you know you don't want it. Whenever someone is using this to reduce overhead, it is still better to stop once you can instead of disallowing. And whenever someone just don't want to see them- it would work as well.

eyalkoren avatar Sep 18 '19 13:09 eyalkoren

One issue I see is that if there has already been an external span which propagated the trace context, then the resulting transaction is orphaned as well. But that's probably not a big problem as well.

The advantage of Transaction#ignore is that it would work for more use cases.

felixbarny avatar Sep 18 '19 15:09 felixbarny

The advantage of Transaction#ignore is that it would work for more use cases.

Right, like in this discussion

eyalkoren avatar Sep 18 '19 15:09 eyalkoren

Hi, any update or ETA of this point?

gjeanmart avatar Dec 16 '19 10:12 gjeanmart

This is a workaround, which we use now:

    public static void executeOutOfTransaction(Runnable runnable) {
        if ((ElasticApmInstrumentation.tracer == null) || (ElasticApmInstrumentation.tracer.getActive() == null)) {
            runnable.run();
        } else {
            final TraceContextHolder<?> activeTransaction = ElasticApmInstrumentation.tracer.getActive();
            boolean deactivated = false;
            try {
                activeTransaction.deactivate();
                deactivated = true;
            } catch (Exception e) {
                LOGGER.error("Failed to stop current APM transaction", e);
            }

            try {
                runnable.run();
            } finally {
                if (deactivated) {
                    // we have to active it again
                    activeTransaction.activate();
                }
            }
        }
    }

and the usage:

@Cron
public void largeCronJon() {
    for (var chunk in chunks) {
        ApmUtils.executeOutOfTransaction(() -> {
            processChunk(chunk);
        });
    }
}

@CaptureTransaction
private void processChunk(Object chunk) {
    // ...
}

This is not ideal, but solves the main problem for us

bugy avatar May 13 '20 09:05 bugy

But it will be great to have an annotation,too. Like @Traced where i can set value to false.

Frintrop avatar Jun 11 '20 07:06 Frintrop

Hello, I was wondering if there is any update or ETA ? I am facing the same issue. I am only interested in a method within a controller and would like to create a Transaction for this specific method, but the request is already generating a Transaction, and the method is only considered as a span when using @Traced Annotation. Thank you

montananyto-dev avatar Oct 29 '20 14:10 montananyto-dev

@montananyto-dev there is no ETA for that, but maybe there is a simpler solution for your case. If you don't need any automatic transaction creation in this service, you can use the disable_instrumentations to disable the relevant plugins that create the automatic transactions (probably servlet-api in your case). Alternatively, you can use the transaction_ignore_urls config to eliminate transaction creation for a subset of URLs and rely only on the API-based transactions for those. I hope this helps.

eyalkoren avatar Nov 05 '20 06:11 eyalkoren

@eyalkoren , thank you very much for your reply and thank you for your help. I will have a look at that today and see how it goes.

montananyto-dev avatar Nov 09 '20 08:11 montananyto-dev

@montananyto-dev FYI in case it is not you - a user in our forum reporting solving similar issue using the transaction_ignore_urls approach.

eyalkoren avatar Nov 11 '20 06:11 eyalkoren

I'm facing a similar problem. We use RabbitMQ for messaging and messages are put into a process-local queue upon receipt. This in-process queue is processed – kind of – asynchronously, but sometimes it is processed immediately within the same thread. Initially, I instrumented these calls manually by calling startTransactionWithRemoteParent in the relevant places. In other words, I have two transactions within a single process:

  1. The first transaction receives the message and puts it into the queue
  2. The second transaction processes messages from the queue

Now that the instrumentation rabbitmq is available, this "breaks" remote tracing of our messages: due to the asynchronous processing, many unrelated messages are put into the same APM transaction. It took my quite a while to find out why this is happening, due to several reasons:

  • startTransactionWithRemoteParent can be activated on the current thread, which will update the currentSpan to use the remote trace id, but not the currentTransaction (which will still use the previously active transaction – from the rabbitmq builtin instrumentation). From researching on this topic, I understand that it is illegal to activate another transaction/span, while one is already active
  • log correlation uses trace id from the transaction, not the active span (if it took it from the span, I wouldn't have noticed the breakage. Rather, everything would have worked as expected)

I have now disabled the rabbitmq instrumentation, but this is a rather drastic workaround: this affects all rabbitmq code. I looked into classes_excluded_from_instrumentation, but I haven't fully understood it yet. As far as I understand, I can put the rabbitmq consumer class into this list, but not my application class?

I see the following possible options for how to handle such cases (without being able to gauge their feasibility):

  • Provide an API to "suspend+resume" or "end" or even "discard" a transaction created by the builtin instrumentations of the agent.
  • Provide an API (method or annotation) to prevent an active transaction/span being propagated to executors or runnables.
  • Allow nested transactions, i.e. to have a stack of transactions onto which newly activated transactions can be pushed and closing/ending a transaction will pop it from the stack again.
    • Another option would be to close and end any active transaction, when a new transaction is activated. So transactions would not need to be stacked/nested, but would only exist sequentially in time. This means of course that there is no transaction to close when the now-closed transaction from the builtin instrumentation exits the system.

Thanks.

efdknittlfrank avatar Jan 18 '22 08:01 efdknittlfrank

您好,我看到以后就回复您

Traced avatar Jan 18 '22 08:01 Traced

@efdknittlfrank Did you try to remove your API code related to RabbitMQ when using the rabbitmq instrumentation? The API is not intended to compete with instrumentation plugins over transaction creation. Ideally, once we support a framework there is no need for transaction creation through the API. If the RabbitMQ support is broken when no public API is applied, please open an issue for that with as many details as possible like you provided above.

eyalkoren avatar Jan 19 '22 04:01 eyalkoren

@eyalkoren thanks for your answer.

I did not remove the API code, but I can see (from debug logs) that the rabbitmq instrumentation does not work (properly) for my use case. I don't think this is a bug of the rabbitmq instrumentation, but due to the fact that the expectations of what constitutes a "transaction" is different in our code: CompletableFutures are chained and each future should be its own transaction. Because they are chained (and executed by the same executor service), they will all be bunched together in a single transaction. From a pure technical standpoint, this is correct, but from the logical standpoint, this doesn't match what I am trying to monitor. I will try to create a separate issue with a repro when I find more time.

To summarize: rabbitmq instrumentation is working and not broken. But our code has its own assumptions where and whene a remote transaction should be resumed.

efdknittlfrank avatar Jan 19 '22 07:01 efdknittlfrank

Assuming that the RabbitMQ instrumentation properly creates and starts a transaction per message (through the instrumentation of com.rabbitmq.client.Consumer#handleDelivery), I think you just need to look at it differently: there is no need for you to manually create transactions through startTransactionWithRemoteParent anymore. You either do that or use the RabbitMQ instrumentation, but not both.

What you are now missing is probably proper context propagation between JVM threads. For example, if you attach the active automatically-created RabbitMQ transaction to the queue element before adding to the queue, then you can activate it when is being processed and end it when the CompletableFuture is completed. Take a look at our BlockingQueueContextPropagationTest, it may demonstrate what you need.

eyalkoren avatar Jan 19 '22 08:01 eyalkoren

What you are now missing is probably proper context propagation between JVM threads. For example, if you attach the active automatically-created RabbitMQ transaction to the queue element before adding to the queue, then you can activate it when is being processed and end it when the CompletableFuture is completed. Take a look at our BlockingQueueContextPropagationTest, it may demonstrate what you need.

@eyalkoren thanks for the link, I will take a look at the test. From your description it sounds like it would solve my problem.

The problem as I see it, is that a the wrong transaction is propagated to the executor/future in my case (and this is probably due to the very "creative" use of futures in the code. In fact, it is Supplier<CompletableFuture<?>> and Function<T, CompletableFuture<?>>. Subsequent futures are executed in the same thread). When debugging my problem, I tried to write a simple demo application which exhibits the same problem, but I didn't manage to do so.

What complicates things is that not all incoming messages are put into the queue, some are processed directly. Same are processed delayed. Some are processed in bulk. And I really want to avoid not ending a transaction/span, as this can cause memory leaks from what I have read in other discussions.

I will have a look at the test and play around a little bit with it. I'll come back when I have enough information to make progress, as to not spam this issue here.

EDIT. One thing I noticed in the test is the following:

                    try (Scope scope = span.activate()) {
                        String spanId = ElasticApm.currentSpan().getId();
                        Thread.sleep(20);
                        span.end();
                        result.complete(spanId);
                    }

All examples I have seen so far implement this as:

                    try (Scope scope = span.activate()) {
                        String spanId = ElasticApm.currentSpan().getId();
                        Thread.sleep(20);
                        result.complete(spanId);
                    } finally {
                        span.end();
                    }

Are both forms correct?

efdknittlfrank avatar Jan 19 '22 08:01 efdknittlfrank

And I really want to avoid not ending a transaction/span, as this can cause memory leaks from what I have read in other discussions.

No risk for memory leaks, unless you accumulate hard references to them and never release. The leak we are referring to is when spans/transactions are recycled and used after, but in new versions this can't happen as any span/transaction touched by the API is not recycled.

BTW, the messaging transactions may be ended quite quickly after the message is submitted to the queue, but you can still create async child spans from them and they will have the proper parent-child relations. I hope you find the way to do what you need. Start by eliminating your custom transaction creation. Good luck!

eyalkoren avatar Jan 19 '22 10:01 eyalkoren