Attempts to collect Datadog traces when DATADOG_TRACE_ADDR is not provided
Describe the bug
In the documentation, it states that the env var DATADOG_TRACE_ADDR is the
URI of the Datadog trace agent. If not provided, traces will not be collected. Example: localhost:8126"
We've found that even if we do not set DATADOG_TRACE_ADDR, the relay still attempts to post traces. In our service, we see many error logs like:
Jun 4 02:08:29: 2021/06/04 02:08:29 Datadog Exporter error: Post "http://localhost:8126/v0.4/traces": dial tcp 127.0.0.1:8126: connect: connection refused.
Below, I do a quick dive into the code that supports this hypothesis.
To reproduce
Don't populate DATADOG_TRACE_ADDR and you should see that we still attempt to post traces.
Expected behavior
We expect ld-relay to not try to emit Datadog traces if DATADOG_TRACE_ADDR is not populated so that it matches the documentation.
SDK version ld-relay:6.1.0
Additional context
I did a cursory dive into the code which reinforces my belief that this bug exists.
I believe this is where we transform env vars into the MetricsConfig: https://github.com/launchdarkly/ld-relay/blob/fc38d2ffb8c27d0ecc0d32bb9ec98dfbac8e9e12/config/config_from_env.go#L124-L130
We call datadog.NewExporter with datadog.Options that will have the traceAddr - in our use case, traceAddr is empty string. https://github.com/launchdarkly/ld-relay/blob/fc38d2ffb8c27d0ecc0d32bb9ec98dfbac8e9e12/internal/core/internal/metrics/datadog.go#L35-L39
Crossing over into opencensus-go-exporter-datadog, this is the signature of NewExporter. We call newTraceExporter. https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/datadog.go#L102-L111
newTraceExporter is defined here. We end up making this call with traceAddr - newTransport(o.TraceAddr).upload, - https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/trace.go#L58
newTransport is defined here. The first four lines show that we will default the address to 8126 if traceAddr is empty string:
func newTransport(addr string) *transport {
if addr == "" {
addr = defaultTraceAddr
}
https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/transport.go#L37-L40
The transport's upload method then tries to make HTTP POST requests to the addr. In our case, this is the default address they provide, since addr was empty string. https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/transport.go#L74
Hmm... from what I'm seeing in the opencensus-go-exporter-datadog code, it's not clear to me that this integration actually supports not using trace. It uses a default value for TraceAddr if you leave that empty, and it also uses a default value for Service if you leave that empty. So, even though their documentation for NewExporter uses the phrase "When using trace" which implies that you might not always be using it, I don't see a way to turn it off.
Thanks for the response! We find that the errors from trying to post traces clutters our logs, do you have thoughts on ways we could move forwards to eliminate the noise in the meantime?
Is some way to disable the tracing something that might be in the roadmap?
Also, do you think it's worthwhile updating the docs to make it more clear that we will still try to emit traces even if DATADOG_TRACE_ADDR is not provided to prevent future confusion?
Well, I should have been clearer that 1. we haven't investigated this any further yet than just looking at the specific code you linked to, and 2. just because the DataDog client always wants to enable the tracing service doesn't necessarily mean it would make those requests, if the Relay Proxy code chose to withhold that part of the data. It's possible that they only make requests if we actively push traces into the client; I haven't seen anything about that in their docs, so we would have to do some testing. If nothing else, we would want to update the Relay Proxy docs to be clearer on what the behavior is. But it's too soon to say where this might be on the roadmap.
We would also need to look at how any change in behavior would affect the other metrics integrations. Since we're using an abstraction, OpenCensus (which will eventually be replaced by the newer model, OpenTelemetry), we would need to make the state of "should we be collecting traces" available to lower-level logic that only deals with that abstraction and doesn't know anything about the DataDog configuration. So that just means that it wouldn't be purely a trivial change of "if this parameter is empty, call the DataDog API differently."
Unfortunately no workarounds come to mind right now in terms of avoiding the noisy error logging. Is it a problem for you to just allow it to collect the traces?
This issue is marked as stale because it has been open for 30 days without activity. Remove the stale label or comment, or this will be closed in 7 days.
We're seeing this issue as well, with 7.5.2.
Is it a problem for you to just allow it to collect the traces?
I'd rather not have to configure and pay for Datadog tracing just to get rid of these error logs.