ddtrace==2.13.1 profiler issue
After i updated our docker's ddtrace from ddtrace==2.9.3 to ddtrace==2.13.1, we observed some weird behaviors on profiler. We upgraded on Oct 11.
Heap - Bytes in Use drop to 0
Allocated Bytes Significantly increased -
Huge chunk of memory consumed by periodic
ensure_binary
Increased CPU Time
any updates?
@mabdinur @taegyunkim any updates on this?
@doudoujay Hi could you share environment variables that you're giving to ddtrace? e.g. DD_PROFILING_* and DD_TRACE_*. Could you also share Python version too?
ENV DD_LOGS_ENABLED=true
ENV DD_LOGS_INJECTION=true
ENV DD_TRACE_ENABLED=true
ENV DD_PROFILING_ENABLED=true
# https://docs.datadoghq.com/tracing/trace_collection/custom_instrumentation/python/otel/
ENV DD_TRACE_OTEL_ENABLED=true
# https://docs.datadoghq.com/profiler/connect_traces_and_profiles/#identify-code-hotspots-in-slow-traces
# only available for ver 2.10.5+
ENV DD_PROFILING_TIMELINE_ENABLED=true
# https://docs.datadoghq.com/dynamic_instrumentation/enabling/python/?tab=environmentvariables
ENV DD_DYNAMIC_INSTRUMENTATION_ENABLED=true
# https://docs.datadoghq.com/tracing/metrics/runtime_metrics/python/
ENV DD_RUNTIME_METRICS_ENABLED=true
# https://docs.datadoghq.com/infrastructure/process/?tab=docker
ENV DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED=true
# code tracking
ARG DD_GIT_REPOSITORY_URL
ARG DD_GIT_COMMIT_SHA
ENV DD_GIT_REPOSITORY_URL=${DD_GIT_REPOSITORY_URL}
ENV DD_GIT_COMMIT_SHA=${DD_GIT_COMMIT_SHA}
@taegyunkim
-
ensure_binary_or_emptyshows up in your profiles as you've setDD_PROFILING_TIMELINE_ENABLED. The flag enables native profile exporter which usesensure_binary_or_emptyto convert Pythonstrto utf-8 encodedbyteswhich is then passed to C++ native code asstd::string_view. - You don't see any frame below
ensure_binary_or_emptyas the function calls CPython's built-instr.encode()which is implemented in C here and dd-trace-py profiler doesn't show native frames. - Looks like
str.encode()always creates an object, allocating memory, even if the given str object is already in the desired encoding, in our caseutf-8. We probably should use lower level CPython API, probablyPyUnicode_AsUTF8AndSize(), which simply returnsconst char*, and pass that to C++ native code.
Thank you for pointing this out @doudoujay, will soon work on a fix for this, < 2 weeks.
So i guess for short term solution would be we turn off DD_PROFILING_TIMELINE_ENABLED? @taegyunkim
Yes, could you try turning it off?
@taegyunkim hey its still taking significant chunk of CPU cycle and memory -
dd version 2.14.4
dd flags:
ENV DD_LOGS_ENABLED=true
ENV DD_LOGS_INJECTION=true
ENV DD_TRACE_ENABLED=true
ENV DD_PROFILING_ENABLED=true
# https://docs.datadoghq.com/tracing/trace_collection/custom_instrumentation/python/otel/
ENV DD_TRACE_OTEL_ENABLED=true
# https://docs.datadoghq.com/dynamic_instrumentation/enabling/python/?tab=environmentvariables
ENV DD_DYNAMIC_INSTRUMENTATION_ENABLED=true
# https://docs.datadoghq.com/tracing/metrics/runtime_metrics/python/
ENV DD_RUNTIME_METRICS_ENABLED=true
# https://docs.datadoghq.com/infrastructure/process/?tab=docker
ENV DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED=true
# code tracking
ARG DD_GIT_REPOSITORY_URL
ARG DD_GIT_COMMIT_SHA
ENV DD_GIT_REPOSITORY_URL=${DD_GIT_REPOSITORY_URL}
ENV DD_GIT_COMMIT_SHA=${DD_GIT_COMMIT_SHA}
@taegyunkim any updates?
@doudoujay that flamegraph doesn't make sense to me as it shows ensure_binary_or_empty even though you don't have none of DD_PROFILING_EXPORT_LIBDD_ENABLED, DD_PROFILING_STACK_V2_ENABLED, or DD_PROFILING_TIMELINE_ENABLED. Could you check again your DD_* environment variables and flamegraph? You can also try to filter the profile using profiler_config:*py* to make sure that the flamegraph you get is only from those services without above environment variables set.
@doudoujay Can you also try enabling DD_PROFILING_STACK_V2_ENABLED?
Could you check again your DD_* environment variables and flamegraph Im sure i didnt turn on
DD_PROFILING_TIMELINE_ENABLEDor any of them. @taegyunkim
I will try enable DD_PROFILING_STACK_V2_ENABLED. Whats the expected behavior? @taegyunkim
will soon work on a fix for this, < 2 weeks. When do we expect the fix is in production?
DD_PROFILING_STACK_V2_ENABLED enables our new stack profiler which implemented in native languages, mostly C++/Rust. When you turn it on it also turn on DD_PROFILING_EXPORT_LIBDD_ENABLED and will show less of ensure_binary_or_empty().
Missing live heap profile looks like a separate issue, and we'll need to investigate more on that.
And do you mind sharing what Python version you're using? From our testing, Python 3.12 doesn't have the issue ensure_binary_or_empty() allocating a lot of memory.
And do you mind sharing what Python version you're using? From our testing, Python 3.12 doesn't have the issue ensure_binary_or_empty() allocating a lot of memory. @taegyunkim Im using python 3.1
FROM python:3.10
saw you closed this did you got it fixed?
Hello @taegyunkim
We use python 3.12.2 and we run into high CPU usage issue when upgrading ddtrace.
We tried last week ddtrace 2.16.0 and we still notice CPU increase:
We had to revert to 2.9.2.
Seeing the same thing, just rolled back
Using python:3.12-slim and ddtrace 2.18.1 with ddtrace-run uvicorn caused high CPU usage. I rolled back to 2.9.2 and it worked for me.