pyroscope-java icon indicating copy to clipboard operation
pyroscope-java copied to clipboard

The jvm_threads_daemon_threads indicator will gradually become negative

Open liuxuzxx opened this issue 8 months ago • 18 comments

We introduced pyroscope.jar as an agent. After the process worked for about 20 minutes, we found that the jvm_threads_daemon_thread indicator obtained from spring-boot-actuator gradually changed from positive to negative. When we removed the pyroscope.jar agent, we found that it was normal.

Image

liuxuzxx avatar Jun 16 '25 04:06 liuxuzxx

The first three monitoring data in this figure are configured with pyroscope.jar as the agent. It can be seen that the jvm_threads_daemon_thread indicator shows a linear function property, gradually decreasing to a negative number. The last book removes pyroscope.jar as the agent monitoring, and it is normal.

liuxuzxx avatar Jun 16 '25 04:06 liuxuzxx

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpaas-msg-hub-deployment
spec:
  template:
    spec:
      containers:
        - env:
            - name: JAVA_TOOL_OPTIONS
              value: >-
                ......
                -javaagent:/skywalking/agent/skywalking-agent.jar
                -javaagent:/shared-data/pyroscope/pyroscope.jar #use -javaagent
                -Dcom.rocketmq.remoting.client.worker.size="10"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: SW_AGENT_NAME
              value: cpaas-msg-hub
            - name: PYROSCOPE_AGENT_ENABLED
              value: 'true'
            - name: PYROSCOPE_APPLICATION_NAME
              value: cpaas-msg-hub
            - name: PYROSCOPE_PROFILING_INTERVAL
              value: 30ms
            - name: PYROSCOPE_FORMAT
              value: jfr
            - name: PYROSCOPE_PROFILER_EVENT
              value: itimer
            - name: PYROSCOPE_PROFILER_ALLOC
              value: 32k
            - name: PYROSCOPE_PROFILER_LOCK
              value: 10ms
            - name: PYROSCOPE_SERVER_ADDRESS
              value: 'http://172.16.15.161:30424'
            - name: PYROSCOPE_LABELS
              value: 'env=test,zone=cloud,podname=$(POD_NAME)'
          image: 'xxxxxx/cpaas-test/cpaas-msg-hub:v1.8.3-8eecaaff295'
          name: cpaas-msg-hub
      initContainers:
        - command:
            - sh
            - '-c'
            - cp /opt/pyroscope.jar /shared-data/pyroscope/
          image: 'xwharbor.wxchina.com/cpaas/pyroscope:v2.1.2'
          name: pyroscope
          volumeMounts:
            - mountPath: /shared-data/pyroscope
              name: pyroscope
      volumes:
        - emptyDir: {}
          name: pyroscope


liuxuzxx avatar Jun 16 '25 04:06 liuxuzxx

Can you check if it happens with just using async-profiler https://github.com/async-profiler/async-profiler ? Just to make sure the issue is in the pyroscope jar and not the async-profiler itself.

korniltsev avatar Jun 16 '25 05:06 korniltsev

OK, let me try it out and see if there is any problem when using only async-profile

liuxuzxx avatar Jun 16 '25 05:06 liuxuzxx

When using pyroscope.jar as the agent, will the memory usage of the Java process increase compared to not using pyroscope.jar? Currently, it has been verified that under the current configuration, the performance is basically not affected.

liuxuzxx avatar Jun 16 '25 05:06 liuxuzxx

My environment is

openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)

liuxuzxx avatar Jun 16 '25 05:06 liuxuzxx

I see the requirements for async-profiler are as follows:

make
GCC 7.5.0+ or Clang 7.0.0+
Static version of libstdc++ (e.g. on Amazon Linux 2023: yum install libstdc++-static)
JDK 11+

Is JDK8 not working?

@korniltsev

liuxuzxx avatar Jun 16 '25 05:06 liuxuzxx

It requires jdk11 to build the profiler.

It should work on jdk8 if you already have binaries: https://github.com/async-profiler/async-profiler/blob/981619680ea6c93768df6253e3a6d87d4c33cd2b/.github/workflows/test-and-publish-nightly.yml#L108

korniltsev avatar Jun 16 '25 07:06 korniltsev

It requires jdk11 to build the profiler.

It should work on jdk8 if you already have binaries: https://github.com/async-profiler/async-profiler/blob/981619680ea6c93768df6253e3a6d87d4c33cd2b/.github/workflows/test-and-publish-nightly.yml#L108

OK, thanks

liuxuzxx avatar Jun 16 '25 10:06 liuxuzxx

We have a module based on JDK21. After following the same configuration, we found that the jvm_threads_daemon_threads indicator is normal, but a little more than before.

Image

From the monitoring chart above, we can see that the jvm_threads_daemon_threads indicator has increased by about 3 daemon threads before and after using pyroscope.jar. It may be that pyroscope.jar needs some threads to report, so it is understandable that daemon is set.

liuxuzxx avatar Jun 18 '25 01:06 liuxuzxx

JDK21 is OK!

liuxuzxx avatar Jun 18 '25 04:06 liuxuzxx

My java process id is: 1448183. When I use async-profiler./asprof start 1448183, and then wait for 10 minutes, execute./asprof stop 1448183. During this period, I see that the jvm_threads_daemon_threads monitoring data has not changed at all, but has changed from 28 to 29. The extra threads should be other monitoring data acquisition. However, when I use pyroscope.jar (version 2.1.2) as javaagent on the vm, I see that it will decrease by about 3-4 in 1 minute. However, I have tested the jdk21 version and it is normal.

@korniltsev

liuxuzxx avatar Jun 18 '25 04:06 liuxuzxx

The statistics of other threads are normal, and there is no difference with not adding the pyroscope.jar javaagent.

liuxuzxx avatar Jun 18 '25 04:06 liuxuzxx

My java environment:

liuxu@liuxu-B560M-AORUS-ELITE /media/liuxu/data/code/github > java -version

openjdk version "1.8.0_452"
OpenJDK Runtime Environment (build 1.8.0_452-8u452-ga~us1-0ubuntu1~24.04-b09)
OpenJDK 64-Bit Server VM (build 25.452-b09, mixed mode)

liuxuzxx avatar Jun 18 '25 05:06 liuxuzxx

I used the jstack command to obtain the thread stack of the corresponding Java process at intervals of 1 minute, and found that the daemon thread did not change, but the indicator exposed by spring-boot-starter-actuator: jvm_threads_daemon_threads was gradually decreasing.

liuxuzxx avatar Jun 18 '25 06:06 liuxuzxx

This is my experimental project.:

demo.tar.gz

My environment information

Java Version: 
   openjdk version "1.8.0_452"
  OpenJDK Runtime Environment (build 1.8.0_452-8u452-ga~us1-0ubuntu1~24.04-b09)
  OpenJDK 64-Bit Server VM (build 25.452-b09, mixed mode)

pyroscope.jar:
  2.1.2 version

I conducted the experiment according to the following steps:

  1. package jar
mvn clean package -Dmaven.test.skip=true -T 12

2.use the start.sh

#!/bin/sh

export PYROSCOPE_AGENT_ENABLED=true
export PYROSCOPE_APPLICATION_NAME=demo-web
export PYROSCOPE_FORMAT=jfr
export PYROSCOPE_PROFILER_EVENT=itimer
export PYROSCOPE_PROFILER_ALLOC=32k
export PYROSCOPE_PROFILER_LOCK=10ms
export PYROSCOPE_SERVER_ADDRESS=http://172.16.15.161:30424
export PYROSCOPE_LABELS="env=develop,zone=localhost"

java -javaagent:./pyroscope.jar -jar demo-0.0.1-SNAPSHOT.jar

3.check spring actuator prometheus metrics

http://localhost:8090/actuator/prometheus

...
jvm_threads_daemon_threads ......
...

liuxuzxx avatar Jun 18 '25 06:06 liuxuzxx

  1. spring actuator use JvmThreadMetrics.java
public void bindTo(MeterRegistry registry) {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();

        ......

        Gauge.builder("jvm.threads.daemon", threadBean, ThreadMXBean::getDaemonThreadCount)
                .tags(tags)
                .description("The current number of live daemon threads")
                .baseUnit(BaseUnits.THREADS)
                .register(registry);

        ......
    }
  1. ThreadImpl.java
public int getDaemonThreadCount() {
        return this.jvm.getDaemonThreadCount();
}

When I debugged the getDaemonThreadCount method in ThreadImpl, I found that it did decrease gradually and then became negative.

liuxuzxx avatar Jun 30 '25 09:06 liuxuzxx

I used the jstack command to print the thread stack of the pyroscope-java agent, and I found that there were two daemon threads.

"Async-profiler Timer" #538 daemon prio=5 os_prio=0 tid=0x00007e9810003000 nid=0x279789 runnable [0x0000000000000000]
"PyroscopeProfilingScheduler" #46 daemon prio=5 os_prio=0 tid=0x00007e98fd1c4000 nid=0x278fc3 waiting on condition [0x00007e9821dfd000]

But when I print a jstack every 10 seconds or so, I find that the ids of the two daemon threads output each time are different

"Async-profiler Timer" #644 daemon prio=5 os_prio=0 tid=0x00007e98100cb800 nid=0x279b96 runnable [0x0000000000000000]
"PyroscopeProfilingScheduler" #46 daemon prio=5 os_prio=0 tid=0x00007e98fd1c4000 nid=0x278fc3 waiting on condition [0x00007e9821dfd000]

The number of jvm_threads_daemon_threads becomes negative. Could this be caused by the high frequency creation and destruction of the daemon thread Async-profiler Timer?

liuxuzxx avatar Jun 30 '25 09:06 liuxuzxx