The type of kafka_producer_connection_count keeps changing between counter and gauge
OpenTelemetry Java agent version: 1.20.2
Kafka version: 3.1.1
OpenTelemetry Collector version: 0.66.0
OpenTelemetry Collector config file:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14317
  otlp/dummy: # Dummy receiver for the metrics pipeline
    protocols:
      grpc:
        endpoint: localhost:65535
processors:
  servicegraph:
    metrics_exporter: prometheus/servicegraph # Exporter to send metrics to
    dimensions: [cluster, namespace] # Additional dimensions (labels) to be added to the metrics extracted from the resource and span attributes
    store: # Configuration for the in-memory store
      ttl: 2s # Value to wait for an edge to be completed
      max_items: 200 # Amount of edges that will be stored in the storeMap
exporters:
  prometheus/servicegraph:
    endpoint: 0.0.0.0:9091 # to prometheus
  otlp:
    endpoint: http://localhost:4317 # to jaeger
    tls:
      insecure: true
  logging:
    logLevel: debug
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [servicegraph]
      exporters: [logging, otlp]
    metrics/servicegraph:
      receivers: [otlp]
      processors: []
      exporters: [prometheus/servicegraph]
When I refresh http://localhost:9091/metrics in the browser, I find that kafka_producer_connection_count keeps switching between counter and gauge:
# HELP kafka_producer_connection_count The current number of active connections.
# TYPE kafka_producer_connection_count counter
kafka_producer_connection_count{client_id="producer-1",job="otel-demo-provider",kafka_version="3.1.1",spring_id="kafkaProducerFactory.producer-1"} 1
# HELP kafka_producer_connection_count The current number of active connections.
# TYPE kafka_producer_connection_count gauge
kafka_producer_connection_count{client_id="producer-1",job="otel-demo-provider"} 1
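Note that the two series above carry different label sets: one has kafka_version and spring_id, the other only client_id and job, which suggests the same metric name is being produced by two different code paths. To see what the agent itself emits, independent of the collector, one option is to temporarily switch the agent's metrics exporter to the stdout one (a hedged sketch; on 1.x agents the exporter name is logging, on 2.x agents it is console, and otel.metric.export.interval only shortens the wait between exports):

-Dotel.metrics.exporter=logging
-Dotel.metric.export.interval=10000

The MetricData printed to the application log should show whether kafka_producer_connection_count leaves the agent as a sum or as a gauge.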
Hey @tuhao1020, what kind of metrics does the javaagent export on its own, excluding the collector? Let's make sure there's no interference on the collector side first.
@mateuszrzeszutek The metrics exported by the Java agent always keep the gauge type. Do you mean the collector modified the type? Theoretically, the collector does not modify this type, right?
Honestly, I've no idea if the collector modifies it or not - which is why we should first try to pinpoint which of these two (agent, collector) causes this to happen.
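For the collector side, the config above already defines a logging exporter at debug level but only uses it in the traces pipeline. A minimal sketch (only the metrics pipeline changes) that also logs what the collector receives over OTLP, so the incoming data point type can be compared against what the Prometheus endpoint later serves:

service:
  pipelines:
    metrics/servicegraph:
      receivers: [otlp]
      processors: []
      exporters: [logging, prometheus/servicegraph] # logging added only to inspect incoming metric types

If the logged type is always a gauge while /metrics flips between counter and gauge, the flip happens in the prometheus exporter; if the logged type itself alternates, the agent (or something in front of the collector) is sending both.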
@mateuszrzeszutek Does #7271 have anything to do with this? I'm using Kafka 3.3.1, but the kafka_version label on the metrics is 3.1.1.
No, that PR is about Spring Kafka, which has a different versioning scheme from Kafka.
Same problem.
Hi @mateuszrzeszutek, I can still reproduce this error with a simple Spring Kafka demo (https://github.com/MaheshIare/spring-boot-kafka-demo/tree/master?tab=readme-ov-file).
Any ideas on how to troubleshoot this?
Java agent version: 1.24.0
Using the configuration below.
Java env config:
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:14318;OTEL_EXPORTER_PROMETHEUS_PORT=10000;OTEL_METRICS_EXPORTER=otlp;OTEL_SERVICE_NAME=test-kfk
CLI arguments:
-Dotel.instrumentation.runtime-metrics.experimental-metrics.enabled=true
VM:
-javaagent:/Users/yuan/Dev/IdeaProjects/otel-java-instrumentation/alauda-extension/build/libs/opentelemetry-javaagent-ext.jar
OTel Collector version: 0.100.0
Config:
extensions:
  # The health_check extension is mandatory for this chart.
  # Without the health_check extension the collector will fail the readiness and liveliness probes.
  # The health_check extension can be modified, but should never be removed.
  health_check: {}
  memory_ballast:
    size_in_percentage: 40
receivers:
  otlp/traces:
    protocols:
      grpc:
        endpoint: :14317
  otlp/metrics:
    protocols:
      grpc:
        endpoint: :14318
  zipkin:
exporters:
  logging:
    loglevel: info
  otlp/metrics:
    endpoint: :14318
    tls:
      insecure: true
  prometheus:
    endpoint: :8889
service:
  extensions:
    - health_check
    - memory_ballast
  telemetry:
    logs:
      level: info
    metrics:
      level: detailed
      address: :8888
  pipelines:
    metrics:
      receivers: [otlp/metrics]
      exporters: [prometheus]
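The same check would apply here (a sketch against the config above as posted): the logging exporter is defined but not part of the metrics pipeline, so the incoming data point types never show up in the collector log. On collector 0.100.0 the debug exporter is the non-deprecated replacement for logging with a loglevel:

exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    metrics:
      receivers: [otlp/metrics]
      exporters: [debug, prometheus]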
The OTel Collector log is as follows:
2024-05-29T18:44:55.442+0800 error [email protected]/log.go:23 error gathering metrics: collected metric kafka_consumer_connection_count label:{name:"client_id" value:"consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"} label:{name:"job" value:"test-kfk"} gauge:{value:2} should be a Counter
{"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/log.go:23
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
github.com/prometheus/[email protected]/prometheus/promhttp/http.go:144
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2122
net/http.(*ServeMux).ServeHTTP
net/http/server.go:2500
go.opentelemetry.io/collector/config/confighttp.(*decompressor).ServeHTTP
go.opentelemetry.io/collector/config/[email protected]/compression.go:147
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:212
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
go.opentelemetry.io/collector/config/[email protected]/clientinfohandler.go:28
net/http.serverHandler.ServeHTTP
net/http/server.go:2936
net/http.(*conn).serve
net/http/server.go:1995
2024-05-29T18:44:55.443+0800 error [email protected]/log.go:23 error gathering metrics: collected metric kafka_consumer_connection_count label:{name:"client_id" value:"consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"} label:{name:"job" value:"test-kfk"} label:{name:"kafka_version" value:"2.6.0"} label:{name:"spring_id" value:"kafkaConsumerFactory.consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"} counter:{value:2} should be a Gauge
{"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/log.go:23
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
github.com/prometheus/[email protected]/prometheus/promhttp/http.go:144
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2122
net/http.(*ServeMux).ServeHTTP
net/http/server.go:2500
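The two errors above are the same metric name arriving twice: once as a gauge with only client_id/job labels, and once as a counter with the extra kafka_version and spring_id labels, which suggests one copy comes from the agent's own Kafka client metrics and one is bridged from Spring Kafka's Micrometer metrics; the Prometheus exporter refuses to serve one name with two types. One way to confirm (and, as a stopgap, avoid) the duplication is to disable one of the two sources. The property names below follow the agent's instrumentation-suppression convention but are my assumption; verify them against the suppression docs for your agent version:

# disable the Micrometer bridge (drops the series with spring_id / kafka_version) ...
-Dotel.instrumentation.micrometer.enabled=false
# ... or instead disable the agent's own Kafka client metric reporter
-Dotel.instrumentation.kafka.metric-reporter.enabled=false

Disabling one source only removes the type collision for the shared metric names; it does not change which type the remaining series is exported as.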
Can anyone fix this? I also have the same issue, but only with the Kafka metrics.
Which version of the agent do you use? @quanbisen I resolved this issue by upgrading to a higher 1.x version.
I use opentelemetry-javaagent - version: 2.7.0.
and my collector's partial output log is below:
* collected metric kafka_producer_byte_total label:{name:"client_id" value:"producer-1"} label:{name:"host_arch" value:"amd64"} label:{name:"host_name" value:"test01-01"} label:{name:"instance" value:"e16e5e18-3d97-494b-b453-599114fd40fb"} label:{name:"job" value:"lebo-desk"} label:{name:"os_description" value:"Linux 3.10.0-1160.88.1.el7.x86_64"} label:{name:"os_type" value:"linux"} label:{name:"process_command_line" value:"/usr/java/jdk1.8.0_101/jre/bin/java -javaagent:../opentelemetry-agent/opentelemetry-javaagent.jar -Dotel.service.name=lebo-desk -Dotel.exporter.otlp.endpoint=http://10.0.8.48:4318 -Dspring.cloud.nacos.discovery.server-addr=nacos.lebo.lc:80 -Dspring.cloud.nacos.config.server-addr=nacos.lebo.lc:80 -Xms1000m -Xmx1000m lebo-desk.jar --spring.profiles.active=test"} label:{name:"process_executable_path" value:"/usr/java/jdk1.8.0_101/jre/bin/java"} label:{name:"process_pid" value:"2900360"} label:{name:"process_runtime_description" value:"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.101-b13"} label:{name:"process_runtime_name" value:"Java(TM) SE Runtime Environment"} label:{name:"process_runtime_version" value:"1.8.0_101-b13"} label:{name:"service_instance_id" value:"e16e5e18-3d97-494b-b453-599114fd40fb"} label:{name:"service_name" value:"lebo-desk"} label:{name:"telemetry_distro_name" value:"opentelemetry-java-instrumentation"} label:{name:"telemetry_distro_version" value:"2.7.0"} label:{name:"telemetry_sdk_language" value:"java"} label:{name:"telemetry_sdk_name" value:"opentelemetry"} label:{name:"telemetry_sdk_version" value:"1.41.0"} label:{name:"topic" value:"lebocloud_campaign_push"} gauge:{value:68528} should be a Counter
* collected metric kafka_producer_response_total label:{name:"client_id" value:"producer-1"} label:{name:"host_arch" value:"amd64"} label:{name:"host_name" value:"test01-01"} label:{name:"instance" value:"3853ea0d-7c32-43fe-9076-535398225ec2"} label:{name:"job" value:"vipauth-out"} label:{name:"node_id" value:"node-0"} label:{name:"os_description" value:"Linux 3.10.0-1160.88.1.el7.x86_64"} label:{name:"os_type" value:"linux"} label:{name:"process_command_line" value:"/usr/java/jdk1.8.0_101/jre/bin/java -javaagent:../opentelemetry-agent/opentelemetry-javaagent.jar -Dotel.service.name=vipauth-out -Dotel.exporter.otlp.endpoint=http://10.0.8.48:4318 -Dlog4j2.formatMsgNoLookups=true -Xms1024m -Xmx1024m VipAuth-out.jar --spring.profiles.active=prd"} label:{name:"process_executable_path" value:"/usr/java/jdk1.8.0_101/jre/bin/java"} label:{name:"process_pid" value:"1433623"} label:{name:"process_runtime_description" value:"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.101-b13"} label:{name:"process_runtime_name" value:"Java(TM) SE Runtime Environment"} label:{name:"process_runtime_version" value:"1.8.0_101-b13"} label:{name:"service_instance_id" value:"3853ea0d-7c32-43fe-9076-535398225ec2"} label:{name:"service_name" value:"vipauth-out"} label:{name:"service_version" value:"0.0.1-SNAPSHOT"} label:{name:"telemetry_distro_name" value:"opentelemetry-java-instrumentation"} label:{name:"telemetry_distro_version" value:"2.7.0"} label:{name:"telemetry_sdk_language" value:"java"} label:{name:"telemetry_sdk_name" value:"opentelemetry"} label:{name:"telemetry_sdk_version" value:"1.41.0"} counter:{value:24163} should be a Gauge
* collected metric kafka_producer_successful_authentication_total label:{name:"client_id" value:"producer-1"} label:{name:"host_arch" value:"amd64"} label:{name:"host_name" value:"test01-01"} label:{name:"instance" value:"99db74e7-5361-4a4d-97c2-5627f5d712a1"} label:{name:"job" value:"user-service-boot"} label:{name:"os_description" value:"Linux 3.10.0-1160.88.1.el7.x86_64"} label:{name:"os_type" value:"linux"} label:{name:"process_command_line" value:"/usr/java/jdk1.8.0_101/jre/bin/java -javaagent:../opentelemetry-agent/opentelemetry-javaagent.jar -Dotel.config.file=otel-config.properties -Dotel.service.name=user-service-boot -Dotel.exporter.otlp.endpoint=http://10.0.8.48:4318 -Xms256m -Xmx712m -jar user-service-boot-1.0.0.jar"} label:{name:"process_executable_path" value:"/usr/java/jdk1.8.0_101/jre/bin/java"} label:{name:"process_pid" value:"2221142"} label:{name:"process_runtime_description" value:"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.101-b13"} label:{name:"process_runtime_name" value:"Java(TM) SE Runtime Environment"} label:{name:"process_runtime_version" value:"1.8.0_101-b13"} label:{name:"service_instance_id" value:"99db74e7-5361-4a4d-97c2-5627f5d712a1"} label:{name:"service_name" value:"user-service-boot"} label:{name:"service_version" value:"1.0.0"} label:{name:"telemetry_distro_name" value:"opentelemetry-java-instrumentation"} label:{name:"telemetry_distro_version" value:"2.7.0"} label:{name:"telemetry_sdk_language" value:"java"} label:{name:"telemetry_sdk_name" value:"opentelemetry"} label:{name:"telemetry_sdk_version" value:"1.41.0"} counter:{value:0} should be a Gauge
{"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/log.go:23
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
github.com/prometheus/[email protected]/prometheus/promhttp/http.go:144
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2136
net/http.(*ServeMux).ServeHTTP
net/http/server.go:2514
go.opentelemetry.io/collector/config/confighttp.(*decompressor).ServeHTTP
go.opentelemetry.io/collector/config/[email protected]/compression.go:147
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP
go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:229
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1
go.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:81
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2136
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
go.opentelemetry.io/collector/config/[email protected]/clientinfohandler.go:28
net/http.serverHandler.ServeHTTP
net/http/server.go:2938
net/http.(*conn).serve
net/http/server.go:2009
Hi @quanbisen, I think we would need a minimal sample app that reproduces the issue in order to understand what's going on.
This issue has been labeled as stale due to lack of activity and needing author feedback. It will be automatically closed if there is no further activity over the next 7 days.