
Prometheus can't collect metrics from `hubble-metrics` using `cilium hubble enable` command

Open Shunpoco opened this issue 3 years ago • 3 comments

Hi, I ran into unexpected behavior when running cilium hubble enable to enable Hubble and collect its metrics with Prometheus.

Bug report

General Information

  • Cilium CLI version: I checked both v0.12.11 and the master branch
  • Orchestration system version in use: v1.25.4
  • Platform / infrastructure information: Building on VMs using kubeadm (kubernetes v1.23.9)

How to reproduce the issue

  1. Run cilium install with the following options:
cilium install --helm-set prometheus.enabled=true --helm-set operator.prometheus.enabled=true
  2. Then run cilium hubble enable with the following options:
cilium hubble enable --ui --helm-set hubble.metrics.enabled="{dns,drop,tcp,flow,icmp,http}"

Hubble resources are deployed, and hubble-metrics service is created.

Expected behavior: Prometheus can reach hubble-metrics (port 9965 by default) and collect metrics.

Actual behavior: Prometheus doesn't collect any metrics from the endpoint.

The cause of the problem

  1. The backends of the hubble-metrics service are the pods labeled k8s-app=cilium (that is, the cilium pods from the cilium daemonset), and the target port is 9965 by default.
  2. However, the daemonset doesn't expose port 9965:
$ kubectl get daemonsets.apps -n kube-system cilium -o yaml | grep -A20 ports
        ports:
        - containerPort: 4244
          hostPort: 4244
          name: peer-service
          protocol: TCP
        - containerPort: 9962
          hostPort: 9962
          name: prometheus
          protocol: TCP
        - containerPort: 9964
          hostPort: 9964
          name: envoy-metrics
          protocol: TCP
        readinessProbe:
...
  3. Running cilium hubble enable with --helm-set hubble.metrics.enabled={...} updates the cilium-config configmap, restarts the cilium-xxx pods, and creates both the hubble-peer and hubble-metrics services. However, it does not update the cilium daemonset to add the port. The relevant code is here: https://github.com/cilium/cilium-cli/blob/master/hubble/hubble.go#L627-L665

  4. As a result, because the cilium pods don't expose port 9965, Prometheus can't collect metrics through hubble-metrics.
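
The missing-port check above can be scripted. The following is a minimal sketch, not part of cilium-cli: `has_container_port` is a hypothetical helper name, and it only greps the manifest text for a `containerPort` line rather than parsing YAML.

```shell
#!/bin/sh
# has_container_port PORT
# Hypothetical helper (not part of cilium-cli or kubectl).
# Reads a pod or DaemonSet manifest (YAML) on stdin and succeeds only if
# a "containerPort: PORT" line is present. hostPort lines are ignored.
has_container_port() {
    grep -Eq "containerPort:[[:space:]]*$1[[:space:]]*\$"
}

# Example against a live cluster (not executed here):
#   kubectl get daemonsets.apps -n kube-system cilium -o yaml \
#     | has_container_port 9965 \
#     && echo "9965 is declared" || echo "9965 is missing"
```

With the daemonset output shown in this report, the check would succeed for 9962 and fail for 9965.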

Proposal

To make Prometheus scraping of Hubble work with cilium-cli as well as with Helm, we should update the cilium daemonset to add the hubble-metrics port when running cilium hubble enable --helm-set hubble.metrics.enabled={...}. I assume the code to add will be similar to updateConfigMap.
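
To illustrate the daemonset change this proposal describes, here is a rough sketch of the same edit applied by hand with `kubectl patch`. It is a manual workaround under stated assumptions, not the proposed cilium-cli fix: the helper name `hubble_metrics_port_patch`, the port name `hubble-metrics`, and container index 0 for the cilium agent are all assumptions, and the `hostPort` entry simply mirrors the existing port entries shown above.

```shell
#!/bin/sh
# hubble_metrics_port_patch [PORT]
# Hypothetical helper (not part of cilium-cli or kubectl).
# Prints a JSON Patch (RFC 6902) that appends one port entry to the first
# container of a pod template. Container index 0 and the port name
# "hubble-metrics" are assumptions made for this sketch.
hubble_metrics_port_patch() {
    port="${1:-9965}"
    printf '[{"op":"add","path":"/spec/template/spec/containers/0/ports/-","value":{"name":"hubble-metrics","containerPort":%d,"hostPort":%d,"protocol":"TCP"}}]\n' "$port" "$port"
}

# Example against a live cluster (not executed here):
#   kubectl patch daemonset cilium -n kube-system --type=json \
#     -p "$(hubble_metrics_port_patch 9965)"
```

A later cilium-cli or Helm run may overwrite a manual patch like this, which is one reason to fix it in the CLI itself.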

update: I found a similar issue: #412 .

Shunpoco avatar Dec 19 '22 06:12 Shunpoco

If the proposal is reasonable for you, I'd like to make a PR to fix the problem!

Shunpoco avatar Dec 19 '22 06:12 Shunpoco

Thanks for the report!

> If the proposal is reasonable for you, I'd like to make a PR to fix the problem!

It does seem reasonable to me, and we'd certainly take a look at a PR that fixes this issue!

rolinh avatar Jan 30 '23 13:01 rolinh

Thanks! If you have time, please assign this issue to me!

Shunpoco avatar Feb 02 '23 14:02 Shunpoco

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] avatar Sep 28 '25 02:09 github-actions[bot]

This issue has not seen any activity since it was marked stale. Closing.

github-actions[bot] avatar Oct 13 '25 02:10 github-actions[bot]