postgres-operator

Recommended way to monitor disk usage

Open RossVertizan opened this issue 3 years ago • 3 comments

Please answer some short questions to help us understand your problem / question better:

  • Which image of the operator are you using? v1.7.0
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? Google Kubernetes Engine (GKE)
  • Are you running Postgres Operator in production? No
  • Type of issue? Question

I'm hoping this is a simple question with a simple answer. Is there a recommended way to monitor disk space usage? I understand that one can use pg_database_size and related commands; however, this does not (as far as I can see) include the disk space used by the log files. To truly see the disk space being used, one must use something like df -h.
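For reference, a minimal sketch of what pg_database_size reports, run from Python (the connection parameters are placeholders, and it assumes psycopg2 is installed):

    # Sketch: what Postgres itself reports per database. Note this covers the
    # data files only -- not WAL, server logs, or anything else on the volume.
    # Connection parameters below are placeholders.
    import psycopg2

    conn = psycopg2.connect(host="my-cluster", dbname="postgres",
                            user="postgres", password="...")
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT datname, pg_size_pretty(pg_database_size(datname)) "
            "FROM pg_database ORDER BY pg_database_size(datname) DESC"
        )
        for datname, size in cur.fetchall():
            print(datname, size)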

This is fine interactively, but how would one include it in a monitoring script? There are solutions on StackOverflow such as this one, but how would one enable the cron job in the pod? Before I start hacking, I thought I would ask whether there is an 'official' way to do this. I have looked through the docs but didn't find anything that addresses this question.
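As an illustration of the kind of check such a script could perform, here is a minimal sketch that measures the volume the way df does (the data directory path and the threshold are assumptions):

    # Sketch: filesystem-level check, equivalent to df -h for the data volume.
    # The PGDATA path and threshold are assumptions -- adjust to your setup.
    import shutil
    import sys

    PGDATA = "/home/postgres/pgdata"   # assumed mount point of the data volume
    THRESHOLD = 0.90                   # assumed alert level: 90% full

    usage = shutil.disk_usage(PGDATA)  # total, used, free (bytes)
    fraction_used = usage.used / usage.total
    print(f"{PGDATA}: {fraction_used:.1%} used, "
          f"{usage.free / 2**30:.1f} GiB free")
    sys.exit(1 if fraction_used >= THRESHOLD else 0)

A Kubernetes CronJob, or a sidecar container sharing the volume, could run a check like this on a schedule instead of enabling cron inside the pod.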

Of course, the reason I would like to monitor disk space usage is because the database stops working when it runs out of disk space.

Thanks for any suggestions.

RossVertizan avatar May 26 '22 11:05 RossVertizan

We do this by periodically calling the bg_mon REST endpoint on port 8080 with ZMON.
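For illustration, a rough sketch of what such periodic polling could look like (the endpoint host is a placeholder, and the JSON handling is an assumption — inspect an actual bg_mon response for the real field names):

    # Rough sketch: poll the bg_mon endpoint on port 8080 and inspect the JSON.
    # The host is a placeholder; check a real response to see what bg_mon
    # actually returns before wiring this into alerting.
    import json
    import time
    import urllib.request

    URL = "http://my-cluster-0:8080/"

    while True:
        with urllib.request.urlopen(URL, timeout=5) as resp:
            stats = json.load(resp)
        # Dump a snippet of the payload; replace this with checks on the
        # disk-related fields once their actual names are known.
        print(json.dumps(stats)[:200])
        time.sleep(60)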

FxKu avatar May 30 '22 09:05 FxKu

I came to the same need. cAdvisor does not provide PVC usage/free metrics yet with containerd, so as a workaround we run the Prometheus node_exporter daemon set plus kube-state-metrics. Combining node_filesystem_avail_bytes, kube_persistentvolumeclaim_info and kube_pod_spec_volumes_persistentvolumeclaims_info gives the available bytes on the PVC for all pods/PVCs. The query is not pretty, but it gives what I need, and it is working well with EKS and one on-premise K8s cluster. When combined with kube_persistentvolumeclaim_resource_requests_storage_bytes you can also get the percentage of used space.

    sum without (device, instance, mountpoint, uid, account, fstype, Namespace,
                 app, chart, component, controller_revision_hash, heritage, job,
                 pod_template_generation, release, region) (
      (
        kube_pod_spec_volumes_persistentvolumeclaims_info{k8s_cluster=~".+"}
        * on (persistentvolumeclaim, k8s_cluster) group_left(volumename)
        kube_persistentvolumeclaim_info{}
      )
      * on (uid, volumename, k8s_cluster) group_right(persistentvolumeclaim, pod, volume)
      label_replace(
        label_replace(
          node_filesystem_avail_bytes{mountpoint=~".*(pvc-[a-z0-9\\-]*).*"},
          "volumename", "$1", "mountpoint", ".*(pvc-[a-z0-9\\-]*).*"
        ),
        "uid", "$1", "mountpoint",
        ".*/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})/.*"
      )
    ) / 1024 / 1024 / 1024

I also excluded some volumes that are (IMHO) not needed from node_exporter monitoring:

    collector.filesystem.mount-points-exclude: ^/(dev|proc|run.*|sys|etc.*|opt|local|mnt|var/lib/docker/.+|var/lib/containers/storage/.+|boot.*|(local/)?var/lib/bottlerocket|(local/)?var/lib/.*(kubernetes.io~projected|kubernetes.io~secret|kubernetes.io~empty-dir|volume-subpaths).*)($|/)
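As a rough sketch of consuming such a metric programmatically, here is a query against the Prometheus HTTP API (the Prometheus URL is a placeholder, and the selector is a simplified stand-in for the full join expression above):

    # Rough sketch: query the Prometheus HTTP API for free bytes on PVC mounts.
    # The Prometheus URL is a placeholder; the selector is simplified -- the
    # full join expression above can be pasted in verbatim instead.
    import json
    import urllib.parse
    import urllib.request

    PROM_URL = "http://prometheus:9090"
    QUERY = 'node_filesystem_avail_bytes{mountpoint=~".*pvc-.*"}'

    url = PROM_URL + "/api/v1/query?" + urllib.parse.urlencode({"query": QUERY})
    with urllib.request.urlopen(url, timeout=10) as resp:
        result = json.load(resp)

    for series in result["data"]["result"]:
        mountpoint = series["metric"].get("mountpoint", "?")
        avail_gib = float(series["value"][1]) / 2**30
        print(f"{mountpoint}: {avail_gib:.1f} GiB free")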

hau21um avatar May 30 '22 10:05 hau21um