load-watcher

Load watcher is a cluster-wide aggregator of metrics, developed for Trimaran: Real Load Aware Scheduler in Kubernetes.

Results 12 load-watcher issues

Version: load-watcher v0.2.3, Prometheus v2.50.1. `curl 10.105.174.136:2020/watcher`: ![image](https://github.com/paypal/load-watcher/assets/42016493/5b1b90c2-8c02-4055-973f-625369e27d88) Self-monitored data: ![image](https://github.com/paypal/load-watcher/assets/42016493/3fe4704b-4db2-49ef-b20c-8b3872e65200) ![image](https://github.com/paypal/load-watcher/assets/42016493/44093b18-19cf-47a9-b478-bd72f3c97c61) See https://github.com/paypal/load-watcher/issues/51. This suggestion should be open to consideration.

Hi all, I found this line in the README: `kubectl create -f manifests/load-watcher-deployment.yaml`, but I could not find `manifests/load-watcher-deployment.yaml` in the repo. Maybe a sample deployment file is needed? Thanks.

Resolve go: unsupported GOOS/GOARCH pair linux/aarch64
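For context on this error: Go names 64-bit ARM `arm64`, while `uname -m` on Linux reports `aarch64`, so passing the `uname` value straight into `GOARCH` is rejected. The fix in this PR is not shown here; the following is only a minimal sketch of normalizing the machine name before invoking the Go toolchain. The `goarch_for` helper and its mapping table are hypothetical and cover just the two common 64-bit cases.

```python
# Hypothetical helper: map `uname -m` style machine names to GOARCH values.
# Only the common 64-bit cases are listed; extend the table as needed.
UNAME_TO_GOARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def goarch_for(machine: str) -> str:
    """Return the GOARCH value for a machine name, or raise ValueError."""
    key = machine.strip().lower()
    if key not in UNAME_TO_GOARCH:
        raise ValueError(f"no known GOARCH for machine type {machine!r}")
    return UNAME_TO_GOARCH[key]


if __name__ == "__main__":
    # Build environment for a 64-bit ARM Linux host.
    print(f"GOOS=linux GOARCH={goarch_for('aarch64')}")
```

The printed pair can then be exported before running `go build`.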

* Allow users to change filter keys (for host & cluster name) in SignalFx.
* Use pointers instead of values for metrics clients.

TEST DONE:
* Verify metrics pulled to...

The current load-watcher Prometheus package uses the `instance:node_cpu:ratio` metric to calculate node utilization. However, while this value was still below 60%, I found that another metric, `instance:node_cpu_utilisation:rate1m`, was...

enhancement
question

It took me some time to find out what exactly the `instance:node_cpu:ratio` metric is. It seems the CPU and memory metrics come from the [helm-charts/charts/kube-prometheus-stack/templates/prometheus/rules/kube-prometheus-node-recording.rules.yaml](https://github.com/prometheus-community/helm-charts/blob/c4a7d10fdc6a0f694d9b97e9446207ba67d997dd/charts/kube-prometheus-stack/templates/prometheus/rules/kube-prometheus-node-recording.rules.yaml) rule, which has been removed and seems...
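One way to check which of the two recording rules mentioned in these issues actually exists on a given Prometheus server is to ask its instant-query HTTP API (`/api/v1/query`) for each name and count the returned series. The sketch below is not part of load-watcher; the server address and the helper names `build_query_url` and `series_count` are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Recording-rule names quoted in the issues above.
CANDIDATE_METRICS = [
    "instance:node_cpu:ratio",
    "instance:node_cpu_utilisation:rate1m",
]


def build_query_url(base_url: str, metric: str) -> str:
    """Return the Prometheus instant-query URL for `metric`."""
    query = urllib.parse.urlencode({"query": metric})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def series_count(base_url: str, metric: str, timeout: float = 5.0) -> int:
    """Return how many series the server currently reports for `metric`."""
    url = build_query_url(base_url, metric)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return len(payload["data"]["result"])


if __name__ == "__main__":
    base = "http://prometheus.example:9090"  # hypothetical address
    for name in CANDIDATE_METRICS:
        # Print the URLs only; call series_count(base, name) against a real server.
        print(name, "->", build_query_url(base, name))
```

A count of zero for a name suggests the corresponding recording rule is not installed on that server.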

Currently, it is confusing to know which load-watcher version is compatible with which kube-scheduler/scheduler-plugins version. We should have a table to declare the release compatibility.

documentation

Currently, no tests exist for each metric provider. These need to be added for code coverage and resilient clients.

enhancement

It would be nice to have contribution guidelines defined. Also, a script to check basic code formatting issues would save time in PR reviews and avoid unintentional oversights. This can...

documentation
enhancement

time="2025-01-12T05:05:16Z" level=error msg="received error while fetching metrics: nodes is forbidden: User \"system:serviceaccount:loadwatcher:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope" func="github.com/paypal/load-watcher/pkg/watcher.(*Watcher).StartWatching.func1" file="/go/src/github.com/paypal/load-watcher/pkg/watcher/watcher.go:136"
time="2025-01-12T05:05:16Z" level=error msg="received error...
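The first log entry says the `default` service account in the `loadwatcher` namespace lacks cluster-scoped permission to list `nodes`. A common remedy for this class of error is a ClusterRole plus ClusterRoleBinding. The sketch below generates such a pair as JSON that `kubectl apply -f -` accepts. It is an assumption about the fix, not the resolution recorded in the issue; the role name `load-watcher-node-reader` is hypothetical, and only the `list` verb named in the log is granted, so a real deployment may need more.

```python
import json


def build_rbac_manifests(
    namespace: str = "loadwatcher",       # namespace taken from the log line
    service_account: str = "default",     # service account taken from the log line
    role_name: str = "load-watcher-node-reader",  # hypothetical name
) -> dict:
    """Return a Kubernetes List with a ClusterRole and its ClusterRoleBinding."""
    cluster_role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": role_name},
        # Core API group ("") and the single verb the error complains about.
        "rules": [{"apiGroups": [""], "resources": ["nodes"], "verbs": ["list"]}],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": role_name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": role_name,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
    }
    return {"apiVersion": "v1", "kind": "List", "items": [cluster_role, binding]}


if __name__ == "__main__":
    print(json.dumps(build_rbac_manifests(), indent=2))
```

Saved under a file name of your choosing, the script's output can be piped to `kubectl apply -f -`. Running load-watcher under a dedicated service account rather than `default` would be a reasonable variation.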