<install> StatefulSet FailedScheduling
It is not easy for me to install Robusta. When I install it using Helm, the pods cannot start:
alertmanager-robusta-kube-prometheus-st-alertmanager-0 0/2 Pending 0 18s
prometheus-robusta-kube-prometheus-st-prometheus-0 0/2 Pending 0 8s
The Event is:
Warning FailedScheduling 49s default-scheduler 0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
And kubectl get pv shows nothing.
What's wrong with it?
Hi 👋, thanks for opening an issue! Please note, it may take some time for us to respond, but we'll get back to you as soon as we can!
It seems there is no StorageClass and no PV. The Pending PVC is:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: "2024-12-01T11:55:07Z"
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    alertmanager: robusta-kube-prometheus-st-alertmanager
    app.kubernetes.io/instance: robusta-kube-prometheus-st-alertmanager
    app.kubernetes.io/managed-by: prometheus-operator
    app.kubernetes.io/name: alertmanager
  name: alertmanager-robusta-kube-prometheus-st-alertmanager-db-alertmanager-robusta-kube-prometheus-st-alertmanager-0
  namespace: default
  resourceVersion: "398335"
  uid: 9c481da6-803e-43cf-8f34-c23137203bd0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
status:
  phase: Pending
Hi @wiluen ,
Thanks for reporting this. Which k8s distribution are you using? Is it on-prem, or in a public cloud (Amazon, Google, other)?
This might happen if the cluster doesn't have a storage provisioner (the component responsible for creating a PV from the PVC).
How do you typically create persistent volumes?
Can you share the output of:
kubectl get storageclass ?
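If it turns out there is no provisioner at all, on-prem lab clusters often solve this with a lightweight dynamic provisioner. One common option (just an example, not something Robusta requires) is Rancher's local-path-provisioner, which can then be marked as the default StorageClass:

# install the provisioner and set its StorageClass as the cluster default
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
kubectl patch storageclass local-path -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'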
Thanks for your reply.
My k8s cluster is on-prem.
Yes, I don't have a StorageClass; kubectl get storageclass returns nothing.
It is a lab cluster on campus. I just created a PV manually and it bound to the PVC.
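(For reference, a minimal manually-created PV that such a PVC can bind to looks roughly like the sketch below; the name and hostPath path are placeholders, not the exact manifest used here:)

apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-pv            # placeholder name
spec:
  capacity:
    storage: 10Gi                  # must be >= the PVC request
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/alertmanager       # placeholder path on a node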
But another question: prometheus-robusta-kube-prometheus-st-prometheus-db-prometheus-robusta-kube-prometheus-st-prometheus-0 requires 100Gi, but my VMs don't have a 100Gi disk, and I also cannot edit the resources.requests.storage field on the PVC.
What can I do?
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi # I want a smaller value
  volumeMode: Filesystem
Hi @wiluen
You can change the storage size in the generated_values.yaml file:
kube-prometheus-stack:
prometheus:
prometheusSpec:
storageSpec:
volumeClaimTemplate:
spec:
resources:
requests:
storage: 10Gi
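After editing generated_values.yaml, the change is applied with a regular Helm upgrade. A sketch, assuming the release is named robusta and the chart repo was added as robusta:

helm upgrade robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

Note that an already-created PVC is not resized automatically, so you may need to delete the Pending PVC and let it be recreated with the new size.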
Thanks very much, @arikalon1!
There is still a bug with the glusterfs images:
docker pull quay.io/gluster/gluster-centos:latest
latest: Pulling from gluster/gluster-centos
[DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of quay.io/gluster/gluster-centos:latest to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/
@arikalon1
How can I get all of the configurable fields for generated_values.yaml?
Hi @wiluen
Where do we have a reference to gluster-centos in Robusta?
Can you share more details?
Regarding the configuration options, you can see most of them in our default values.yaml file.
Robusta also has a dependency on kube-prometheus-stack, which you can configure via the Robusta generated_values.yaml file as well.
The config values of kube-prometheus-stack can be found here
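You can also dump the chart's full default values locally to browse every configurable field. A sketch, assuming the Robusta Helm repo was added as robusta:

helm show values robusta/robusta > robusta-defaults.yaml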
Hi @arikalon1
Actually, I don't know what it is, but it appears in my k8s cluster and I thought it was part of Robusta. There are also some Jobs created when I deploy Robusta that I don't recognize. So, in my screenshot, am I missing any important pods?
hey @wiluen
The glusterfs pods look like some DaemonSet, but they are not part of Robusta.
When Robusta starts, it runs an efficiency scan, krr.
You can later see the results in the UI.
It helps right-size your k8s workloads (setting the correct resource requests and limits).
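If you want to run the same scan yourself, KRR also has a standalone CLI. A sketch, assuming the krr CLI is installed locally and your kubeconfig points at this cluster:

krr simple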
Looks like your Robusta installation is up and healthy!
Hi @arikalon1, there are so many problems. Thank you for your patient answers.
I finished the install; it is easy to install when there are no network problems. I used enablePrometheusStack: true
and deployed a crashing pod to test.
(1) But I can't connect to Prometheus.
(2) In the logs of the pod prometheus-robusta-kube-prometheus-st-prometheus-0, there is an error:
ts=2024-12-14T07:18:39.171Z caller=notifier.go:530 level=error component=notifier alertmanager=http://172.20.245.213:9093/api/v2/alerts count=1 msg="Error sending alert" err="Post "http://172.20.245.213:9093/api/v2/alerts": context deadline exceeded"
(3) Besides, I see that the AI can summarize logs, but in the HolmesGPT UI it can't connect to the GPT.
The log summary itself seems right.
Hi @wiluen
Do you have network policies in your cluster? The Robusta components need to be able to connect to one another.
Can you share the robusta-runner and Alertmanager logs? Does the IP that Prometheus is trying to connect to really belong to Alertmanager?
Hi @arikalon1
The robusta-runner log is:
ERROR Couldn't connect to Prometheus found under http://robusta-kube-prometheus-st-prometheus.default.svc.cluster.local:9090
hey @wiluen
Can you share kubectl get pods -o wide so the pod IPs are visible?
I'm trying to check whether this is indeed the Alertmanager pod IP.
Do you have network policies defined in the cluster?
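One quick way to test connectivity from inside the cluster (a sketch, assuming it's OK to run a throwaway curl pod) is:

kubectl run conn-test --rm -it --restart=Never --image=curlimages/curl --command -- curl -sS -m 5 http://robusta-kube-prometheus-st-prometheus.default.svc.cluster.local:9090/-/healthy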
Hi @arikalon1, here are the results:
I don't think there are any additional network policies, because my cluster is just a simple testing cluster.
The IP seems right, but Prometheus is not able to connect to Alertmanager. In addition, it looks like robusta-runner is not able to connect to Prometheus or Holmes.
I suspect there are some network restrictions in the cluster.
Can you share:
kubectl get networkpolicies -A ?
nothing here
Hi @arikalon1,
What are the IPs and ports of Alertmanager, Prometheus, and Holmes?
I'm not sure what they are. If I knew them, I might have a way to solve it.
You can see them in the pods list you shared.
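For example (a sketch; the exact service names may differ slightly in your install):

kubectl get svc -n default | grep -Ei 'alertmanager|prometheus|holmes'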