How to restart all k8s and OpenPAI services?
My dev machine and master node were rebooted, and afterwards k8s and OpenPAI did not start. What command should I run on the master node to start the services (k8s and OpenPAI)?
Thanks! God bless you.
All services will start automatically. One possible situation is that you are using a swap partition. In this case, the cluster will not start automatically because k8s does not support swap.
Thanks! If I have to restart the k8s and PAI services, how should I do it?
Just disable swap from the system level.
Bro, I'm asking whether there is a way to restart the whole set of k8s services, not how to turn swap on or off.
sudo systemctl restart kubelet.service
But you cannot start kubelet without disabling swap.
You can't. k8s is meant to run forever, and there is no 'one-click' restart. If other pods on k8s besides PAI, or even the kube-apiserver(s), also failed to start, then your cluster may be corrupted. You have to check the logs to see what happened.
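As a rough sketch of what "check the logs" could look like for the swap case discussed in this thread: the helper below scans log text on stdin for kubelet's swap complaint. The function name kubelet_blocked_by_swap is hypothetical, and the exact wording of the message may differ between kubelet versions, so treat the pattern as an assumption.

```shell
#!/bin/sh
# Sketch only: kubelet_blocked_by_swap is a hypothetical helper name.
# It reads log lines on stdin and returns 0 if any line contains kubelet's
# "swap on is not supported" complaint (matched case-insensitively, since
# the capitalisation is assumed to vary between versions).
kubelet_blocked_by_swap() {
    grep -qi 'swap on is not supported'
}

# Possible usage on a node where kubelet runs as a systemd unit:
#   journalctl -u kubelet --no-pager -n 200 | kubelet_blocked_by_swap \
#       && echo "kubelet is refusing to start because swap is enabled"
```

If the helper finds nothing, the cause is probably something else, and the full kubelet log is the place to look.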
Thanks, I understand.
I rebooted one of the nodes in our cluster, but the services on this node didn't resume. I checked the containers and got this:
sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5098da9556d9 openpai/storage-manager "/usr/bin/entrypoint…" 8 hours ago Exited (137) About an hour ago k8s_storage-manager_storage-manager-ds-5z7vp_default_fd8f91fc-df77-4969-a1a1-a47eb3fef555_0
41e1dfecb009 mirrorgooglecontainers/pause-amd64:3.1 "/pause" 8 hours ago Exited (0) About an hour ago k8s_POD_storage-manager-ds-5z7vp_default_fd8f91fc-df77-4969-a1a1-a47eb3fef555_0
894baf6e53d4 openpai/node-exporter "/bin/node_exporter …" 8 hours ago Exited (2) About an hour ago k8s_node-exporter_node-exporter-h92vc_default_bb11a9f2-15a1-4cb3-bf1f-6cbf69c1806e_0
9b6bfe70aa34 mirrorgooglecontainers/pause-amd64:3.1 "/pause" 8 hours ago Exited (0) About an hour ago k8s_POD_node-exporter-h92vc_default_bb11a9f2-15a1-4cb3-bf1f-6cbf69c1806e_0
8432a3d1fffd openpai/log-manager-nginx "/usr/local/openrest…" 8 hours ago Exited (0) About an hour ago k8s_log-manager-nginx_log-manager-ds-5q2gk_default_a371ef21-b09b-4707-907c-0b1035a3ae4e_0
4cea45e035ef openpai/log-manager-cleaner "/sbin/tini -- /usr/…" 8 hours ago Exited (143) About an hour ago k8s_log-cleaner_log-manager-ds-5q2gk_default_a371ef21-b09b-4707-907c-0b1035a3ae4e_0
3b9990b8b051 mirrorgooglecontainers/pause-amd64:3.1 "/pause"
...
I tried to restart the cluster with ./paictl service stop && ./paictl service start. However, the situation remained the same.
When I tried to launch the docker containers manually, I got an error like this:
sudo docker start k8s_calico-node_calico-node-fpw4r_kube-system_d8279838-5f79-4a0a-8045-b97b79176bf2_7
Error response from daemon: cannot join network of a non running container: 618600369a1b0a08048ba229a4a3aa266a911ebbec452ee28bf3d03a5ea1e8db
Error: failed to start containers: k8s_calico-node_calico-node-fpw4r_kube-system_d8279838-5f79-4a0a-8045-b97b79176bf2_7
Then I also noticed the following statement earlier in this thread:
All services will start automatically. One possible situation is that you are using a swap partition. In this case, the cluster will not start automatically because k8s does not support swap.
So I tried the following:
sudo swapoff -a
sudo systemctl restart kubelet.service
Luckily, the services resumed and the node came back online. I'm writing this down in case anyone else runs into the same problem ;)
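The two commands above could be wrapped so that swap is only touched when it is actually on. This is a minimal sketch, assuming a Linux node where active swap devices are listed in /proc/swaps; swap_is_active and recover_node are hypothetical names, not part of k8s or OpenPAI.

```shell
#!/bin/sh
# Sketch only: swap_is_active and recover_node are hypothetical helper names.

# Return 0 if the given swaps table (default: /proc/swaps) lists at least one
# active swap device, i.e. has any non-empty line after the header row.
swap_is_active() {
    swaps_file="${1:-/proc/swaps}"
    [ "$(tail -n +2 "$swaps_file" | grep -c .)" -gt 0 ]
}

# Turn swap off if it is on, then restart kubelet (the same two commands
# used in the post above). Not called automatically; run it by hand.
recover_node() {
    if swap_is_active; then
        sudo swapoff -a
    fi
    sudo systemctl restart kubelet.service
}
```

Running recover_node on the affected node would then perform the same recovery; note that swapoff -a only lasts until the next reboot.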
You need to disable swap permanently; otherwise, any node will hit this issue again after it restarts.
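One common way to make this permanent is to comment out the swap entries in /etc/fstab. The sketch below assumes swap is configured there (and not through some other mechanism such as a systemd swap unit); comment_out_swap is a hypothetical helper name. It only prints the modified table to stdout, so nothing is changed until you write the result back yourself.

```shell
#!/bin/sh
# Sketch only: comment_out_swap is a hypothetical helper name.
# Print the given fstab-style file with every uncommented line whose fields
# include "swap" prefixed with '#'. The input file itself is not modified.
comment_out_swap() {
    sed -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Possible usage (keeps a backup of the original table):
#   sudo cp /etc/fstab /etc/fstab.bak
#   comment_out_swap /etc/fstab.bak | sudo tee /etc/fstab > /dev/null
#   sudo swapoff -a
```

Printing to stdout instead of editing in place makes it easy to review the result before replacing /etc/fstab.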