incubator-streampark

Restarting a Flink SQL job actually fails, but the option logs show the start status as success (after the previous run hit an error and its run status was failed)

Open huangkaiyan10 opened this issue 3 years ago • 6 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

Restarting a Flink SQL job actually fails, but the option logs show the start status as success. This happens after the previous run of the job hit an error and its run status was failed. I use Flink native on Kubernetes. In this scenario there is no log that explains why the restart actually failed. At the same time, kubectl get pod -A does not show any Flink SQL pod, so kubectl logs <pod-id> cannot be used to find out why it failed.

StreamPark Version

1.2.4-dev

Java Version

jdk8

Flink Version

1.14.5-2.12

Scala Version of Flink

2.12

Error Exception

there is no log

Screenshots

image

image

Are you willing to submit PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

huangkaiyan10 avatar Oct 26 '22 01:10 huangkaiyan10

[root@server237 logs]# cat warn.2022-10-26.log
2022-10-26 09:11:45.310 StreamPark [streampark-deploy-executor-3] WARN o.a.flink.core.plugin.PluginConfig:69 - The plugins directory [plugins] does not exist.
2022-10-26 09:11:45.319 StreamPark [streampark-deploy-executor-3] WARN o.a.flink.core.plugin.PluginConfig:69 - The plugins directory [plugins] does not exist.
2022-10-26 09:11:50.230 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:11:50.409 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:11:55.036 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:11:55.226 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:12:00.035 StreamPark [ForkJoinPool-1-worker-30] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:12:00.203 StreamPark [ForkJoinPool-1-worker-30] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:12:00.377 StreamPark [streampark-deploy-executor-4] WARN o.a.flink.core.plugin.PluginConfig:69 - The plugins directory [plugins] does not exist.
2022-10-26 09:12:00.385 StreamPark [streampark-deploy-executor-4] WARN o.a.flink.core.plugin.PluginConfig:69 - The plugins directory [plugins] does not exist.
2022-10-26 09:12:05.223 StreamPark [ForkJoinPool-1-worker-44] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:12:05.410 StreamPark [ForkJoinPool-1-worker-44] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:59745 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:12:10.039 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:12:10.232 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:59745 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:12:15.038 StreamPark [ForkJoinPool-1-worker-23] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:12:15.234 StreamPark [ForkJoinPool-1-worker-23] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:12:20.035 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:12:20.220 StreamPark [ForkJoinPool-1-worker-51] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:46:35.230 StreamPark [streampark-deploy-executor-5] WARN o.a.flink.core.plugin.PluginConfig:69 - The plugins directory [plugins] does not exist.
2022-10-26 09:46:35.237 StreamPark [streampark-deploy-executor-5] WARN o.a.flink.core.plugin.PluginConfig:69 - The plugins directory [plugins] does not exist.
2022-10-26 09:46:40.210 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:46:40.397 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:46:45.036 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:46:45.217 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:05.035 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:05.224 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:10.038 StreamPark [ForkJoinPool-1-worker-59] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:10.246 StreamPark [ForkJoinPool-1-worker-59] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:15.036 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:15.223 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:30.038 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:30.231 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:35.039 StreamPark [ForkJoinPool-1-worker-2] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:35.229 StreamPark [ForkJoinPool-1-worker-2] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:40.043 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:40.264 StreamPark [ForkJoinPool-1-worker-16] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
2022-10-26 09:47:45.035 StreamPark [ForkJoinPool-1-worker-2] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:220 - Failed to visit remote flink jobs on kubernetes-native-mode cluster, and the retry access logic is performed.
2022-10-26 09:47:45.225 StreamPark [ForkJoinPool-1-worker-2] WARN o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, errorStack=Connect to http://172.18.17.222:64399 [/172.18.17.222] failed: Connection refused.
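The warn log above is repetitive, so it may help to condense it into one row per endpoint that the status watcher could not reach. The following is only a minimal sketch, not part of StreamPark: `summarize` and `SAMPLE` are hypothetical names, and the regular expression assumes the exact log layout shown in this issue.

```python
import re

# Matches the "retry fetch failed" entries written by FlinkJobStatusWatcher and
# captures the timestamp and the endpoint. Works whether the log has one entry
# per line or several entries run together on one line.
FAILED_FETCH = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) StreamPark \[[^\]]+\] WARN "
    r"\S*FlinkJobStatusWatcher:\d+ - The retry fetch failed, final status failed, "
    r"errorStack=Connect to (http://\S+)"
)

def summarize(text):
    """Return {endpoint: (count, first_timestamp, last_timestamp)}."""
    summary = {}
    for match in FAILED_FETCH.finditer(text):
        ts, url = match.group(1), match.group(2)
        count, first, _ = summary.get(url, (0, ts, ts))
        summary[url] = (count + 1, first, ts)
    return summary

# Three entries copied from the log in this issue, joined on a single line.
SAMPLE = (
    "2022-10-26 09:11:50.409 StreamPark [ForkJoinPool-1-worker-51] WARN "
    "o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, "
    "errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: x. "
    "2022-10-26 09:11:55.226 StreamPark [ForkJoinPool-1-worker-51] WARN "
    "o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, "
    "errorStack=Connect to http://172.18.17.222:56651 [/172.18.17.222] failed: x. "
    "2022-10-26 09:12:05.410 StreamPark [ForkJoinPool-1-worker-44] WARN "
    "o.a.s.f.k.w.FlinkJobStatusWatcher:227 - The retry fetch failed, final status failed, "
    "errorStack=Connect to http://172.18.17.222:59745 [/172.18.17.222] failed: x."
)

if __name__ == "__main__":
    for url, (count, first, last) in summarize(SAMPLE).items():
        print(url, count, first, last)
```

Run against the full file (for example `summarize(open("warn.2022-10-26.log").read())`), this groups the failures by port, which makes it easier to see that each restart attempt is polled on a different REST port (56651, 59745, 64399) and that none of them ever accepts a connection.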

huangkaiyan10 avatar Oct 26 '22 01:10 huangkaiyan10

On the StreamPark deploy host, kubectl get node and kubectl logs both work.
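Because kubectl itself works from the deploy host, it may be worth checking the JobManager REST endpoint from the warn log separately. The sketch below is a hedged illustration only: `probe` is a hypothetical helper, and the host and port are simply the values that appear in the log. It distinguishes an active refusal (nothing is listening, which is consistent with the pod already being gone) from a timeout or routing problem.

```python
import socket

def probe(host, port, timeout=3.0):
    """Return 'open', 'refused', or 'unreachable: <reason>' for a TCP endpoint."""
    try:
        # create_connection performs the TCP handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on that port.
        return "refused"
    except OSError as exc:
        # Timeouts, no route to host, name resolution failures, and so on.
        return f"unreachable: {exc}"

if __name__ == "__main__":
    # Endpoint taken from the warn log in this issue.
    print(probe("172.18.17.222", 64399))
```

A result of "refused" only confirms what the watcher already reported. It does not say why the Flink cluster went away, so the pod logs or Kubernetes events are still needed for the root cause.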

huangkaiyan10 avatar Oct 26 '22 01:10 huangkaiyan10

c9ef92f7f3156e6ed4008e0913c4781 According to the Kubernetes events, the Flink SQL job was started. Perhaps the job hit an error again and failed to run, and the Kubernetes pod was then simply deleted.

huangkaiyan10 avatar Oct 26 '22 01:10 huangkaiyan10

In this case, the Kubernetes pod should not be deleted automatically. Could the StreamPark web UI add a button to force-delete it? Alternatively, the failed pod could be deleted just before the next restart.

huangkaiyan10 avatar Oct 26 '22 02:10 huangkaiyan10

In this case (Flink native on Kubernetes), the Kubernetes pod should not be deleted automatically. Could the StreamPark web UI add a button to force-delete it? Alternatively, the failed pod could be deleted just before the next restart.

huangkaiyan10 avatar Oct 26 '22 02:10 huangkaiyan10

3d4e523d2e1ccc80181f73501e22cdf

huangkaiyan10 avatar Oct 26 '22 02:10 huangkaiyan10