dolphinscheduler
[Bug] [worker-server] Failed to find any Kerberos tgt
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
I found related issues in the issue list, but they don't seem to be resolved.
I'm using version 3.1.9.
What you expected to happen
[LOG-PATH]: /home/dolphinscheduler/dolphinscheduler/worker-server/logs/20240406/11973263249344_40-18716-44494.log, [HOST]: Host{address='10.111.15.56:1234', ip='10.111.15.56', port=1234}
[INFO] 2024-04-06 04:00:36.346 +0000 - Begin to pulling task
[INFO] 2024-04-06 04:00:36.347 +0000 - Begin to initialize task
[INFO] 2024-04-06 04:00:36.347 +0000 - Set task startTime: Sat Apr 06 04:00:36 UTC 2024
[INFO] 2024-04-06 04:00:36.348 +0000 - Set task envFile: /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
[INFO] 2024-04-06 04:00:36.348 +0000 - Set task appId: 18716_44494
[INFO] 2024-04-06 04:00:36.348 +0000 - End initialize task
[INFO] 2024-04-06 04:00:36.348 +0000 - Set task status to TaskExecutionStatus{code=1, desc='running'}
[INFO] 2024-04-06 04:00:36.349 +0000 - TenantCode:root check success
[INFO] 2024-04-06 04:00:36.350 +0000 - ProcessExecDir:/tmp/dolphinscheduler/exec/process/root/10511172294464/11973263249344_40/18716/44494 check success
[INFO] 2024-04-06 04:00:36.350 +0000 - get resource file from path:/dolphinscheduler/root/resources/cua/kerberos/cuayilinghsd/cuayilinghsd.keytab
[ERROR] 2024-04-06 04:00:36.367 +0000 - Task execute failed, due to meet an exception
org.apache.dolphinscheduler.plugin.task.api.TaskException: Download resource file: (/cua/kerberos/cuayilinghsd/cuayilinghsd.keytab,root) error
at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.downloadResourcesIfNeeded(TaskExecutionCheckerUtils.java:136)
at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.beforeExecute(WorkerTaskExecuteRunnable.java:216)
at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.run(WorkerTaskExecuteRunnable.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: DestHost:destPort bigdata54.cua.internal:8020 , LocalHost:localPort bigdata56.cua.internal/10.111.15.56:0. Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:842)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:817)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1616)
at org.apache.hadoop.ipc.Client.call(Client.java:1558)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy129.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:910)
at sun.reflect.GeneratedMethodAccessor96.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
at com.sun.proxy.$Proxy130.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1602)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1599)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1614)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:483)
at org.apache.dolphinscheduler.service.storage.impl.HadoopUtils.copyHdfsToLocal(HadoopUtils.java:388)
at org.apache.dolphinscheduler.service.storage.impl.HadoopUtils.download(HadoopUtils.java:309)
at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.downloadResourcesIfNeeded(TaskExecutionCheckerUtils.java:127)
... 9 common frames omitted
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:798)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:752)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:856)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
... 32 common frames omitted
Caused by: javax.security.sasl.SaslException: GSS initiate failed
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:843)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:839)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:839)
... 35 common frames omitted
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:148)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 44 common frames omitted
[INFO] 2024-04-06 04:00:36.371 +0000 - Get a exception when execute the task, will send the task execute result to master, the current task execute result is TaskExecutionStatus{code=6, desc='failure'}
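For anyone triaging a similar trace: the outer TaskException only says the resource download failed, while the actual reason (no Kerberos credentials) sits four causes deep. Below is a minimal sketch of walking the cause chain to detect that case using only JDK classes. The class and method names (KerberosFailureDetector, findGssCause, isMissingCredentials) are hypothetical and are not part of DolphinScheduler or Hadoop; the exceptions built in main only imitate the shape of the chain in the log above.

```java
import java.io.IOException;
import javax.security.sasl.SaslException;
import org.ietf.jgss.GSSException;

// Hypothetical helper, not part of DolphinScheduler or Hadoop.
public class KerberosFailureDetector {

    /** Returns the first GSSException in the cause chain, or null if there is none. */
    public static GSSException findGssCause(Throwable error) {
        int depth = 0;
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (Throwable t = error; t != null && depth < 64; t = t.getCause(), depth++) {
            if (t instanceof GSSException) {
                return (GSSException) t;
            }
        }
        return null;
    }

    /** True if the failure is ultimately a GSS "no valid credentials" error. */
    public static boolean isMissingCredentials(Throwable error) {
        GSSException gss = findGssCause(error);
        // NO_CRED is the major code behind "No valid credentials provided".
        return gss != null && gss.getMajor() == GSSException.NO_CRED;
    }

    public static void main(String[] args) {
        // Imitates the nesting seen in the log: IOException -> SaslException -> GSSException.
        GSSException root = new GSSException(
                GSSException.NO_CRED, 0, "Failed to find any Kerberos tgt");
        Throwable chain = new IOException(
                "Download resource file error",
                new SaslException("GSS initiate failed", root));
        System.out.println(isMissingCredentials(chain));
    }
}
```

Checking the major code rather than matching the message text avoids depending on the exact wording of the "Mechanism level" string.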
How to reproduce
The Kerberos credentials are not renewed in time, or are not renewed at all.
Anything else
No response
Version
3.1.x
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
This will be fixed in 3.3.1.