[Bug] [worker-server] Failed to find any Kerberos tgt

Open · StarGods opened this issue 1 year ago · 1 comment

Search before asking

  • [X] I have searched the issues and found no similar issues.

What happened

Related: #14439

I found related issues in the issue list, but they do not appear to be resolved.

I am using version 3.1.9.

What you expected to happen



[LOG-PATH]: /home/dolphinscheduler/dolphinscheduler/worker-server/logs/20240406/11973263249344_40-18716-44494.log, [HOST]:  Host{address='10.111.15.56:1234', ip='10.111.15.56', port=1234}
[INFO] 2024-04-06 04:00:36.346 +0000 - Begin to pulling task
[INFO] 2024-04-06 04:00:36.347 +0000 - Begin to initialize task
[INFO] 2024-04-06 04:00:36.347 +0000 - Set task startTime: Sat Apr 06 04:00:36 UTC 2024
[INFO] 2024-04-06 04:00:36.348 +0000 - Set task envFile: /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
[INFO] 2024-04-06 04:00:36.348 +0000 - Set task appId: 18716_44494
[INFO] 2024-04-06 04:00:36.348 +0000 - End initialize task
[INFO] 2024-04-06 04:00:36.348 +0000 - Set task status to TaskExecutionStatus{code=1, desc='running'}
[INFO] 2024-04-06 04:00:36.349 +0000 - TenantCode:root check success
[INFO] 2024-04-06 04:00:36.350 +0000 - ProcessExecDir:/tmp/dolphinscheduler/exec/process/root/10511172294464/11973263249344_40/18716/44494 check success
[INFO] 2024-04-06 04:00:36.350 +0000 - get resource file from path:/dolphinscheduler/root/resources/cua/kerberos/cuayilinghsd/cuayilinghsd.keytab
[ERROR] 2024-04-06 04:00:36.367 +0000 - Task execute failed, due to meet an exception
org.apache.dolphinscheduler.plugin.task.api.TaskException: Download resource file: (/cua/kerberos/cuayilinghsd/cuayilinghsd.keytab,root) error
	at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.downloadResourcesIfNeeded(TaskExecutionCheckerUtils.java:136)
	at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.beforeExecute(WorkerTaskExecuteRunnable.java:216)
	at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.run(WorkerTaskExecuteRunnable.java:170)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: DestHost:destPort bigdata54.cua.internal:8020 , LocalHost:localPort bigdata56.cua.internal/10.111.15.56:0. Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:842)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:817)
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1616)
	at org.apache.hadoop.ipc.Client.call(Client.java:1558)
	at org.apache.hadoop.ipc.Client.call(Client.java:1455)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
	at com.sun.proxy.$Proxy129.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:910)
	at sun.reflect.GeneratedMethodAccessor96.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
	at com.sun.proxy.$Proxy130.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1602)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1599)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1614)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:483)
	at org.apache.dolphinscheduler.service.storage.impl.HadoopUtils.copyHdfsToLocal(HadoopUtils.java:388)
	at org.apache.dolphinscheduler.service.storage.impl.HadoopUtils.download(HadoopUtils.java:309)
	at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.downloadResourcesIfNeeded(TaskExecutionCheckerUtils.java:127)
	... 9 common frames omitted
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:798)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:752)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:856)
	at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
	at org.apache.hadoop.ipc.Client.call(Client.java:1502)
	... 32 common frames omitted
Caused by: javax.security.sasl.SaslException: GSS initiate failed
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
	at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:843)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:839)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:839)
	... 35 common frames omitted
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:148)
	at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
	at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
	at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
	... 44 common frames omitted
[INFO] 2024-04-06 04:00:36.371 +0000 - Get a exception when execute the task, will send the task execute result to master, the current task execute result is TaskExecutionStatus{code=6, desc='failure'}

How to reproduce

The Kerberos credentials are not renewed in time, or are not renewed at all.
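
For illustration only, here is a minimal sketch of the kind of periodic re-login that would keep a ticket from expiring in a long-running process. `KerberosRefresher` and its methods are hypothetical names, not part of DolphinScheduler or Hadoop, and this is not necessarily how the project addresses the problem. The re-login action is passed in as a `Runnable` so the sketch has no Hadoop dependency.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (an assumption for this sketch), not a DolphinScheduler or Hadoop class.
final class KerberosRefresher implements AutoCloseable {
    private final Runnable relogin;
    private final AtomicInteger failures = new AtomicInteger();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "kerberos-refresher");
            t.setDaemon(true); // do not keep the JVM alive just for refreshes
            return t;
        });

    KerberosRefresher(Runnable relogin) {
        this.relogin = relogin;
    }

    /** Runs the re-login action once; returns false and counts a failure if it throws. */
    boolean refreshOnce() {
        try {
            relogin.run();
            return true;
        } catch (RuntimeException e) {
            // Swallow the exception so a single failed refresh does not cancel the schedule.
            failures.incrementAndGet();
            return false;
        }
    }

    /** Schedules refreshOnce() at a fixed rate, starting immediately. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::refreshOnce, 0, period, unit);
    }

    int failureCount() {
        return failures.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In a Hadoop client, the action passed to the constructor could wrap `UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab()`, rethrowing its `IOException` as an unchecked exception. The refresh period would need to be shorter than the ticket lifetime configured on the KDC.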

Anything else

No response

Version

3.1.x

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

StarGods · Apr 09 '24 09:04

This will be fixed in 3.3.1.

wuzhenhua01 · Aug 05 '25 14:08