
HIVE-28473: INSERT OVERWRITE LOCAL DIRECTORY writes staging files to wrong HDFS directory

Open liangyu-1 opened this issue 1 year ago • 2 comments

What changes were proposed in this pull request?

I modified org.apache.hadoop.hive.ql.parse.SemanticAnalyzer#createFileSinkDesc.

Why are the changes needed?

As described in HIVE-28473.

When I run the SQL: INSERT OVERWRITE LOCAL DIRECTORY '/path/to/local' select key, value from src group by key, value

Hive tries to create a new staging directory 'hdfs://path/to/local/.hive-staging-sessionID-xxxx' on HDFS. Since the user does not have sufficient permissions to create this directory, Hive throws an exception: RuntimeException: cannot create staging directory "hdfs:/path/to/local/dir/.hive-staging-xx": Permission denied: user=aaa, access=WRITE, inode="/":hdfs:hdfs:drwxr-xr-x
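
To illustrate the failure mode, here is a standalone sketch (not Hive's actual code; the namenode URI and the URI-resolution call are illustrative assumptions): deriving the staging path from the destination by resolving it against the default filesystem turns a LOCAL destination into an HDFS path directly under the root.

import java.net.URI;

// Standalone illustration of the failure mode; not Hive's actual code.
public class StagingPathSketch {
  public static void main(String[] args) {
    // Destination from: INSERT OVERWRITE LOCAL DIRECTORY '/path/to/local' ...
    String localDest = "/path/to/local";
    // The default filesystem is HDFS (the URI is illustrative).
    URI defaultFs = URI.create("hdfs://namenode:8020/");
    // Resolving the local path against the default FS yields an HDFS path.
    URI staging = defaultFs.resolve(localDest + "/.hive-staging");
    // Prints hdfs://namenode:8020/path/to/local/.hive-staging
    System.out.println(staging);
    // Creating that directory requires WRITE on "/" on HDFS, which is
    // exactly the Permission denied error above.
  }
}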

I modified org.apache.hadoop.hive.ql.parse.SemanticAnalyzer#createFileSinkDesc to fix this bug.

Does this PR introduce any user-facing change?

No

Is the change a dependency upgrade?

No

How was this patch tested?

I modified the qfile test ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q.

To reproduce with this qfile test: first, modify ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q and uncomment the line --set mapreduce.framework.name=yarn; (remove the leading -- so the setting takes effect).

Second, run the qfile test and search the output for the keyword "New scratch dir is". Seeing it confirms the fix: Hive now resolves the correct staging directory path for INSERT OVERWRITE LOCAL DIRECTORY.

Without the change to org.apache.hadoop.hive.ql.parse.SemanticAnalyzer#createFileSinkDesc, the keyword "New scratch dir is" never appears, because Hive generates the staging directory path from the "local dest path", which does not exist on HDFS.

liangyu-1 avatar Aug 26 '24 09:08 liangyu-1

@zhangbutao Hi, I have just modified my code and it passed the CI. Could you please help me check this again? Thanks!

liangyu-1 avatar Sep 20 '24 02:09 liangyu-1

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

github-actions[bot] avatar Nov 27 '24 00:11 github-actions[bot]

@ayushtkn @zhangbutao Hi, can you please help me check this issue? I don't know how to write unit tests for this case, but it really happens in production environments.

liangyu-1 avatar Dec 16 '24 03:12 liangyu-1

I just tested your PR on my local Hive cluster (built from the latest master branch):

INSERT OVERWRITE LOCAL DIRECTORY "/path/to/local/dir" select * from testdb.tbl;

I found that at the end of the Move stage there is still a permission exception, because Hive tries to create the HDFS directory /path/to/local/dir:

org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=WRITE, inode="/":root:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:506)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:346)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:370)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:240)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1943)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1927)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1886)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3441)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1167)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:742)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
        at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2509)
        at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2483)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1498)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1495)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1512)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1487)
        at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2496)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:443)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
        at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354)
        at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327)
        at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244)
        at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:346)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:190)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:143)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:138)
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:190)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:234)
        at org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:88)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:354)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


zhangbutao avatar Dec 18 '24 04:12 zhangbutao

I changed MoveTask.java to skip creating the HDFS directory /path/to/local/dir when targetPath is local. You can update your PR like this:

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
index 2721977d6f..6ab137b418 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
@@ -438,9 +438,11 @@ public int execute() {
             }
           }
           else {
-            FileSystem targetFs = targetPath.getFileSystem(conf);
-            if (!targetFs.exists(targetPath.getParent())){
-              targetFs.mkdirs(targetPath.getParent());
+            if (lfd.getIsDfsDir()) {
+              FileSystem targetFs = targetPath.getFileSystem(conf);
+              if (!targetFs.exists(targetPath.getParent())) {
+                targetFs.mkdirs(targetPath.getParent());
+              }
             }
             moveFile(sourcePath, targetPath, lfd.getIsDfsDir());
           }
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
index 5f81157124..04ddb857c0 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
@@ -8468,6 +8468,7 @@ private FileSinkDesc createFileSinkDesc(String dest, TableDesc table_desc,
                                           RowSchema fsRS, boolean canBeMerged, Table dest_tab, boolean isMmCtas,
                                           Integer dest_type, QB qb, boolean isDirectInsert, AcidUtils.Operation acidOperation, String moveTaskId) throws SemanticException {
     boolean isInsertOverwrite = false;
+    boolean isLocal = false;
     Context.Operation writeOperation = getWriteOperation(dest);
     switch (dest_type) {
     case QBMetaData.DEST_PARTITION:
@@ -8492,6 +8493,7 @@ private FileSinkDesc createFileSinkDesc(String dest, TableDesc table_desc,

       break;
     case QBMetaData.DEST_LOCAL_FILE:
+      isLocal = true;
     case QBMetaData.DEST_DFS_FILE:
       //CTAS path or insert into file/directory
       break;
@@ -8545,7 +8547,12 @@ private FileSinkDesc createFileSinkDesc(String dest, TableDesc table_desc,
     fileSinkDesc.setStatsAggPrefix(fileSinkDesc.getDirName().toString());
     if (!destTableIsMaterialization &&
         HiveConf.getVar(conf, HIVE_STATS_DBCLASS).equalsIgnoreCase(StatDB.fs.name())) {
-      String statsTmpLoc = ctx.getTempDirForInterimJobPath(dest_path).toString();
+      String statsTmpLoc;
+      if (isLocal) {
+        statsTmpLoc = ctx.getMRTmpPath().toString();
+      } else {
+        statsTmpLoc = ctx.getTempDirForInterimJobPath(dest_path).toString();
+      }
       fileSinkDesc.setStatsTmpDir(statsTmpLoc);
       LOG.debug("Set stats collection dir : " + statsTmpLoc);
     }

zhangbutao avatar Dec 18 '24 06:12 zhangbutao

BTW, this issue is related to HDFS permissions, which are difficult to reproduce in unit tests, so you don't need to write unit tests for this PR.

zhangbutao avatar Dec 18 '24 06:12 zhangbutao

@zhangbutao Thanks for your help. I have just fixed the code, and this time it solves the problem.

liangyu-1 avatar Dec 19 '24 01:12 liangyu-1