Uploading more than once to a path adds it to subfolders

Open: redspart opened this issue 6 years ago • 1 comment

I am not sure whether this is intended; please let me know if it is.

When uploading a directory for the first time, the data is added in the correct spot, i.e. /hdfs-path/sub-folder. However, when adding more data to the same place, it ends up in /hdfs-path/sub-folder/<local_name>/.

If this is not the intended output, I believe the culprit is line 553 (linked below), where hdfs_path and local_name are joined. I removed local_name from the join, and all data was then uploaded directly into hdfs_path without creating any subfolders.

https://github.com/mtth/hdfs/blob/5b40065adbe1a5627b0b513daf13b41c9819a9be/hdfs/client.py#L553

EDIT

Code used:

for p in files:
    file_path = "sub_folder"
    upload_path = "%s/%s" % ("/hdfs-path", "sub_folder")
    client.upload(upload_path, file_path, overwrite=True, n_threads=0)

After a bit more debugging, I found that if the path in HDFS already exists, the name of the local folder the files come from is appended to it. I need the files to be added to the specified directory itself, not to the directory plus a subfolder. To remedy this, I created a new variable called use_existing. When True, the upload uses the HDFS path as given rather than hdfs_path + local_name.
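The behavior described above can be modeled with a small sketch. This is a simplified illustration, not the library's actual code: the function name resolve_upload_target and the remote_is_dir parameter are hypothetical, and use_existing is the proposed option, which does not exist in the library.

```python
import posixpath


def resolve_upload_target(hdfs_path, local_path, remote_is_dir, use_existing=False):
    """Return the remote path an upload would write to, in this simplified model.

    remote_is_dir: whether hdfs_path already exists as a directory (assumed input).
    use_existing:  hypothetical flag; when True, the local name is never appended.
    """
    local_name = posixpath.basename(local_path.rstrip("/"))
    if remote_is_dir and not use_existing:
        # Existing remote directory: nest the local folder inside it,
        # like `cp -r src dst` when dst already exists.
        return posixpath.join(hdfs_path, local_name)
    # Remote path is missing, or the caller asked to reuse it as-is.
    return hdfs_path


# First upload: the remote path does not exist yet.
first = resolve_upload_target("/hdfs-path/sub_folder", "sub_folder", remote_is_dir=False)
# Second upload: the remote path now exists, so the local name is appended.
second = resolve_upload_target("/hdfs-path/sub_folder", "sub_folder", remote_is_dir=True)
# Second upload with the proposed flag: files go into the existing directory.
flat = resolve_upload_target(
    "/hdfs-path/sub_folder", "sub_folder", remote_is_dir=True, use_existing=True
)
print(first, second, flat)
```

The second call is the case reported here: the same arguments produce a different target once the remote directory exists.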

Again let me know if my understanding is off, or you would like a PR with the added variable.

redspart, Aug 06 '19 20:08

Thanks for the detailed report. Your understanding is correct. It is implemented this way to be consistent with local commands:

# In an empty directory
$ mkdir src1 src2
$ cp -r src1 dst # Copies src1 as dst
$ cp -r src2 dst # Copies src2 as dst/src2

As you point out, there is a usability gap though. You can achieve what you are trying to do locally by globbing (cp -r src2/* dst) but there is no equivalent here, at least until https://github.com/mtth/hdfs/issues/105. I think this justifies adding an option; if you send a PR I would be happy to review it.
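Until such an option exists, one possible workaround for the globbing case is to upload each file to an explicit remote file path, so the target is never an existing directory and no nesting takes place. The sketch below assumes only the client.upload(hdfs_path, local_path, ...) call used in the report; the helper name upload_contents is hypothetical, and the sketch has not been run against a real cluster.

```python
import os
import posixpath


def upload_contents(client, hdfs_dir, local_dir, **kwargs):
    """Upload every file under local_dir into hdfs_dir, keeping relative paths.

    Roughly emulates `cp -r local_dir/* hdfs_dir`. Extra keyword arguments
    (for example overwrite=True) are passed through to client.upload.
    Returns the list of remote paths that were targeted.
    """
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        rel_root = os.path.relpath(root, local_dir)
        # Convert the local relative directory into POSIX path components.
        parts = [] if rel_root == os.curdir else rel_root.split(os.sep)
        for name in sorted(files):
            target = posixpath.join(hdfs_dir, *parts, name)
            client.upload(target, os.path.join(root, name), **kwargs)
            uploaded.append(target)
    return uploaded


# Possible usage (the URL is a placeholder):
# from hdfs import InsecureClient
# client = InsecureClient("http://namenode:50070")
# upload_contents(client, "/hdfs-path/sub_folder", "sub_folder", overwrite=True)
```

This issues one upload call per file, so it gives up the directory-level parallelism of n_threads; it trades speed for a predictable destination. Empty local directories are not recreated remotely.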

mtth, Aug 08 '19 02:08