Local file is lost even if the download fails

Open kucukaslan opened this issue 3 years ago • 1 comments

Issue & steps to reproduce

$ echo "Example text." > tmp
$ cat tmp
Example text.
$ s5cmd cp s3://nonexistentbucket/key tmp
ERROR "cp s3://nonexistentbucket/key tmp": InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records. status code: 403, request id: PAYWDH7HHVPAHC8G, host id: cnI/cBhYZKKJ159f8mMs4KHOn1+jYtvfpYnXgMuKMRd9pZ10FBdi/cGuVxyr+iwrKbk2kP7Opx4=
$ cat tmp
cat: tmp: No such file or directory

We would expect the local file to remain untouched, since the download never even started.

Proposed Solution

Instead of creating the file before issuing the download request, s5cmd could use a custom WriterAt that creates the file only when the first write request arrives. The following snippets would need to change: https://github.com/peak/s5cmd/blob/123b1c7fc9c614aa214a6795468fa140a38ad05e/command/cp.go#L511-L517 https://github.com/peak/s5cmd/blob/0431f503d99953e1809bf0e86d73750c0c1f561e/storage/fs.go#L208-L214

Aug 01 '22 15:08 kucukaslan

Ran into a similar issue in https://github.com/peak/s5cmd/issues/493. My concern with the implementation you mentioned is that if the download itself fails partway through (because of a network interruption, etc.), the local file content would still be clobbered. The AWS CLI (among others) gets around this by writing to a temporary file first and doing an (atomic) file rename once the download is complete.

Aug 19 '22 21:08 Chris-Dee