s3fs

Question about _put_file/upload

Open troychiu opened this issue 2 years ago • 2 comments

Hi community, I have a question about _put_file. In the multipart upload path of _put_file, I noticed that we send one request at a time instead of issuing all the requests concurrently and then awaiting them together. The latter seems like it would be faster, in my opinion. (I think the same idea could also be applied to _get_file/download.) Is there a reason why we do it this way? I would be happy to discuss and contribute. Thank you!
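To make the two approaches concrete, here is a minimal sketch using plain asyncio. It does not use s3fs or any real S3 call: `upload_part` is a hypothetical stand-in that only sleeps to simulate network latency, and the function names and the 0.05 s delay are assumptions for illustration.

```python
import asyncio
import time


async def upload_part(part_number: int, data: bytes, delay: float = 0.05) -> dict:
    """Hypothetical stand-in for one multipart-upload request (not an s3fs API)."""
    await asyncio.sleep(delay)  # simulate network latency only
    return {"PartNumber": part_number, "Size": len(data)}


async def upload_sequential(chunks):
    """Send one part at a time, waiting for each before starting the next."""
    parts = []
    for i, chunk in enumerate(chunks, start=1):
        parts.append(await upload_part(i, chunk))
    return parts


async def upload_concurrent(chunks):
    """Start every part at once and await them together."""
    coros = [upload_part(i, chunk) for i, chunk in enumerate(chunks, start=1)]
    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*coros)


async def timed(upload_fn, chunks):
    """Return (parts, elapsed_seconds) for one of the upload functions above."""
    start = time.perf_counter()
    parts = await upload_fn(chunks)
    return parts, time.perf_counter() - start
```

With simulated latency the concurrent version finishes in roughly the time of one request, but that says nothing about a real upload, where bandwidth rather than latency may be the limit. Note also that the concurrent version needs every chunk in memory at once.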

troychiu avatar Sep 21 '23 18:09 troychiu

It would be reasonable to write multiple pieces asynchronously. However, the pieces are necessarily large (otherwise a single call is enough), so I very much doubt there would be any time to be saved. The async route will require much more memory, since it would load many blocks of data at once.
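One way to limit the memory cost would be to cap the number of parts in flight. The following is only a sketch under assumptions, not s3fs code: `upload_part` is a hypothetical placeholder that sleeps instead of sending anything, and `upload_bounded` and `max_in_flight` are made-up names.

```python
import asyncio
import io


async def upload_part(part_number: int, data: bytes) -> dict:
    """Hypothetical placeholder for one part upload (not an s3fs API)."""
    await asyncio.sleep(0.01)  # simulate network latency only
    return {"PartNumber": part_number, "Size": len(data)}


async def upload_bounded(fileobj, chunksize: int, max_in_flight: int = 4):
    """Upload parts concurrently while holding at most max_in_flight chunks."""
    sem = asyncio.Semaphore(max_in_flight)
    tasks = []

    async def send(part_number, data):
        try:
            return await upload_part(part_number, data)
        finally:
            sem.release()  # free a slot so the next chunk can be read

    part_number = 0
    while True:
        await sem.acquire()  # wait for a free slot before reading more data
        data = fileobj.read(chunksize)  # blocking read, kept simple for the sketch
        if not data:
            sem.release()
            break
        part_number += 1
        tasks.append(asyncio.create_task(send(part_number, data)))
    return await asyncio.gather(*tasks)
```

For example, `asyncio.run(upload_bounded(io.BytesIO(payload), chunksize, max_in_flight=2))` would keep no more than two chunks loaded at a time. Peak memory is then about `max_in_flight * chunksize` rather than the whole file.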

Would you mind doing a speed-test to prove that async operation is clearly better?

martindurant avatar Sep 21 '23 18:09 martindurant

Yes, I agree with your point. We would only see a performance improvement when the pieces are large enough. Also, the async route would require more memory at any one time. I can try to implement it and test the performance, and then we can discuss further :) By the way, what would be the typical way to test upload/download speed? Should I test directly against S3, or are there better ways?

troychiu avatar Sep 21 '23 18:09 troychiu