deeplake
[FEATURE] Skip already transferred files when restarting with hub.copy
🚨🚨 Feature Request
- Related to #1609 (closed)
Is your feature request related to a problem?
When hub.copy throws an exception, the transfer is terminated. Rerunning the same call prompts you to pass overwrite=True, which forces a re-download of all the data and ignores the chunks that have already been transferred.
# Fails for some reason (QuotaExceeded / dropped connection / etc.)
hub.copy('hub://activeloop/imagenet-val', 's3://foobar/imagenet-val', dest_creds={ ... })
# Restart prompts to run with overwrite=True
hub.copy('hub://activeloop/imagenet-val', 's3://foobar/imagenet-val', dest_creds={ ... }, overwrite=True)
Description of the possible solution
When a transfer is restarted, validate the chunks that already exist at the destination and skip any that are complete, so only missing or partial chunks are transferred again.
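To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not how hub.copy works internally. It assumes a hypothetical manifest mapping each chunk key to a SHA-256 checksum, a hypothetical fetch callable that downloads one chunk from the source, and a destination that behaves like a dict of key to bytes. All of these names are made up for illustration.

```python
import hashlib
from typing import Callable, List, Mapping, MutableMapping, Tuple


def resume_copy(
    manifest: Mapping[str, str],          # hypothetical: chunk key -> expected SHA-256 hex digest
    fetch: Callable[[str], bytes],        # hypothetical: downloads one chunk from the source
    dest: MutableMapping[str, bytes],     # hypothetical: dict-like destination store
) -> Tuple[List[str], List[str]]:
    """Copy chunks to dest, skipping any that are already present and valid."""
    copied: List[str] = []
    skipped: List[str] = []
    for key, expected in manifest.items():
        existing = dest.get(key)
        # A chunk counts as complete only if its checksum matches the manifest,
        # so partially written chunks are transferred again.
        if existing is not None and hashlib.sha256(existing).hexdigest() == expected:
            skipped.append(key)
            continue
        dest[key] = fetch(key)
        copied.append(key)
    return copied, skipped
```

With something like this, a restart after a dropped connection would only fetch the chunks that are missing or corrupted. A real object store would probably compare sizes or ETags instead of reading each destination chunk back to hash it.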
Really appreciate this, @JossWhittle. Let us know if you have any other feature suggestions or feedback in the meantime!
No worries, sorry I missed a title.