cloudpathlib icon indicating copy to clipboard operation
cloudpathlib copied to clipboard

Add `as_url` method to each CloudPath implementation with ability to generate presigned_urls

Open kabirkhan opened this issue 3 years ago • 1 comments

Overview

Adds an as_url method that returns the https:// address of the object or generates a presigned URL if presign=True is passed.

Based on https://github.com/drivendataorg/cloudpathlib/issues/235

Usage

from cloudpathlib import CloudPath

path = CloudPath("az://tests/test.txt")

print(path.as_url())
# https://myaccount.blob.core.windows.net/tests/text.txt

print(path.as_url(presign=True))
# https://myaccount.blob.core.windows.net/tests/text.txt?se=...[OTHER_SAS_TOKEN_PARAMS]'

print(path.as_url(presign=True, expire_seconds=10))
# URL will now expire in 10 seconds
# https://myaccount.blob.core.windows.net/tests/text.txt?se=...[OTHER_SAS_TOKEN_PARAMS]'

Open Questions

1. What should we call this method? as_url would be fine but there's already the Pathlib as_uri method so it's probably confusing.

Maybe we make as_uri return the default https:// address for the blob instead of the str of the Path. Then we have this method for generating presigned urls be more specific (e.g. get_signed_url) So for the above example:

from cloudpathlib import CloudPath

path = CloudPath("az://tests/test.txt")

print(path.as_uri())
# https://myaccount.blob.core.windows.net/tests/text.txt
# 
# Note: Currently this would return the str of Path so:
# "az://tests/text.txt"

print(path.get_signed_url())
# https://myaccount.blob.core.windows.net/tests/text.txt?se=...[OTHER_SAS_TOKEN_PARAMS]'

kabirkhan avatar Jun 03 '22 00:06 kabirkhan

@kabirkhan Thanks for this, I think the implementation is looking great. A few questions:

(1) Do you know if any of the provider specific SDKs support getting the URL? Might be more reliable than the f-string version, even in the non-presigned case (2) The S3 will need to handle if a custom endpoint url is set (e.g., on a MinIO server) (3) I'm partial to using as_url and having that support the presigning, and also leaving as_uri alone. I do definitely see the point you brought up that it is potentially confusing, but IMO at least for S3 there's enough treatment of the s3://... form as a URI that having both options may provide utility (even if you can get the as_uri with str already).

Other than those qs, this will need:

  • [ ] tests
  • [ ] documentation updates

Also, do you have creds that you can use to test the various live versions of the services? If not, I can share some with you.

pjbull avatar Jun 17 '22 22:06 pjbull

@kabirkhan I know this is a little old, but I'm pushing to get some valuable stale PRs in if possible. Would you have time to wrap the implementation here with the items above?

pjbull avatar Jul 22 '23 17:07 pjbull

Hey @pjbull, I found some time! lol sorry for the super late reply here. I ended up not needing this for my work but it popped in my head that I never finished this and it's legitimately a useful addition.

Finished up the main code changes and this now uses the SDK of each provider to get the public URLs as well so local emulators should work.

Started working on the tests as well but not all the there yet.

kabirkhan avatar Jan 31 '24 21:01 kabirkhan

Awesome, thanks @kabirkhan! I actually had just had to reach into the S3Client and do this with the boto client directly, so excited to get this in.

I'll take a look this weekend.

pjbull avatar Feb 02 '24 23:02 pjbull

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (991704a) 94.3% compared to head (5867217) 93.9%. Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #236     +/-   ##
========================================
- Coverage    94.3%   93.9%   -0.5%     
========================================
  Files          22      22             
  Lines        1560    1615     +55     
========================================
+ Hits         1472    1517     +45     
- Misses         88      98     +10     
Files Coverage Δ
cloudpathlib/azure/azblobclient.py 94.2% <100.0%> (+0.3%) :arrow_up:
cloudpathlib/azure/azblobpath.py 93.3% <100.0%> (+0.6%) :arrow_up:
cloudpathlib/gs/gsclient.py 93.5% <100.0%> (-1.1%) :arrow_down:
cloudpathlib/gs/gspath.py 94.6% <100.0%> (+0.5%) :arrow_up:
cloudpathlib/s3/s3client.py 92.9% <100.0%> (-1.1%) :arrow_down:
cloudpathlib/s3/s3path.py 98.1% <100.0%> (+0.1%) :arrow_up:
cloudpathlib/cloudpath.py 93.3% <66.6%> (-0.2%) :arrow_down:
cloudpathlib/local/localpath.py 90.0% <50.0%> (-4.5%) :arrow_down:
cloudpathlib/client.py 88.1% <66.6%> (-1.5%) :arrow_down:
cloudpathlib/local/localclient.py 94.6% <50.0%> (-2.1%) :arrow_down:

codecov[bot] avatar Feb 15 '24 18:02 codecov[bot]

Thanks @kabirkhan! Moving to local branch to run the live backend tests

pjbull avatar Feb 17 '24 17:02 pjbull