cloudstorage
Utility function for parsing a URI into container, path
I'd like to gauge whether a utility function would be a useful and welcome addition to this library. To use cloudstorage in some of our projects, we've added a helper function that parses blob URLs (S3 and WASB, for example) into a container and a path, for use in code like:
container = storage.get_container(parsed_container)
container.upload_blob(f, blob_name=parsed_path)
If there's any interest, I'd be happy to help implement and expand a helper function like the one below:
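from typing import Tuple
from urllib.parse import urlsplit

from cloudstorage.exceptions import CloudStorageError


def parse_blob_path(blob_path: str) -> Tuple[str, str, str]:
    """Splits a blob URL into a `(protocol, container, path)` tuple."""
    if not blob_path:
        raise CloudStorageError('blob_path cannot be empty')

    scheme, netloc, path, _, _ = urlsplit(blob_path)
    # urlsplit keeps the leading '/' on the path; drop it.
    path_without_leading_slash = path[1:]

    if scheme in ('s3', 's3a', 's3n', 'gs'):
        # The bucket name is the whole network location.
        container = netloc
    elif scheme in ('wasb', 'wasbs'):
        # WASB URLs put '<container>@<storage account host>' in the netloc.
        container, storage_account = netloc.split('@')
    else:
        raise CloudStorageError(f'Unknown scheme {scheme}; which cloud provider is this?')

    return scheme, container, path_without_leading_slash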
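For discussion, here is a slightly hardened variant as a standalone sketch. It is only an illustration, not a proposed final API. The names `BlobLocation` and `parse_blob_url` are hypothetical. `ValueError` stands in for `CloudStorageError` so the snippet runs without the library installed. It uses `lstrip('/')`, which, unlike `path[1:]`, removes every leading slash.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class BlobLocation(NamedTuple):
    """Hypothetical result type; not part of cloudstorage."""
    scheme: str
    container: str
    path: str


def parse_blob_url(blob_url: str) -> BlobLocation:
    """Split a blob URL into (scheme, container, path).

    Raises ValueError here as a stand-in for CloudStorageError.
    """
    if not blob_url:
        raise ValueError('blob_url cannot be empty')

    parts = urlsplit(blob_url)
    scheme, netloc = parts.scheme, parts.netloc

    if scheme in ('s3', 's3a', 's3n', 'gs'):
        # The bucket name is the whole network location.
        container = netloc
    elif scheme in ('wasb', 'wasbs'):
        # WASB netloc looks like '<container>@<storage account host>'.
        container, sep, _account = netloc.partition('@')
        if not sep:
            raise ValueError(f'Expected container@account in {netloc!r}')
    else:
        raise ValueError(f'Unknown scheme {scheme!r}')

    # Drop the leading slash(es) so the path can be used as a blob name.
    return BlobLocation(scheme, container, parts.path.lstrip('/'))


# Example calls with made-up bucket, container and account names.
print(parse_blob_url('s3://my-bucket/data/file.csv'))
print(parse_blob_url('wasbs://logs@myaccount.blob.core.windows.net/2019/app.log'))
```

Returning a named tuple keeps plain tuple unpacking working while also allowing `result.container` and `result.path` at call sites.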
Thoughts?
Go ahead and implement it since it sounds like it would definitely benefit other users.