httr::parse_url does not parse AWS S3 URIs correctly
Hi all,
When working with AWS S3 URIs, I noticed that httr::parse_url does not parse them as expected. A standard S3 URI has the format s3://mybucket/path/to/file. When parsed, I expect the result to be:
# $scheme
# s3
#
# $hostname
# mybucket
#
# $port
# NULL
#
# $path
# path/to/file
#
However, I get the following:
s3_uri = "s3://mybucket/path/to/file"
(parsed1 = httr::parse_url(s3_uri))
# $scheme
# NULL
#
# $hostname
# NULL
#
# $port
# NULL
#
# $path
# [1] "s3://mybucket/path/to/file"
#
# $query
# NULL
#
# $params
# NULL
#
# $fragment
# NULL
#
# $username
# NULL
#
# $password
# NULL
#
# attr(,"class")
# [1] "url"
urltools::url_parse, by contrast, splits the URI into the expected components:
parsed2 = urltools::url_parse(s3_uri)
# scheme domain port path parameter fragment
# s3 mybucket <NA> path/to/file <NA> <NA>
Python's urllib behaves similarly:
import urllib.parse
s3_uri = "s3://mybucket/path/to/file"
urllib.parse.urlparse(s3_uri)
# ParseResult(scheme='s3', netloc='mybucket', path='/path/to/file', params='', query='', fragment='')
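For anyone who needs the components in the meantime, here is a minimal base R sketch of a workaround. It assumes the URI always has the simple form scheme://host/path, and the helper name parse_simple_uri is hypothetical (it is not part of httr or urltools). The scheme pattern follows RFC 3986, which allows digits after the first letter, so "s3" is a valid scheme.

```r
# Hypothetical helper, not part of httr: split a "scheme://host/path" URI
# into its components using a base R regular expression.
parse_simple_uri <- function(uri) {
  # RFC 3986 scheme: a letter followed by letters, digits, "+", "." or "-".
  # Host: everything up to the next "/", "?" or "#".
  # Path: everything after the separating "/" up to "?" or "#".
  pattern <- "^([A-Za-z][A-Za-z0-9+.-]*)://([^/?#]*)/?([^?#]*)"
  m <- regmatches(uri, regexec(pattern, uri))[[1]]
  if (length(m) == 0) {
    stop("not a scheme://host/path URI: ", uri)
  }
  # m[1] is the full match; m[2], m[3] and m[4] are the capture groups.
  list(scheme = m[2], hostname = m[3], path = m[4])
}

parsed <- parse_simple_uri("s3://mybucket/path/to/file")
str(parsed)
```

This deliberately ignores ports, credentials, query strings and fragments, so it is only a stopgap for plain bucket/key URIs, not a replacement for a full URL parser.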