httr::parse_url not parsing AWS s3 uri correctly

Open DyfanJones opened this issue 5 years ago • 0 comments

Hi all,

When working with AWS S3 URIs, I have noticed that httr::parse_url doesn't parse them as expected. A standard S3 URI has the form s3://mybucket/path/to/file. When parsed, I expect the result to be:

# $scheme
# s3
# 
# $hostname
# mybucket
# 
# $port
# NULL
#
# $path
# path/to/file
# 

However, I get the following:

s3_uri = "s3://mybucket/path/to/file"

(parsed1 = httr::parse_url(s3_uri))
# $scheme
# NULL
# 
# $hostname
# NULL
# 
# $port
# NULL
# 
# $path
# [1] "s3://mybucket/path/to/file"
# 
# $query
# NULL
# 
# $params
# NULL
# 
# $fragment
# NULL
# 
# $username
# NULL
# 
# $password
# NULL
# 
# attr(,"class")
# [1] "url"

Using urltools::url_parse instead gives the expected result:

parsed2 = urltools::url_parse(s3_uri)
# scheme domain   port path         parameter fragment
# s3     mybucket <NA> path/to/file <NA>      <NA>

Python's urllib gives a similar result:

import urllib.parse

s3_uri = "s3://mybucket/path/to/file"

urllib.parse.urlparse(s3_uri)
# ParseResult(scheme='s3', netloc='mybucket', path='/path/to/file', params='', query='', fragment='')
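
For reference, here is a minimal sketch of how the bucket and key can be pulled out of the urllib result, which is the kind of split I would like to get from httr::parse_url as well. The helper name split_s3_uri is hypothetical and only for illustration; it is not part of urllib or any AWS library.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an S3 URI into (bucket, key). Hypothetical helper for illustration."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri!r}")
    # urlparse puts the bucket in netloc; path keeps its leading slash,
    # so leading slashes are stripped to get the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri("s3://mybucket/path/to/file")
print(bucket, key)
```

Here the bucket corresponds to what httr calls hostname, and the key to path.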

DyfanJones • Jul 08 '20 08:07