hyperlink
Unable to parse `http://www.test.com/BMF%20Ver%F6ffentlichungen?`
It seems the parse function raises an error for this URL: http://www.test.com/BMF%20Ver%F6ffentlichungen?
Logs:
>>> import hyperlink
>>> hyperlink.parse("http://www.test.com/BMF%20Ver%F6ffentlichungen?")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/damien/dd2/.venv/lib/python3.9/site-packages/hyperlink/_url.py", line 2447, in parse
    dec_url = DecodedURL(enc_url, lazy=lazy)
  File "/home/damien/dd2/.venv/lib/python3.9/site-packages/hyperlink/_url.py", line 2046, in __init__
    self.host, self.userinfo, self.path, self.query, self.fragment
  File "/home/damien/dd2/.venv/lib/python3.9/site-packages/hyperlink/_url.py", line 2177, in path
    [
  File "/home/damien/dd2/.venv/lib/python3.9/site-packages/hyperlink/_url.py", line 2178, in <listcomp>
    _percent_decode(p, raise_subencoding_exc=True)
  File "/home/damien/dd2/.venv/lib/python3.9/site-packages/hyperlink/_url.py", line 766, in _percent_decode
    return unquoted_bytes.decode(subencoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 7: invalid start byte
FYI @37b
Hi Damien! By default, hyperlink decodes percent-encoded text as UTF-8, and the %F6 in your URL is not valid UTF-8 (it appears to be a Latin-1-encoded ö). You can pass the decoded=False parameter to get a result:
>>> hyperlink.parse('http://www.test.com/BMF%20Ver%F6ffentlichungen', decoded=False)
URL.from_text('http://www.test.com/BMF%20Ver%F6ffentlichungen')
This approach gives you a URL with mostly the same interface as a DecodedURL (the default output of parse), but be aware that you may run into issues when trying to treat parts of that URL as text vs bytes. Hope this helps!
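If decoded=False turns out to be awkward, another option is to normalize the URL before parsing. The following is only a rough sketch using the standard library, not something hyperlink provides: the helper name reencode_path_as_utf8 and the choice of Latin-1 as the fallback encoding are assumptions made for illustration. It also treats the path as a single string, so a percent-encoded slash (%2F) would come out as a literal slash.

```python
from urllib.parse import quote, unquote_to_bytes, urlsplit, urlunsplit


def reencode_path_as_utf8(url: str, legacy_encoding: str = "latin-1") -> str:
    """Re-encode a URL path so its percent-escapes are valid UTF-8.

    Hypothetical helper: assumes any path that is not valid UTF-8 was
    percent-encoded from `legacy_encoding` (Latin-1 here, as a guess).
    """
    parts = urlsplit(url)
    # Turn the percent-escapes back into raw bytes, e.g. %F6 -> b"\xf6".
    raw = unquote_to_bytes(parts.path)
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so fall back to the assumed legacy encoding.
        text = raw.decode(legacy_encoding)
    # quote() percent-encodes non-ASCII characters as UTF-8 by default.
    return urlunsplit(parts._replace(path=quote(text, safe="/")))


fixed = reencode_path_as_utf8("http://www.test.com/BMF%20Ver%F6ffentlichungen")
print(fixed)  # http://www.test.com/BMF%20Ver%C3%B6ffentlichungen
```

The re-encoded URL should then go through hyperlink.parse with the default settings, since %C3%B6 is the UTF-8 encoding of ö. Keep in mind that this changes the URL on the wire, so it is only appropriate if the server accepts UTF-8 paths.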
@mahmoud thanks, we are investigating whether we can use the decoded flag.