Is `isUnescapedInURI` written incorrectly?
The documentation for isUnescapedInURI says:
Returns True if the character is allowed unescaped in a URI.
However, its implementation is:
isUnescapedInURI c = isReserved c || isUnreserved c
where isReserved documentation says:
Returns True if the character is a "reserved" character in a URI. To include a literal instance of one of these characters in a component of a URI, it must be escaped.
So, it seems to me that if isReserved returns True, then isUnescapedInURI ought to return False.
This is, essentially, a documentation bug.
The isUnescapedInURI function is answering the question, "Can this appear at all in a URI?" The docstring for it gives an example URI containing non-ASCII characters (with umlauts and such), and those absolutely have to be escaped before they can be included.
The reserved characters like ? are allowed to appear in a finished URI, so both functions return True. But the docstring for isReserved is trying to put you on the right path: If you're forming a URI out of parts, and one of those parts contains a reserved character, you'd better escape it.
The companion function for isUnescapedInURI, namely isUnescapedInURIComponent, is going to be the more useful one: if you are forming a URI out of parts and including arbitrary strings, you should use that one to escape the parts. In fact, I'm not sure what you would use isUnescapedInURI for.
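To make the difference concrete, here is a small sketch that prints what each predicate says about a few sample characters. It assumes the network-uri package is installed; `classify` is a helper name made up for this illustration, not something the library provides.

```haskell
import Network.URI
  ( isReserved
  , isUnreserved
  , isUnescapedInURI
  , isUnescapedInURIComponent
  )

-- Hypothetical helper: pairs a character with the four predicate results,
-- in the order isReserved, isUnreserved, isUnescapedInURI,
-- isUnescapedInURIComponent.
classify :: Char -> (Char, Bool, Bool, Bool, Bool)
classify c =
  ( c
  , isReserved c
  , isUnreserved c
  , isUnescapedInURI c
  , isUnescapedInURIComponent c
  )

main :: IO ()
main = mapM_ (print . classify) "?/a~ü "
```

The reserved characters '?' and '/' pass isUnescapedInURI but fail isUnescapedInURIComponent, the unreserved 'a' and '~' pass both, and 'ü' and the space fail both.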
I'll have a go at improving the docstrings for the isUnescaped functions.
Well, after playing with it a bit more, I realized isUnescapedInURIComponent is rarely what you want, either. It will also encode a slash character, for example, which you usually don't want when forming a path:
>>> URI {
>>> uriScheme = "http:",
>>> uriAuthority = Nothing,
>>> uriPath = escapeURIString isUnescapedInURIComponent "/foo/b?ar/baz", -- you want the question mark escaped
>>> uriQuery = "",
>>> uriFragment = ""
>>> }
http:%2Ffoo%2Fb%3Far%2Fbaz
The result escapes the question mark as desired, but it also escapes the slashes. Slashes in the path would not confuse a parser, so you'd usually want to keep them unescaped.
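One possible workaround, as a sketch rather than anything the library offers: pass escapeURIString a custom predicate that lets slashes through and otherwise defers to isUnescapedInURIComponent. The name `isUnescapedInPath` is made up here, and the approach assumes every slash in the input is meant as a path separator.

```haskell
import Network.URI (escapeURIString, isUnescapedInURIComponent)

-- Hypothetical predicate (not part of network-uri): keep '/' literal,
-- and escape everything that isUnescapedInURIComponent would escape.
isUnescapedInPath :: Char -> Bool
isUnescapedInPath c = c == '/' || isUnescapedInURIComponent c

main :: IO ()
main = putStrLn (escapeURIString isUnescapedInPath "/foo/b?ar/baz")
-- prints /foo/b%3Far/baz
```

If a segment can itself contain a literal slash that must be escaped, this predicate is not enough; you would have to escape each segment separately with isUnescapedInURIComponent and then join the results with "/".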
I will still try to improve the documentation, although I'll be doing some gymnastics to try to make either of these functions sound useful...