Issue 12910: urllib.quote quotes too many chars, e.g., '()'

Issue 16285 updated the urllib.parse.quote() reserved list to add '~'.

From the docstring:

def quote(string, safe='/', encoding=None, errors=None):
"""quote('abc def') -> 'abc%20def'

Each part of a URL, e.g. the path info, the query, etc., has a
different set of reserved characters that must be quoted.

RFC 3986 Uniform Resource Identifiers (URI): Generic Syntax lists
the following reserved characters.

reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
              "$" | "," | "~"

Each of these characters is reserved in some component of a URL,
but not necessarily in all of them.

Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
Now, "~" is included in the set of reserved characters.

However, looking at RFC 3986 (https://tools.ietf.org/html/rfc3986), Appendix A has the following:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved   = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
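
For reference, here is a rough way to check how quote() treats each of these RFC 3986 character classes. This is only a sketch and assumes Python 3.7 or later; the helper left_unquoted() and the constant names are made up for illustration and are not part of urllib.

```python
from urllib.parse import quote

# Character classes copied from RFC 3986, Appendix A (ALPHA/DIGIT omitted).
UNRESERVED_PUNCT = "-._~"
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="


def left_unquoted(chars, safe="/"):
    """Return the characters in `chars` that quote() passes through unchanged.

    Hypothetical helper for this comment only; `safe` is forwarded to quote().
    """
    return "".join(c for c in chars if quote(c, safe=safe) == c)


if __name__ == "__main__":
    # With the default safe='/', on Python 3.7+:
    print(left_unquoted(UNRESERVED_PUNCT))  # -._~
    print(left_unquoted(GEN_DELIMS))        # /
    print(left_unquoted(SUB_DELIMS))        # (empty: every sub-delim is quoted)

    # The parentheses from the issue title are percent-encoded by default...
    print(quote("f(x)~1"))                  # f%28x%29~1
    # ...unless the caller lists them in `safe`.
    print(quote("f(x)~1", safe="/()"))      # f(x)~1
```

So under these assumptions "~" is passed through like the other unreserved characters, while "(" and ")" are still quoted unless they are passed in safe.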


Should the missing characters be added, or should this issue be closed if they aren't going to be?

Thanks.