Issue 33342: urllib IPv6 parsing fails with special characters in passwords
Created on 2018-04-23 13:44 by benaryorg, last changed 2022-04-11 14:58 by admin.
Messages (7) | ||
---|---|---|
msg315668 - (view) | Author: benaryorg (benaryorg) | Date: 2018-04-23 13:44 |
The documentation says the module follows RFC 2396 (https://tools.ietf.org/html/rfc2396.html), but urllib.parse.urlsplit (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlsplit) fails to parse a user:password@host URL when the password contains a '[' character. This is because the urlsplit code does not strip the userinfo part (everything from index 0 up to and including the last '@') before checking whether the hostname contains '[' to detect an IPv6 address (https://github.com/python/cpython/blob/8a6f4b4bba950fb8eead1b176c58202d773f2f70/Lib/urllib/parse.py#L416-L418). | ||
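A minimal sketch of the check described above, assuming the bracket test is applied only to what follows the last '@'. The helper name `check_host_brackets` is hypothetical and is not part of urllib; this is an illustration of the idea, not the code from any patch.

```python
def check_host_brackets(netloc):
    """Hypothetical helper (not in urllib): validate '[' / ']' in the host part only."""
    # Drop everything up to and including the last '@' (the user:password part).
    _userinfo, _sep, hostport = netloc.rpartition("@")
    # An IPv6 literal needs both brackets; exactly one of them is an error.
    if ("[" in hostport) != ("]" in hostport):
        raise ValueError("Invalid IPv6 URL")
    return hostport


print(check_host_brackets("user:[@host"))       # host
print(check_host_brackets("user:pw@[::1]:80"))  # [::1]:80
```

With this ordering a '[' in the password no longer influences the IPv6 check, while an unbalanced bracket in the host part is still rejected.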
msg317119 - (view) | Author: Martin Panter (martin.panter) | Date: 2018-05-19 13:49 |
I presume this is about parsing a URL like

    >>> urlsplit("//user:[@host")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/proj/python/cpython/Lib/urllib/parse.py", line 431, in urlsplit
        raise ValueError("Invalid IPv6 URL")
    ValueError: Invalid IPv6 URL

Ideally the square bracket should be escaped as %5B. Related reports about parsing unescaped delimiters in a URL password are Issue 18140 (fragment #, query ?) and Issue 23328 (slash /). | ||
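As a sketch of the escaping suggested here, the caller can percent-encode the password with `urllib.parse.quote` before building the URL. The password value `p[ss` is made up for illustration.

```python
from urllib.parse import quote, urlsplit

password = "p[ss"  # illustrative value containing an unescaped '['
# safe="" so that every reserved character, including "/", is percent-encoded.
encoded = quote(password, safe="")
url = f"//user:{encoded}@host"

parts = urlsplit(url)
print(encoded)         # p%5Bss
print(parts.hostname)  # host
print(parts.password)  # p%5Bss (urlsplit does not decode it)
```

Note that the `password` attribute comes back still percent-encoded, so the caller has to decode it separately.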
msg327239 - (view) | Author: Thomas Jollans (tjollans) | Date: 2018-10-06 09:43 |
RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate) | ||
msg334273 - (view) | Author: Terrence Brannon (metaperl) | Date: 2019-01-23 21:37 |
I would like to add to this bug - the password field on the URL cannot contain a pound sign or question mark or the parser incorrectly parses the URL, as this gist demonstrates - https://gist.github.com/metaperl/fc6f43bf6b9a9f874b8f27e29695e68c | ||
msg334302 - (view) | Author: Terrence Brannon (metaperl) | Date: 2019-01-24 15:55 |
Also note, if SQLAlchemy gives any guidance, then note that SA unquotes both the username and password of the URL: https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/engine/url.py#L274 | ||
msg334303 - (view) | Author: Terrence Brannon (metaperl) | Date: 2019-01-24 15:59 |
Regarding "RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate)" I would say that a square bracket CAN be used in the password, but that it should be urlencoded and that this library should perform a urldecode for both username and password, just as SQLAlchemy does. | ||
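Until the library does this itself, the decoding step can be done by the caller. The following is only a sketch: `split_credentials` is a hypothetical helper, and the host `example.com` and the password are made-up values.

```python
from urllib.parse import quote, unquote, urlsplit


def split_credentials(url):
    """Hypothetical helper: return (username, password) with percent-escapes decoded."""
    parts = urlsplit(url)
    # username/password are None when the URL carries no credentials.
    username = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return username, password


raw_password = "a[b#c?d"  # contains '[', '#' and '?', all of which must be escaped
url = "https://user:" + quote(raw_password, safe="") + "@example.com/path"
print(split_credentials(url))  # ('user', 'a[b#c?d')
```

Because the password is escaped before the URL is assembled, the '#' and '?' are not mistaken for the fragment and query delimiters, and the original value is recovered after parsing.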
msg354745 - (view) | Author: STINNER Victor (vstinner) | Date: 2019-10-15 17:08 |
I modified my PR 16780 to also fix this issue; the PR was originally written for bpo-36338. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:59 | admin | set | github: 77523 |
2019-10-15 17:08:55 | vstinner | set | messages: + |
2019-10-15 16:24:08 | xtreak | set | nosy: + vstinner |
2019-01-24 15:59:49 | metaperl | set | messages: + |
2019-01-24 15:55:55 | metaperl | set | messages: + |
2019-01-23 21:37:03 | metaperl | set | nosy: + metaperl; messages: + |
2018-10-06 09:43:45 | tjollans | set | nosy: + tjollans; messages: + |
2018-05-19 13:49:36 | martin.panter | set | nosy: + martin.panter; messages: + |
2018-04-23 13:44:45 | benaryorg | set | type: behavior |
2018-04-23 13:44:30 | benaryorg | create |