Issue 35041: urllib.parse.quote safe Parameter Not Optional (original) (raw)

Created on 2018-10-22 02:25 by william.ayd, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (8)
msg328229 - (view)	Author: (william.ayd) *	Date: 2018-10-22 02:25
The safe parameter in urllib.parse.quote is documented as optional. However, the following will raise TypeError: 'NoneType' object is not iterable: urllib.parse.quote("/", safe=None) whereas explicitly providing an iterable will allow the function to succeed: urllib.parse.quote("/", safe=[])
msg328230 - (view)	Author: Ammar Askar (ammar2) *	Date: 2018-10-22 02:37
The documentation also goes on to say that its default value is '/'. It is optional in the sense that the function can be called without providing it. It doesn't mean that `None` is somehow a valid type for it. Consider the open function for example: https://docs.python.org/3/library/functions.html#open The documentation says that, "mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' " This doesn't mean you can just go off and do `open("file.txt", None)`
msg328231 - (view)	Author: (william.ayd) *	Date: 2018-10-22 02:57
Semantics aside is it still the intended behavior that these calls should work: urllib.parse.quote("/", safe='') AND urllib.parse.quote("/", safe=[]) But that this should raise? urllib.parse.quote("/", safe=None) IMO seems counterintuitive
msg328232 - (view)	Author: Ammar Askar (ammar2) *	Date: 2018-10-22 03:11
I would say so since the documentation says: safe parameter specifies additional ASCII characters None isn't really a set of ASCII characters but the empty string and empty list are.
msg328233 - (view)	Author: Ammar Askar (ammar2) *	Date: 2018-10-22 03:21
Does rewording that prior part to "safe parameter specifies an iterable of additional ASCII characters" make it less confusing?
msg328234 - (view)	Author: (william.ayd) *	Date: 2018-10-22 03:28
Hmm well I still personally feel that the implementation is somewhat off mores than the documentation. Specifically I think it is confusing that it accepts an empty iterable but not one containing elements. This is fine: urllib.parse.quote("/", safe=[]) Though this isn't: urllib.parse.quote("/", safe=['/']) Even though the following two calls are fine (though with different return values as expected): urllib.parse.quote("/", safe='') urllib.parse.quote("/", safe='/') It might go against the spirit of duck typing but I find it very nuanced that empty iterables are allowed but if non-empty it must be a string. Would it not make more sense to raise if a non-String type is passed?
msg328235 - (view)	Author: Ammar Askar (ammar2) *	Date: 2018-10-22 03:57
I agree that urllib.parse.quote("/", safe=['/']) should probably work. It looks like the reason it doesn't is because quote tries to normalize the safe iterable to ASCII characters only. As part of this it attempts to run `< 128` on each element of safe. As part of this it does '/' < 128 and fails.
msg328236 - (view)	Author: (william.ayd) *	Date: 2018-10-22 04:25
What if we instead just raised for anything that isn't a string or a byte? The docstring for quote suggests that it should only accept str or byte objects for safe, though it doesn't enforce that: https://github.com/python/cpython/blob/121eb1694cab14df857ba6abe9839654cada15cf/Lib/urllib/parse.py#L791 Seems like it would be easier to enforce that rather than trying to accept any arbitrary iterable.

History
Date	User	Action	Args
2022-04-11 14:59:07	admin	set	github: 79222
2018-10-22 04:25:23	william.ayd	set	messages: +
2018-10-22 03:57:54	ammar2	set	messages: +
2018-10-22 03:28:40	william.ayd	set	messages: +
2018-10-22 03:21:22	ammar2	set	messages: +
2018-10-22 03:11:36	ammar2	set	messages: +
2018-10-22 02:57:35	william.ayd	set	messages: +
2018-10-22 02:37:11	ammar2	set	status: open -> closednosy: + ammar2messages: + resolution: not a bugstage: resolved
2018-10-22 02:25:11	william.ayd	create