[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()
Ethan Furman ethan at stoneleaf.us
Sat Apr 9 12:41:01 EDT 2016
On 04/09/2016 12:48 AM, Nick Coghlan wrote:
> Considering the helper function usage, here's some examples in combination with os.fsencode and os.fsdecode:
>
>     # Status quo for binary/text path conversions
>     text_path = os.fsdecode(bytes_path)
>     bytes_path = os.fsencode(text_path)
>
>     # Getting a text path from an arbitrary object
>     text_path = os.fspath(obj)  # This doesn't scream "returns text!"
>     text_path = os.fspathname(obj)  # This does
>
>     # Getting a binary path from an arbitrary object
>     bytes_path = os.fsencode(os.fspath(obj))
>     bytes_path = os.fsencode(os.fspathname(obj))
>
> I'm starting to think the semantic nudge from the "name" suffix when reading the code is worth the extra four characters when writing it (keeping in mind that the whole point of this exercise is that most folks won't be writing explicit conversions - the stdlib will handle it on their behalf).
> I also think the more explicit name helps answer some of the type signature questions that have arisen:
> - Does os.fspathname return rich Path objects? No, it returns names as str objects.
> - Will file descriptors pass through os.fspathname? No, as they're not names, they're numeric descriptors.
> - Will bytes-like objects pass through os.fspathname? No, as they're not names, they're encodings of names.
This worries me.
I know the primary purpose of this change is to enable pathlib and os and the rest of the stdlib to work together, but consider . . .
If adding a new attribute/method was as far as we went, new code (stdlib or otherwise) would look like:
    if isinstance(a_path_thingy, bytes):
        # because os can accept bytes
        pass
    elif isinstance(a_path_thingy, str):
        # but it's usually text
        pass
    elif hasattr(a_path_thingy, '__fspath__'):
        a_path_thingy = a_path_thingy.__fspath__()
    else:
        raise TypeError('not a valid path')

    # do something with the path
If we add os.fspath(), but don't allow bytes to be returned from it, our above example looks more like:
    if isinstance(a_path_thingy, bytes):
        # because os can accept bytes
        pass
    else:
        a_path_thingy = os.fspath(a_path_thingy)

    # do something with the path
Yes, it's better -- but it still requires a pre-check before calling os.fspath().
It is my contention that this is better:
    a_path_thingy = os.fspath(a_path_thingy)
This raises two issues:
1) Part of the stdlib is the new os.scandir() function, which can work with, and return, both bytes and text -- if __fspath__ can only hold text, DirEntry will not get the __fspath__ method added, and the pre-check boilerplate code will flourish;
2) pathlib.Path accepts bytes -- so what happens when a byte-derived Path is passed to os.fspath()? Is a TypeError raised? Do we guess and auto-convert with fsdecode()?
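To make the first issue concrete, here is a small sketch showing that os.scandir() hands back the same kind of path it was given. The helper name scandir_path_types and the file name example.txt are made up for illustration; only os.scandir(), os.fsencode() and DirEntry.path are real.

```python
import os
import tempfile


def scandir_path_types(directory):
    """Return the set of types seen in DirEntry.path for one directory."""
    with os.scandir(directory) as entries:
        return {type(entry.path) for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    # 'example.txt' is a made-up name, only there so the directory is not empty
    open(os.path.join(tmp, 'example.txt'), 'w').close()

    text_types = scandir_path_types(tmp)                # text in, text out
    bytes_types = scandir_path_types(os.fsencode(tmp))  # bytes in, bytes out
```

So any protocol that only allows text would leave the bytes-flavoured DirEntry objects without a sensible answer.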
I think the best answer is to
- let __fspath__ hold bytes as well as text
- let fspath() return bytes as well as text
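A minimal sketch of what those two points could look like, assuming __fspath__ ends up being a method rather than a plain attribute. The fspath function and the ExamplePath class below are illustrative stand-ins, not an actual or proposed implementation.

```python
def fspath(path):
    """Hypothetical helper: pass str/bytes through, otherwise ask __fspath__."""
    if isinstance(path, (str, bytes)):
        return path
    try:
        # look the method up on the type, as is usual for special methods
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError('not a valid path: ' + type(path).__name__) from None
    result = method(path)
    if not isinstance(result, (str, bytes)):
        raise TypeError('__fspath__ must return str or bytes, not '
                        + type(result).__name__)
    return result


class ExamplePath:
    """Made-up path-like object that may hold either text or bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


text_result = fspath(ExamplePath('spam.txt'))    # str stays str
bytes_result = fspath(ExamplePath(b'spam.txt'))  # bytes stays bytes
```

With something along these lines the caller never needs the isinstance pre-check; whatever comes back is already acceptable to os.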
--
Ethan