[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for fspath and os.fspath() (original) (raw)

Ethan Furman ethan at stoneleaf.us
Sat Apr 9 12:41:01 EDT 2016


On 04/09/2016 12:48 AM, Nick Coghlan wrote:

Considering the helper function usage, here's some examples in combination with os.fsencode and os.fsdecode:

Status quo for binary/text path conversions

text_path = os.fsdecode(bytes_path) bytes_path = os.fsencode(text_path)

Getting a text path from an arbitrary object

text_path = os.fspath(obj) # This doesn't scream "returns text!" text_path = os.fspathname(obj) # This does

Getting a binary path from an arbitrary object

bytes_path = os.fsencode(os.fspath(obj)) bytes_path = os.fsencode(os.fspathname(obj))

I'm starting to think the semantic nudge from the "name" suffix when reading the code is worth the extra four characters when writing it (keeping in mind that the whole point of this exercise is that most folks won't be writing explicit conversions - the stdlib will handle it on their behalf).

I also think the more explicit name helps answer some of the type signature questions that have arisen:

  1. Does os.fspathname return rich Path objects? No, it returns names as str objects
  2. Will file descriptors pass through os.fspathname? No, as they're not names, they're numeric descriptors.
  3. Will bytes-like objects pass through os.fspathname? No, as they're not names, they're encodings of names

This worries me.

I know the primary purpose of this change is to enable pathlib and os and the rest of the stdlib to work together, but consider . . .

If adding a new attribute/method was as far as we went, new code (stdlib or otherwise) would look like:

if isinstance(a_path_thingy, bytes): # because os can accept bytes pass elif isinstance(a_path_thingy, str): # but it's usually text pass elif hasattr(a_path_thingy, 'fspath'): a_path_thingy = a_path_thingy.fspath() else: raise TypeError('not a valid path')

do something with the path

If we add os.fspath(), but don't allow bytes to be returned from it, our above example looks more like:

if isinstance(a_path_thingy, bytes): # because os can accept bytes pass else: a_path_thingy = os.fspath(a_path_thingy)

do something with the path

Yes, it's better -- but it still requires a pre-check before calling os.fspath().

It is my contention that this is better:

a_path_thingy = os.fspath(a_path_thingy)

This raises two issues:

  1. Part of the stdlib is the new scandir module, which can work with, and return, both bytes and text -- if fspath can only hold text, DirEntry will not get the fspath method added, and the pre-check, boiler-plate code will flourish;

  2. pathlib.Path accepts bytes -- so what happens when a byte-derived Path is passed to os.fspath()? Is a TypeError raised? Do we guess and auto-convert with fsdecode()?

I think the best answer is to

-- Ethan



More information about the Python-Dev mailing list