[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Nick Coghlan ncoghlan at gmail.com
Wed Apr 13 22:49:09 EDT 2016


On 14 April 2016 at 07:37, Victor Stinner <victor.stinner at gmail.com> wrote:

On Wednesday, 13 April 2016, Brett Cannon <brett at python.org> wrote:

All of this is demonstrated in https://gist.github.com/brettcannon/b3719f54715787d54a206bc011869aa1 by the various possibilities. In the end it's not a corner case because the definition of __fspath__ will be such that there's no ambiguity in what os.fspath() will accept and what __fspath__ can return and the code will be written to conform to what the PEP dictates (IOW I'm aware that this needs to be considered in the implementation :).

I'm not a big fan of a flag parameter to change the return type of a function. Usually, two functions are preferred. In the os module we have getcwd/getcwdb for example. I don't know if it's a good example.

It is, as one of the benefits of the "two separate functions" model is to improve type inference during static analysis - you don't necessarily know the values of parameters at analysis time, but you do know which function is being called.
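To make the static analysis point concrete, here is a small sketch. The flag-based helper `getcwd_flag` and the `WANT_BYTES` environment variable are hypothetical and exist only for contrast; os.getcwd() and os.getcwdb() are the real functions mentioned above.

```python
import os
from typing import Union


def getcwd_flag(as_bytes: bool = False) -> Union[str, bytes]:
    """Hypothetical flag-based variant: the result type depends on a runtime value."""
    return os.getcwdb() if as_bytes else os.getcwd()


# Two functions: the result type follows from which function is called.
text_cwd: str = os.getcwd()
bytes_cwd: bytes = os.getcwdb()

# One function plus a flag: the flag's value is only known at run time,
# so the declared result type has to stay "str or bytes".
want_bytes = bool(os.environ.get("WANT_BYTES"))  # hypothetical setting
either_cwd = getcwd_flag(want_bytes)
```

With the flag-based variant, every caller has to narrow the result with an isinstance() check before using str-only or bytes-only operations.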

Do you know other examples of Python functions taking a (flag) parameter to change the result type?

subprocess.Popen has a couple of flags that can do that (more precisely, they change the return type of some methods on the resulting object), but that's not an especially pretty API in general. String based type variations are more common (e.g. file mode flags, using the codecs module registry), but they're still used only sparingly (since they make the code harder to reason about for both humans and static analysers).
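A short illustration of both kinds of variation, using universal_newlines as the Popen flag and the mode string of open(). The child command is just a convenient stand-in chosen for this example.

```python
import os
import subprocess
import sys

# A trivial child process that prints one line.
cmd = [sys.executable, "-c", "print('hi')"]

# Default: communicate() returns bytes for captured pipes.
raw_out, _ = subprocess.Popen(cmd, stdout=subprocess.PIPE).communicate()

# universal_newlines=True: the same method returns str instead.
text_out, _ = subprocess.Popen(
    cmd, stdout=subprocess.PIPE, universal_newlines=True
).communicate()

# String based variation: the mode string decides what read() returns.
with open(os.devnull, "r") as f:
    text_data = f.read()   # str
with open(os.devnull, "rb") as f:
    bytes_data = f.read()  # bytes
```

In both cases the type of the result cannot be determined from the function name alone; the reader has to track the flag or mode value back to where it was set.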

In terms of types for filesystem path APIs:

  1. I assume we'll want a fast path for bytes & str to avoid performance regressions (especially in os.path, where we may be doing pure data manipulation without any IO operations)
  2. I favour defining __fspath__ and os.fspath() in terms of what the os and os.path modules need to handle both DirEntry and pathlib (which I currently expect to be str-or-bytes)
  3. For the benefit of higher level cross-platform code like pathlib, it likely makes sense to also have a str-only API that throws an exception rather than returning bytes

However, I also suggest deferring a decision on 3 until 2 has been definitively answered by way of implementing the changes. If I'm right about 2, then the API could be something like:
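A minimal sketch of one such shape, covering points 1 to 3 above. The names `fspath_sketch`, `fspath_str` and `DemoPath` are hypothetical and are not an agreed API, and the sketch assumes the protocol method is spelled `__fspath__`.

```python
def fspath_sketch(path):
    """Return str or bytes for a str, bytes, or rich path object (point 2)."""
    # Fast path (point 1): plain str and bytes are returned unchanged.
    if isinstance(path, (str, bytes)):
        return path
    # Look the method up on the type, as is done for other special methods.
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError(
            "expected str, bytes or a path object, not "
            + type(path).__name__
        ) from None
    result = method(path)
    if not isinstance(result, (str, bytes)):
        raise TypeError("__fspath__ must return str or bytes")
    return result


def fspath_str(path):
    """Str-only variant (point 3): raise instead of returning bytes."""
    result = fspath_sketch(path)
    if isinstance(result, bytes):
        raise TypeError("bytes paths are not accepted here")
    return result


class DemoPath:
    """Hypothetical rich path object used only for this example."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw
```

Low-level code in os and os.path would call the first function, while cross-platform code that wants to reject bytes outright would call the second.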

It's also worth noting that os.fsencode and os.fsdecode are already idempotent - their current signatures are "str-or-bytes -> bytes" and "str-or-bytes -> str". With a str-or-bytes return type on os.fspath, adapting them to handle rich path objects should just be a matter of adding an os.fspath call as the first step.
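As a sketch of that last step, the wrappers below convert a rich path object first and then hand the result to the existing functions. The names `_fspath_sketch`, `fsencode_rich`, `fsdecode_rich` and `DemoPath` are hypothetical, and `_fspath_sketch` is only a stand-in for a str-or-bytes os.fspath.

```python
import os


def _fspath_sketch(path):
    """Stand-in for a str-or-bytes os.fspath (hypothetical)."""
    if isinstance(path, (str, bytes)):
        return path
    return type(path).__fspath__(path)


def fsencode_rich(path):
    """str, bytes or rich path object -> bytes."""
    return os.fsencode(_fspath_sketch(path))


def fsdecode_rich(path):
    """str, bytes or rich path object -> str."""
    return os.fsdecode(_fspath_sketch(path))


class DemoPath:
    """Hypothetical rich path object used only for this example."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


encoded = fsencode_rich(DemoPath("dir/file.txt"))
decoded = fsdecode_rich(DemoPath(b"dir/file.txt"))
```

Because the wrappers still accept plain str and bytes, feeding their own output back in gives the same result, so the idempotence of the existing functions is preserved.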

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


