[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Paul Moore p.f.moore at gmail.com
Thu Apr 14 06:07:49 EDT 2016


On 14 April 2016 at 08:02, Stephen J. Turnbull <stephen at xemacs.org> wrote:

> So let me propose what I think is the elephant in the room. If you're going to have a polymorphic __fspath__, then pathlib is the example of a module that desperately needs to be polymorphic. Consider:

> A non-text Application has some bytes and passes them to pathlib.Path as manipulates them and passes the result to os.scandir as expecting a return of DirEntries of == == bytes, and == Path is TOOWTDI, no?

I'm not sure I follow this logic at all. But from my reading your argument contradicts your conclusion, so maybe I'm misunderstanding.

To me, the "obvious" conclusion is that pathlib is not appropriate in non-text applications, because its input cannot be bytes (the constructor rejects bytes). I see no reason to change that - non-text applications are inherently low level, and shouldn't expect to use high-level abstractions like pathlib.
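To make the "constructor rejects bytes" point concrete, here is a minimal sketch. It uses PurePosixPath so that nothing touches the filesystem; the path "spam/eggs.txt" is just an illustrative value.

```python
from pathlib import PurePosixPath

# A str argument is accepted as usual.
text_path = PurePosixPath("spam/eggs.txt")

# A bytes argument is refused by the constructor with a TypeError.
try:
    PurePosixPath(b"spam/eggs.txt")
except TypeError:
    bytes_rejected = True
else:
    bytes_rejected = False

print(text_path, bytes_rejected)
```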

> But under the current proposal which doesn't touch the internal mechanisms of pathlib and allows, but has no way to request, bytes returns, == str, == Path, and == str, requiring two explicit conversions that bytes-shoveling developers will tell you should be unnecessary. QED, pathlib should be polymorphic as a central part of this proposal.

Nope, QED pathlib is not a low level abstraction.

So your argument to me doesn't help much, because it's a given that pathlib is str-only. The debate is about how things like scandir (specifically DirEntry objects) and Ethan's pathlib replacement, which do allow bytes in and out, should participate in the new protocol, when they are bytes (they obviously should work just like pathlib when they are strings).

In my opinion, they shouldn't - the new protocol should be string-only (at least initially).

If I understand correctly (from a couple of brief mentions), Ethan has a string-like path object and a bytes-like path object, so he could support __fspath__ on the string-like one but not the bytes-like one. He may not like having slightly different APIs for the two types, I don't know, but it's possible. But DirEntry is polymorphic, so it will have an __fspath__ method, and needs to know what to do when it's bytes-like (I guess with a bit of getattr hacking DirEntry could expose an __fspath__ method only if it's string-like, but that seems like a pretty gross hack).
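For reference, the polymorphism of DirEntry can be seen directly: os.scandir returns entries whose name and path have the same type as the argument it was given. A small sketch, using a throwaway temporary directory and an illustrative file name "example.txt":

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # str in, str out.
    with os.scandir(tmp) as it:
        str_entries = list(it)

    # bytes in, bytes out.
    with os.scandir(os.fsencode(tmp)) as it:
        bytes_entries = list(it)

str_names = [entry.name for entry in str_entries]
bytes_names = [entry.name for entry in bytes_entries]
print(str_names, bytes_names)
```

So the same class has to serve both cases, which is why the question of what its __fspath__ should do for bytes comes up at all.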

So:

  1. pathlib remains string-like and is the canonical example of __fspath__, returning strings only
  2. DirEntry is the only other example of the protocol in the stdlib, but is polymorphic
  3. I'm not aware of any 3rd party library that has polymorphic classes (Ethan can correct me if I'm wrong here)

So the only purpose I know of for discussing __fspath__ returning bytes is for scandir, and hypothetical polymorphic 3rd party path abstractions (and possibly Ethan's preference to have a common API for his 2 classes).

I propose we should have a string-only __fspath__ protocol in 3.6. Bytes-format DirEntry objects can raise an error in __fspath__. If it becomes obvious with usage that we need bytes support in __fspath__ we can add it (compatibly - string-only code wouldn't need to change) in 3.7. That seems far better to me than trying to design bytes support without actual use cases.
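A rough sketch of what the string-only variant could look like. Everything here is hypothetical and for illustration only: fspath_str_only is not the real os.fspath, and PolymorphicEntry is a made-up stand-in for a DirEntry-like class.

```python
def fspath_str_only(obj):
    """Hypothetical string-only helper: accept str, or an object whose
    __fspath__ returns str; anything else is a TypeError."""
    if isinstance(obj, str):
        return obj
    try:
        # Look the method up on the type, as is usual for special methods.
        method = type(obj).__fspath__
    except AttributeError:
        raise TypeError(
            "expected str or an object with __fspath__, not "
            + type(obj).__name__
        ) from None
    result = method(obj)
    if not isinstance(result, str):
        raise TypeError("__fspath__ must return str, not " + type(result).__name__)
    return result


class PolymorphicEntry:
    """Hypothetical stand-in for a DirEntry-like object that can hold
    either a str or a bytes path."""

    def __init__(self, path):
        self.path = path

    def __fspath__(self):
        # Under the string-only proposal, the bytes flavour raises an error.
        if isinstance(self.path, bytes):
            raise TypeError("bytes-based entry does not support __fspath__")
        return self.path


print(fspath_str_only(PolymorphicEntry("spam/eggs.txt")))
```

Code written against this would only ever see str, so allowing bytes returns later would not break it.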

Paul


