[Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

Koos Zevenhoven k7hoven at gmail.com
Wed Apr 20 04:20:10 EDT 2016


On Wed, Apr 20, 2016 at 6:11 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

Koos Zevenhoven writes:
> On Tue, Apr 19, 2016 at 2:55 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> >
> > AFAICS bytes return from __fspath__ is just YAGNI. Show me something
> > that actually wants it.
>
> It might be,

May I take that as meaning you just jumped to the conclusion that extending polymorphism is useful on no actual evidence of usefulness?

No you may not! YAGNI almost never means "you are never going to need it". And if you implement a feature, better implement it well. If a variation of the feature is rarely used, that is perfectly fine. I think leaving bytes out would complicate things. If os.fspath does its job well, everyone should be happy.

I kept bringing up bytes paths, because that is already a feature in Python 3. Then (already some time ago in these discussions) I briefly visited the thought of 'can we deprecate bytes paths', and it then quickly became clear to me that is not going to happen any time soon.

In other words: as long as bytes paths are supported, they should be supported consistently. I don't want DirEntry to behave differently when the underlying type is bytes, which is one of the things I've been talking about all along. That would just be broken. And as you also understand, one point is to allow passing DirEntry to open, or to any of the os.path functions.
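A minimal sketch of the consistency I mean, assuming an os.fspath() and an open() that accept DirEntry objects (which is how things ended up from Python 3.6 on). The helper name read_all is made up for this illustration:

```python
import os
import tempfile


def read_all(directory):
    """Read every regular file in `directory` (hypothetical helper).

    `directory` may be a str or a bytes path; the code is the same for both.
    """
    contents = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                # os.fspath() returns str for a str scan, bytes for a bytes scan.
                name = os.fspath(entry)
                # The DirEntry itself is passed straight to open().
                with open(entry, "rb") as f:
                    contents[name] = f.read()
    return contents


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "wb") as f:
        f.write(b"hello")
    str_result = read_all(tmp)                 # str flavor
    bytes_result = read_all(os.fsencode(tmp))  # bytes flavor
```

Neither flavor needs a special case, and no str/bytes conversion happens inside the helper.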

And some more: I don't want open(direntry_obj) to ever raise because it is the bytes flavor of DirEntry, because, when they are created, DirEntry objects always point to existing objects on the file system. I also don't want implicit conversions between str and bytes paths, because there are cases where they will produce strange results and exceptions. [Yes, way back in the p-string thread, I did first suggest a similar thing that implied implicit conversion, but I soon abandoned that part.]
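For reference, this is roughly how the existing os.path functions already treat mixed types: they refuse to guess rather than convert implicitly. A small sketch (the directory and file names are placeholders):

```python
import os

# Mixing str and bytes components is rejected instead of being converted.
try:
    os.path.join("some_dir", b"some_file")
    mixing_rejected = False
except TypeError:
    mixing_rejected = True

# A conversion has to be spelled out by the caller, e.g. with os.fsdecode().
explicit = os.path.join("some_dir", os.fsdecode(b"some_file"))
```

The explicit call makes it visible where an encoding decision is being taken.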

Not that I will ever use these features myself; I just want this to be done right.

> but as long as bytes paths are supported polymorphically all over the > stdlib, we won't get rid of supporting bytes paths. So are you > proposing to deprecate bytes paths?

You claim "almost always want str", Ethan claims "bias against bytes." Sorry, guys, you can't have it both ways. Either bytes paths are discouraged (not "deprecated", not yet), or they aren't. I say, let's not encourage them.

It's all essentially the same thing:

"almost always want str": Yes, I still claim this. This is the reason for str (and rejecting bytes) being the default for third-party code. If we wanted to, we could even leave bytes support out of the documentation, so no one will know about it unless they already deal with bytes paths. However, I don't think we should do that; we should just strongly discourage using the bytes version unless there is a reason to use it and you know what you are doing.

"bias against bytes": I agree with this too. This is in line with making str (and rejecting bytes) the default for third-party code.

"let's not encourage them": And I even agree with this, as you may have noticed.
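To make "str (and rejecting bytes) as the default for third-party code" concrete, here is one possible shape for it. This is only a sketch: fspath_str is a hypothetical name, not a proposed API, and it assumes an os.fspath() that passes both str and bytes through:

```python
import os
from pathlib import PurePosixPath


def fspath_str(path):
    """Hypothetical helper: return a str path, rejecting bytes outright."""
    result = os.fspath(path)  # accepts str, bytes, or a path object
    if isinstance(result, bytes):
        # No implicit decoding: the caller must opt in to bytes explicitly.
        raise TypeError("expected a str path, got bytes")
    return result


from_object = fspath_str(PurePosixPath("a/b"))  # "a/b"
from_string = fspath_str("a/b")                 # "a/b"

try:
    fspath_str(b"a/b")
    bytes_rejected = False
except TypeError:
    bytes_rejected = True
```

Code that really needs bytes can call os.fspath() directly; everyone else gets str or an early, clear error.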

I just don't believe in deliberately making implementations awkward for the bytes-based paths. Bytes paths already exist, not because of Python 2 (as you know), but because not all operating systems guarantee that paths make sense in any encoding, and people may need to work at that level.
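As an illustration of a name that does not "make sense" in the expected encoding, assume a file name stored as Latin-1 bytes on a system where everything else is UTF-8 (the name itself is made up):

```python
raw_name = b"caf\xe9.txt"  # Latin-1 bytes; not valid UTF-8

# A strict decode fails, so the name has no proper str form in UTF-8.
try:
    raw_name.decode("utf-8")
    strictly_decodable = True
except UnicodeDecodeError:
    strictly_decodable = False

# The surrogateescape error handler can smuggle the byte through a str and
# back again, but the intermediate str contains a lone surrogate.
as_text = raw_name.decode("utf-8", "surrogateescape")
round_tripped = as_text.encode("utf-8", "surrogateescape")
```

The round trip is lossless, but the intermediate str cannot be encoded strictly, which is one reason some people prefer to stay with bytes at that level.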

There is no need to make working with bytes-based paths awkward, and we can support them with little additional work compared to supporting str-based rich path objects. The additional work is mostly this discussion.

Ie, keep the status quo for bytes, and make things better for the preferred str. Yes, that means discouraging bytes relative to str in this context. That's a Python 3 principle, one strong enough to justify the huge compatibility break involved in making str be Unicode. That compatibility break has been extremely successful in my personal experience as a sometime Python teacher and Mailman developer, though the Mercurial developers have a different POV.

Yes. Luckily, people are already using str-based paths. We don't need any more discrete transitions. If Linux starts to enforce an encoding, as Guido and Random832 may be suggesting on python-ideas, these already obscure bytes paths will slowly fade away.

-Koos


